Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ORC-1658: [C++] Unify identifiers naming style. #1855

Closed
wants to merge 2 commits into from

Conversation

ffacs
Copy link
Contributor

@ffacs ffacs commented Mar 18, 2024

What changes were proposed in this pull request?

  1. Add run_clang_tidy.py which comes from LLVM and clang-tidy configuration file.
  2. Use clang-tidy to format the C++ code.

Why are the changes needed?

To close #1845

How was this patch tested?

UT passed

Was this patch authored or co-authored using generative AI tooling?

NO

1. cd builf && python3 ../run_clang_tidy.py -p . -fix
2. git ls-files --exclude-standard | grep ".*\(cc\|hh\)$" | xargs clang-format -i -style=file --dump-config
3. add lisence to run_clang_tidy.py and clang-tidy config file
@ffacs
Copy link
Contributor Author

ffacs commented Mar 20, 2024

This patch I`ve done the following things:

  1. run clang-tidy cd builf && python3 ../run_clang_tidy.py -p . -fix
  2. format codes
git ls-files --exclude-standard | grep ".*\(cc\|hh\)$" | xargs clang-format -i -style=file --dump-config 
  1. add licence to run_clang_tidy.py and clang-tidy config file

Currently the patch can be built on my macOS(arm Apple clang version 15.0.0) and ubuntu(x64 clang-18) and passes UTs with mkdir build -p && cd build && cmake .. -DBUILD_JAVA=OFF -DCMAKE_BUILD_TYPE=DEBUG -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DCMAKE_EXPORT_COMPILE_COMMANDS=1 && make -j20 && make test .

Copy link
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work! I've left some comments. Most are about the requests to remove m prefix. This can be a separate patch.

int32_t mScale; // scale of decimal type
bool mIsNull; // whether this literal is null
size_t mHashCode; // precomputed hash code for the literal
LiteralVal mValue_; // data value for this literal if not null
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to remove the m prefix?

@@ -90,7 +90,7 @@ namespace orc {
bool operator==(const BitSet& other) const;

private:
std::vector<uint64_t> mData;
std::vector<uint64_t> mData_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, would be good to remove m prefix here and below.

std::vector<TreeNode> mChildren;
size_t mLeaf;
TruthValue mConstant;
Operator mOperator_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

uint64_t mColumnId;
std::vector<Literal> mLiterals;
size_t mHashCode;
Operator mOperator_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

const SchemaEvolution* mSchemaEvolution;
uint64_t mRowIndexStride;
WriterVersion mWriterVersion;
const Type& mType_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -66,8 +66,8 @@ namespace orc {
std::string toString() const override;

private:
std::shared_ptr<ExpressionTree> mExpressionTree;
std::vector<PredicateLeaf> mLeaves;
std::shared_ptr<ExpressionTree> mExpressionTree_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

sys.exit(return_code)

if __name__ == '__main__':
main()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to run this in the CI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to run this in the CI?

Absolutely. We can leverage clang-tidy/tool/clang-tidy-diff.py

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can leverage Github Action like https://github.com/marketplace/actions/c-c-linter or similar actions from here: https://github.com/marketplace?category=&type=actions&verification=&query=sort%3Apopularity-desc+clang-tidy

I have added two actions in the past:

It seems that we can use a single linter action to run both clang-tidy and clang-format.

explicit NotImplementedYet(const std::string& what_arg);
explicit NotImplementedYet(const char* what_arg);
explicit NotImplementedYet(const std::string& whatArg);
explicit NotImplementedYet(const char* whatArg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we consider using snake case instead of mixed case (camel case) for variable names? Impala C++ Code follows a modified version of the Google C++ Style Guide at:
https://google.github.io/styleguide/cppguide.html#Variable_Names

I think either way is ok. Just remembered that when I first read the ORC C++ code, they look like Java code. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are lots of public/protected variables using camel case, if we change them into snake case, the library codes would be not backwards compatible. Backwards compatibility is important, I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for keeping the compatibility.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. That makes sense to me.

@wgtmac wgtmac changed the title ORC-1658: [C++] uniform identifiers naming style. ORC-1658: [C++] Unify identifiers naming style. Mar 27, 2024
@ffacs
Copy link
Contributor Author

ffacs commented Mar 28, 2024

@dongjoon-hyun Hi, do you have any suggestions about this patch?

@dongjoon-hyun
Copy link
Member

@dongjoon-hyun Hi, do you have any suggestions about this patch?

Although I've been monitoring this PR and discussion, there is no other objection for the AS-IS direction including backward compatibility consideration.

Copy link
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine to merge this PR as is. We can add follow-ups to address remaining issues as this one is already very large.

@wgtmac wgtmac closed this in 03ce308 Apr 2, 2024
@dongjoon-hyun
Copy link
Member

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++] uniform identifiers naming style.
4 participants