-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-39398: [C++][Parquet] DNM: benchmark for readLevels #39486
Conversation
Please not merge this patch, this is just for benchmark |
for (auto _ : state) { | ||
state.PauseTiming(); | ||
Int32Reader* reader = helper.ResetColumnReader(); | ||
[[maybe_unused]] bool v = reader->HasNext(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using hasNext to trigger initialization
const auto repetition = static_cast<Repetition::type>(state.range(0)); | ||
const auto batch_size = static_cast<int64_t>(state.range(1)); | ||
|
||
BenchmarkHelper helper(repetition, /*num_pages=*/1, /*levels_per_page=*/16 * 80000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using one page to make it simple
@@ -300,6 +300,9 @@ class TypedColumnReader : public ColumnReader { | |||
int16_t* rep_levels, int32_t* indices, | |||
int64_t* indices_read, const T** dict, | |||
int32_t* dict_len) = 0; | |||
|
|||
virtual void ReadLevels(int64_t batch_size, int16_t* def_levels, int16_t* rep_levels, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about making this private with a Test Peer that can be used in the benchmark, so we can check this PR in?
Thank you @mapleFU it would be great to check this in (without the count changes) so we have a good benchmark for ReadLevels. See comment on how to maybe do it without breaking abstractions. |
I guess looking at CI it would take a little more work than this proof of concept to check it in. |
It's just a quick poc for ReadLevels optimization... I think exporting it is so hacking, because |
@ursabot please benchmark |
Benchmark runs are scheduled for commit 0529cce. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete. |
Thanks for your patience. Conbench analyzed the 6 benchmarking runs that have been run so far on PR commit 0529cce. There were 3 benchmark results indicating a performance regression:
The full Conbench report has more details. |
Emm would regression benchmark un-related..? |
None of them are related to Parquet. |
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?