-
Notifications
You must be signed in to change notification settings - Fork 485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ORC-1580: Change default DataBuffer constructor to use reserve instead of resize #1739
Conversation
c++/include/orc/MemoryPool.hh
Outdated
@@ -51,6 +51,7 @@ namespace orc { | |||
|
|||
public: | |||
DataBuffer(MemoryPool& pool, uint64_t _size = 0); | |||
DataBuffer(MemoryPool& pool, uint64_t _size, bool noMemSet); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When do we use noMemSet=false
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never use. The problem is that it is not possible to provide explicit template <bool noMemSet>
to a constructor. Perhaps we can allow this branch in constructor? That makes the code better but introduce a branch that should be resolved at compile time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the new commit I tried to use an empty class tag type to differentiate the constructor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. There are some other places that we can replace resize() with reserve(), e.g. in BlockDecompressionStream::NextDecompress()
Lines 762 to 765 in ffb0c87
if (inputDataBuffer.capacity() < remainingLength) { | |
inputDataBuffer.resize(remainingLength); | |
} | |
::memcpy(inputDataBuffer.data(), inputBuffer, availableSize); |
We can further optimize them in follow-up tasks.
+1, LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@XinyuZeng , could you revise the PR title and description? For example, PR title seems to become inconsistent from the AS-IS code.
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Thank you, @XinyuZeng and all!
…d of resize ### What changes were proposed in this pull request? See #1430 - default DataBuffer constructor now uses `reserve`. I also add a `zeroOut` function to manually memset the buffer. For example, `MapColumnWrite::add` may use uninitialized `offsets` value (at `uint64_t elemOffset = static_cast<uint64_t>(offsets[0]);`). i.e., not memset the `offsets` buffer will result in segfault because the code implicitly assume the value should be 0. Therefore I still `zeroOut` the buffer for Map, List and Union, even though List and Union pass the unit tests without memset. ### Why are the changes needed? Reduce overhead in `ColumnVectorBatch` construction and compression/decompression buffer construction ### How was this patch tested? Correctness via `make test-out`. I further use `orc-scan` utility, setting `batchSize` to 1 million and then scan a file with 1 million rows and 20 columns. After this PR, the time reduces from 2.9s to 2.2s. The compression/decompression buffer overhead reported in #1430 can be verified by the reporter if necessary. ### Was this patch authored or co-authored using generative AI tooling? No Closes #1739 from XinyuZeng/gh-1430. Authored-by: Xinyu Zeng <30414224+XinyuZeng@users.noreply.github.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 790a3e5) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Merged to main/2.0 for Apache ORC 2.0.0. |
What changes were proposed in this pull request?
See #1430 - default DataBuffer constructor now uses
reserve
.I also add a
zeroOut
function to manually memset the buffer. For example,MapColumnWrite::add
may use uninitializedoffsets
value (atuint64_t elemOffset = static_cast<uint64_t>(offsets[0]);
). i.e., not memset theoffsets
buffer will result in segfault because the code implicitly assume the value should be 0. Therefore I stillzeroOut
the buffer for Map, List and Union, even though List and Union pass the unit tests without memset.Why are the changes needed?
Reduce overhead in
ColumnVectorBatch
construction and compression/decompression buffer constructionHow was this patch tested?
Correctness via
make test-out
. I further useorc-scan
utility, settingbatchSize
to 1 million and then scan a file with 1 million rows and 20 columns. After this PR, the time reduces from 2.9s to 2.2s. The compression/decompression buffer overhead reported in #1430 can be verified by the reporter if necessary.Was this patch authored or co-authored using generative AI tooling?
No