Add getData function for DMatrix #9379
base: master
b170d57 to caa4d9f
16e5ee7 to dcb94ac
cc @wbo4958.
I will check it today. Sorry for the delay.
for (int row = 0; row < rowNum; row++) {
  for(int col = 0; col < colNum; col++){
    denseMatrix.set(row, col, 0.0f);
Is it possible to set the default to 0.0f when creating the BigDenseMatrix?
Sorry for the late response. BigDenseMatrix does not provide such a function.
Can we just return the CSR instead? Converting a CSR matrix to a dense one can be extremely memory hungry when the input is sparse. Some datasets span a few thousand features with only a small portion of valid values due to encoding.
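To put rough numbers on the memory concern, here is a minimal back-of-the-envelope sketch. The class name, the dataset shape, and the 1% density are illustrative assumptions, not figures from this PR; the CSR layout assumed is 8-byte row offsets plus a 4-byte column index and a 4-byte value per stored entry.

```java
public class CsrVsDenseFootprint {
    /** Bytes for a dense float matrix: one 4-byte float per cell. */
    public static long denseBytes(long rows, long cols) {
        return rows * cols * Float.BYTES;
    }

    /**
     * Bytes for a CSR matrix under the assumed layout:
     * (rows + 1) 8-byte row offsets, plus a 4-byte column index
     * and a 4-byte value for every stored (non-missing) entry.
     */
    public static long csrBytes(long rows, long nonMissing) {
        return (rows + 1) * Long.BYTES + nonMissing * (Integer.BYTES + Float.BYTES);
    }

    public static void main(String[] args) {
        // Hypothetical dataset: 1M rows, 5k features, 50 present values per row.
        long rows = 1_000_000L;
        long cols = 5_000L;
        long nonMissing = rows * 50L;

        System.out.println("dense bytes: " + denseBytes(rows, cols));     // 20000000000
        System.out.println("csr bytes:   " + csrBytes(rows, nonMissing)); // 408000008
    }
}
```

Under these assumptions the dense copy is roughly 50 times larger than the CSR arrays, which is the gap the comment above is pointing at.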
int[] featureIndex = new int[nonMissingNum];
float[] featureValue = new float[nonMissingNum];

XGBoostJNI.checkCall(XGBoostJNI.XGDMatrixGetDataAsCSR(handle, "{}",
@trivialfis, is there an API to get the dense matrix directly instead of converting the CSR data to a dense matrix?
After checking the source code, I didn't find any such function, but @trivialfis can confirm.
No, DMatrix itself is a CSR matrix.
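Since the data comes back as CSR arrays, a caller who does want a dense view has to expand it themselves. The sketch below shows that expansion on plain Java arrays. It is not the PR's code: the class and method names are hypothetical, it uses float[][] rather than BigDenseMatrix, and the array names only mirror the fragment above. The fill value for missing cells is left to the caller.

```java
import java.util.Arrays;

public class CsrToDense {
    /**
     * Expands CSR arrays into a dense row-major matrix.
     * rowOffset has rowNum + 1 entries; the entries of row r occupy
     * positions rowOffset[r] (inclusive) to rowOffset[r + 1] (exclusive)
     * of featureIndex and featureValue.
     */
    public static float[][] toDense(long[] rowOffset, int[] featureIndex,
                                    float[] featureValue, int colNum, float fill) {
        int rowNum = rowOffset.length - 1;
        float[][] dense = new float[rowNum][colNum];
        for (int row = 0; row < rowNum; row++) {
            // Cells with no stored entry keep the caller-chosen fill value.
            Arrays.fill(dense[row], fill);
            for (long i = rowOffset[row]; i < rowOffset[row + 1]; i++) {
                dense[row][featureIndex[(int) i]] = featureValue[(int) i];
            }
        }
        return dense;
    }

    public static void main(String[] args) {
        // 2 x 3 example: row 0 stores columns 0 and 2, row 1 stores column 1.
        long[] rowOffset = {0L, 2L, 3L};
        int[] featureIndex = {0, 2, 1};
        float[] featureValue = {1.0f, 2.0f, 3.0f};

        float[][] dense = toDense(rowOffset, featureIndex, featureValue, 3, 0.0f);
        System.out.println(Arrays.deepToString(dense)); // [[1.0, 0.0, 2.0], [0.0, 3.0, 0.0]]
    }
}
```

Passing Float.NaN as the fill would keep missing cells distinguishable from stored zeros, at the cost of the caller having to handle NaN.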
BigDenseMatrix denseMatrix = new BigDenseMatrix(rowNum, colNum);

for (int row = 0; row < rowNum; row++) {
  for(int col = 0; col < colNum; col++){
for(int col = 0; col < colNum; col++) {
Some minor comments. Overall, LGTM.
dcb94ac to 3ebf9f6
Other than converting to a dense matrix, the PR looks good to me.
The Python API has a get_data function; add a similar API for the jvm-package too.