forked from NVIDIA/spark-rapids
Merge branch-24.04 into main #30
Merged
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
Signed-off-by: Tim Liu <timl@nvidia.com>
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
To fix: #10256 Bump up dependency version to 24.04.0-SNAPSHOT Signed-off-by: Tim Liu <timl@nvidia.com>
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
Signed-off-by: Jason Lowe <jlowe@nvidia.com>
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
* remove leading space for json path in GetJsonObject
  Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
* Update comments
  Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
* Use JsonPathParser to normalize path
  Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
* Update compatibility doc
  Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
* clean up
  Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
* Fallback json paths containing in GetJsonObject
  Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
* cache normalizeJsonPath and prevent memory leak
  Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
* clean up
  Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
* ready to merge
  Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
* Use parser to check whether to fallback
  Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
* Add a special case
  Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

---------
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
Signed-off-by: Jim Brennan <jimb@nvidia.com>
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
* Move 351 shims into noSnapshot buildvers
  Move the 351 shims into the noSnapshot buildvers, as Spark has released 3.5.1.
  Follow-up of #10465 (comment)
  Signed-off-by: Tim Liu <timl@nvidia.com>
* 351 shim for scala 2.13
  Signed-off-by: Tim Liu <timl@nvidia.com>

---------
Signed-off-by: Tim Liu <timl@nvidia.com>
Fixes #8208.

This commit adds support for `WindowGroupLimitExec` to run on GPU. This optimization was added in Apache Spark 3.5 to reduce the number of rows that participate in shuffles, for queries that filter on the result of ranking functions. For example:

```sql
SELECT foo, bar FROM (
  SELECT foo, bar, RANK() OVER (PARTITION BY foo ORDER BY bar) AS rnk
  FROM mytable
) WHERE rnk < 10
```

Such a query requires a shuffle so that all rows in a window group are available in the same task.

In Spark 3.5, an optimization was added in [SPARK-37099](https://issues.apache.org/jira/browse/SPARK-37099) to take advantage of the `rnk < 10` predicate to reduce shuffle load. Specifically, since only 9 (i.e. 10-1) ranks participate in the window function, only that many ranks' worth of rows need to be shuffled into the task, per input batch. By pre-filtering rows that can't possibly satisfy the condition, the number of shuffled records can be reduced.

The GPU implementation (i.e. `GpuWindowGroupLimitExec`) differs slightly from the CPU implementation, because it needs to execute on the entire input column batch. As a result, `GpuWindowGroupLimitExec` runs the rank scan on each input batch, and then filters out ranks that exceed the limit specified in the predicate (`rnk < 10`). After the shuffle, the `RANK()` is calculated again by `GpuRunningWindowExec`, to produce the final result.

The current implementation addresses the `RANK()` and `DENSE_RANK()` window functions. Other ranking functions (like `ROW_NUMBER()`) can be added at a later date.

Signed-off-by: MithunR <mythrocks@gmail.com>
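The pre-filtering idea in this commit message can be illustrated with a small plain-Python sketch. It is not the plugin's code and uses no Spark or cuDF APIs; the names `rank_rows`, `group_limit`, `query_with_prefilter` and `query_without_prefilter` are hypothetical, and rows are modeled as `(foo, bar)` tuples. The sketch only aims to show why ranking each batch, dropping rows whose per-batch rank already fails the predicate, and then ranking again after the "shuffle" gives the same answer as ranking everything at once.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, int]  # hypothetical row layout: (foo, bar)


def rank_rows(rows: List[Row]) -> List[Tuple[Row, int]]:
    """Emulate RANK() OVER (PARTITION BY foo ORDER BY bar) on a list of rows."""
    groups: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    ranked: List[Tuple[Row, int]] = []
    for key in sorted(groups):
        ordered = sorted(groups[key], key=lambda r: r[1])
        rank, prev = 0, None
        for position, row in enumerate(ordered, start=1):
            if row[1] != prev:  # ties share a rank; the next rank skips ahead
                rank, prev = position, row[1]
            ranked.append((row, rank))
    return ranked


def group_limit(batch: List[Row], limit: int) -> List[Row]:
    """Per-batch pre-filter: keep rows whose rank within this batch is < limit."""
    return [row for row, rnk in rank_rows(batch) if rnk < limit]


def query_with_prefilter(batches: List[List[Row]], limit: int):
    # "Shuffle" only the rows that survive the per-batch filter.
    shuffled = [row for batch in batches for row in group_limit(batch, limit)]
    # Rank again over the shuffled rows to produce the final result.
    return [(row, rnk) for row, rnk in rank_rows(shuffled) if rnk < limit]


def query_without_prefilter(batches: List[List[Row]], limit: int):
    all_rows = [row for batch in batches for row in batch]
    return [(row, rnk) for row, rnk in rank_rows(all_rows) if rnk < limit]


batches = [
    [("a", 1), ("a", 2), ("a", 3), ("a", 4)],
    [("a", 2), ("a", 5), ("b", 7)],
]
shuffled_count = sum(len(group_limit(b, 3)) for b in batches)
total_count = sum(len(b) for b in batches)
print(shuffled_count, "of", total_count, "rows shuffled")
print(query_with_prefilter(batches, 3) == query_without_prefilter(batches, 3))
```

The pre-filter is safe because a row's rank within a single batch can never exceed its rank over the full window group, so a row that passes `rnk < limit` globally is never dropped early.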
This PR adds a new metric for the preprojection in GpuExpand. Signed-off-by: Firestarman <firestarmanllc@gmail.com>
Signed-off-by: Jason Lowe <jlowe@nvidia.com>
…ory oom work (#10519) Signed-off-by: Jim Brennan <jimb@nvidia.com>
* Update rapids jni and private dependency version to 24.02.1 (#10511)
  Signed-off-by: Tim Liu <timl@nvidia.com>

* Add missed shims for scala2.13 (#10465)
  * Add missed shims for scala2.13
    Signed-off-by: Tim Liu <timl@nvidia.com>
  * Add 351 snapshot shim for the scala2.13 version of plugin jar
    Signed-off-by: Tim Liu <timl@nvidia.com>
  * Remove 351 snapshot shim as spark 3.5.1 has been released
    Signed-off-by: Tim Liu <timl@nvidia.com>
  * Remove scala2.13 351 snapshot shim
    Signed-off-by: Tim Liu <timl@nvidia.com>
  * Remove 351 shim's json string
    Ran `mvn generate-sources -Dshimplify=true -Dshimplify.move=true -Dshimplify.remove.shim=351` to remove the 351 shim's json string, and fixed some unnecessary empty lines that were introduced.
    Signed-off-by: Tim Liu <timl@nvidia.com>
  * Update Copyright 2024
    Copyright updated automatically by the script below:
    ```
    export SPARK_RAPIDS_AUTO_COPYRIGHTER=ON
    ./scripts/auto-copyrighter.sh $(git diff --name-only origin/branch-24.04..HEAD)
    ```
    Signed-off-by: Tim Liu <timl@nvidia.com>
  * Revert "Update Copyright 2024"
    This reverts commit 8482847.
  * Revert "Remove 351 shim's json string"
    This reverts commit 78d1f00.
  * skip 351 from strict checking
  * Align scala2.13/pom.xml to scala2.12 one
    Run the script `bash build/make-scala-version-build-files.sh 2.13`
    Signed-off-by: Tim Liu <timl@nvidia.com>
  * pretend 351 is a snapshot in 24.02
    Signed-off-by: Gera Shegalov <gera@apache.org>
  * pretend 351 is a SNAPSHOT version
  * Revert change of build/shimplify.py
    Signed-off-by: Tim Liu <timl@nvidia.com>

  ---------
  Signed-off-by: Tim Liu <timl@nvidia.com>
  Signed-off-by: Gera Shegalov <gera@apache.org>
  Co-authored-by: Raza Jafri <rjafri@nvidia.com>
  Co-authored-by: Gera Shegalov <gera@apache.org>

* Update changelog for v24.02.0 release (#10525)
  Signed-off-by: Tim Liu <timl@nvidia.com>

---------
Signed-off-by: Tim Liu <timl@nvidia.com>
Signed-off-by: Gera Shegalov <gera@apache.org>
Co-authored-by: Raza Jafri <rjafri@nvidia.com>
Co-authored-by: Gera Shegalov <gera@apache.org>
Signed-off-by: Robert (Bobby) Evans <bobby@apache.org> Co-authored-by: Andy Grove <andygrove@nvidia.com>
Signed-off-by: Jason Lowe <jlowe@nvidia.com>
Signed-off-by: Jason Lowe <jlowe@nvidia.com>
Fix merge conflict from branch-24.02
Signed-off-by: Jason Lowe <jlowe@nvidia.com>
Update to latest branch-24.02 [skip ci]
Signed-off-by: Jason Lowe <jlowe@nvidia.com>
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
[auto-merge] branch-24.02 to branch-24.04 [skip ci] [bot]
…10552) Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
Signed-off-by: Tim Liu <timl@nvidia.com>
* Distinct inner join
  Signed-off-by: Jason Lowe <jlowe@nvidia.com>
* Distinct left join
  Signed-off-by: Jason Lowe <jlowe@nvidia.com>
* Update to new API
* Fix test

---------
Signed-off-by: Jason Lowe <jlowe@nvidia.com>
Signed-off-by: Partho Sarthi <psarthi@nvidia.com>
…ment (#10564)

* WIP
  Signed-off-by: Gera Shegalov <gera@apache.org>
* WIP
  Signed-off-by: Gera Shegalov <gera@apache.org>
* Enable specifying the pytest using file_or_dir args

  ```bash
  TEST_PARALLEL=0 \
  SPARK_HOME=~/dist/spark-3.1.1-bin-hadoop3.2 \
  TEST_FILE_OR_DIR=~/gits/NVIDIA/spark-rapids/integration_tests/src/main/python/arithmetic_ops_test.py::test_addition \
  ./integration_tests/run_pyspark_from_build.sh --collect-only
  <Module src/main/python/arithmetic_ops_test.py>
    <Function test_addition[Byte]>
    <Function test_addition[Short]>
    <Function test_addition[Integer]>
    <Function test_addition[Long]>
    <Function test_addition[Float]>
    <Function test_addition[Double]>
    <Function test_addition[Decimal(7,3)]>
    <Function test_addition[Decimal(12,2)]>
    <Function test_addition[Decimal(18,0)]>
    <Function test_addition[Decimal(20,2)]>
    <Function test_addition[Decimal(30,2)]>
    <Function test_addition[Decimal(36,5)]>
    <Function test_addition[Decimal(38,10)]>
    <Function test_addition[Decimal(38,0)]>
    <Function test_addition[Decimal(7,7)]>
    <Function test_addition[Decimal(7,-3)]>
    <Function test_addition[Decimal(36,-5)]>
    <Function test_addition[Decimal(38,-10)]>
  ```

  Signed-off-by: Gera Shegalov <gera@apache.org>
  Co-authored-by: Raza Jafri <rjafri@nvidia.com>
* Changing to TESTS=module::method
  Signed-off-by: Gera Shegalov <gera@apache.org>

---------
Signed-off-by: Gera Shegalov <gera@apache.org>
Co-authored-by: Raza Jafri <rjafri@nvidia.com>
Signed-off-by: Tim Liu <timl@nvidia.com>
Build PASS |
Change version to 24.04.0
Note: merge this PR with the "Create a merge commit" option.