I would like the Profiling tool's auto-tuner to keep reducing maxPartitionBytes when a table scan stage spills heavily and has tasks failing with OOM.
This has been shown to work for at least one customer job.
My proposal is that if a stage meets all of the following conditions:
- It is a table scan stage.
- It is spilling heavily (say, above some threshold).
- It has tasks failing with com.nvidia.spark.rapids.jni.CpuSplitAndRetryOOM: CPU OutOfMemory (or any other type of OOM, such as GPU OOM or heap OOM).

Then the auto-tuner should halve maxPartitionBytes each time it runs, for as long as the conditions still hold.
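
To make the proposal concrete, here is a rough sketch of the rule in Python. It is only an illustration, not the auto-tuner's actual code: `StageSummary`, `should_halve`, `recommend_max_partition_bytes`, the 10 GiB spill threshold, and the 4 MiB lower bound are all hypothetical names and values I made up for the example. The lower bound in particular is an extra assumption so that repeated halving eventually stops.

```python
from dataclasses import dataclass
from typing import List

# Assumed values: the proposal only says "some threshold".
SPILL_THRESHOLD_BYTES = 10 * 1024**3  # treat >= 10 GiB of spill as "heavy"
MIN_PARTITION_BYTES = 4 * 1024**2     # assumed floor so halving cannot reach zero


@dataclass
class StageSummary:
    """Hypothetical per-stage summary, not the Profiling tool's real data model."""
    is_table_scan: bool
    spill_bytes: int
    oom_task_failures: int  # tasks that failed with any OOM (CPU, GPU, heap, ...)


def should_halve(stage: StageSummary) -> bool:
    """True only when all three conditions from the proposal hold for the stage."""
    return (
        stage.is_table_scan
        and stage.spill_bytes >= SPILL_THRESHOLD_BYTES
        and stage.oom_task_failures > 0
    )


def recommend_max_partition_bytes(current: int, stages: List[StageSummary]) -> int:
    """Halve the current maxPartitionBytes if any stage qualifies, else keep it."""
    if any(should_halve(s) for s in stages):
        return max(current // 2, MIN_PARTITION_BYTES)
    return current
```

Running the tuner again on the next job run with the new value as `current` gives the "keep reducing" behavior: the value is halved on each pass until the stage stops spilling and failing with OOM, or the assumed floor is reached.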