[SPARK-50716][CORE][FOLLOWUP] Fix the scenario in `JavaUtils#deleteRecursivelyUsingJavaIO` where `BasicFileAttributes` cannot be read

### What changes were proposed in this pull request?

This PR adds protection against IOException (IOE) scenarios when reading the `BasicFileAttributes` of a file in the `deleteRecursivelyUsingJavaIO` method: it catches the IOE and returns null, and silently handles the case where `fileAttributes` is null in the subsequent logic (a sketch of the guard appears at the end of this description).

### Why are the changes needed?

When the inode itself does not exist, its `BasicFileAttributes` cannot be read and an IOException (IOE) is thrown, which caused the failure of the MacOS daily test:

- https://github.com/apache/spark/actions/runs/12622568770/job/35170636435

```
- JobArtifactSet uses resources from SparkContext *** FAILED ***
  java.nio.file.NoSuchFileException: /Users/runner/work/spark/spark/core/target/tmp/spark-6a6b2d5d-1371-4801-a6c4-59dc9d69c2f2/userFiles-e450317a-136c-49ff-8099-9e8282c766b5/testFile661537940680128228.zip
  at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
  at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
  at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
  at java.base/sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
  at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:171)
  at java.base/java.nio.file.Files.readAttributes(Files.java:1853)
  at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:130)
  at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:123)
  at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:94)
  at org.apache.spark.util.SparkFileUtils.deleteRecursively(SparkFileUtils.scala:121)
  ...
```

#49347 aimed to fix the cleanup of broken symbolic links by moving the read of `BasicFileAttributes` before the `!file.exists` check. However, Spark contains logic that first cleans up a potentially existing destination path before overwriting it, and that target path may itself be a non-existent inode, for example:

https://github.com/apache/spark/blob/91f3fdd25852b43095dd5273358fc394ffd11b66/core/src/main/scala/org/apache/spark/SparkContext.scala#L1879-L1888

Therefore, additional protection is needed for this scenario to maintain compatibility with the old behavior.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- Pass GitHub Actions
- Pass MacOS 15 & Java 21 GitHub Actions: https://github.com/LuciferYang/spark/runs/35170478542
- Pass MacOS PySpark GitHub Actions: https://github.com/LuciferYang/spark/runs/35178442154

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #49357 from LuciferYang/SPARK-50716-FOLLOWUP.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
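---

For illustration, a minimal sketch of the guard described above. The class name `DeleteRecursivelySketch` and the elided recursion are placeholders for illustration only; see `JavaUtils.java` in the PR for the actual change.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.attribute.BasicFileAttributes;

final class DeleteRecursivelySketch {

  // Returns null instead of propagating the IOException thrown when the
  // inode does not exist (e.g. a destination path that was never created).
  private static BasicFileAttributes readFileAttributes(File file) {
    try {
      return Files.readAttributes(
          file.toPath(), BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
    } catch (IOException e) {
      return null;
    }
  }

  static void deleteRecursivelyUsingJavaIO(File file) throws IOException {
    BasicFileAttributes fileAttributes = readFileAttributes(file);
    // Unreadable attributes mean the inode is gone: return silently,
    // preserving the old behavior for non-existent paths. A broken
    // symbolic link reports !file.exists() but still has readable
    // attributes, so it falls through and is cleaned up.
    if (fileAttributes == null
        || (!file.exists() && !fileAttributes.isSymbolicLink())) {
      return;
    }
    // ... existing logic: recurse into directories, then delete `file` ...
  }
}
```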