Skip to content

Commit

Permalink
[SPARK-50716][CORE][FOLLOWUP] Fix the scenario in `JavaUtils#deleteRe…
Browse files Browse the repository at this point in the history
…cursivelyUsingJavaIO` where `BasicFileAttributes` cannot be read

### What changes were proposed in this pull request?
This PR adds protection against IOException (IOE) scenarios when reading the `BasicFileAttributes` of a file in the `deleteRecursivelyUsingJavaIO` method: it catches the IOE and returns null, and silently handles the scenario where `fileAttributes` is null in the subsequent logic.

### Why are the changes needed?
When the inode itself does not exist, it is impossible to read its `BasicFileAttributes`, and an IOException (IOE) will be thrown, which caused the failure of the MacOS daily test:
- https://github.com/apache/spark/actions/runs/12622568770/job/35170636435

```
- JobArtifactSet uses resources from SparkContext *** FAILED ***
  java.nio.file.NoSuchFileException: /Users/runner/work/spark/spark/core/target/tmp/spark-6a6b2d5d-1371-4801-a6c4-59dc9d69c2f2/userFiles-e450317a-136c-49ff-8099-9e8282c766b5/testFile661537940680128228.zip
  at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
  at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
  at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
  at java.base/sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
  at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:171)
  at java.base/java.nio.file.Files.readAttributes(Files.java:1853)
  at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:130)
  at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:123)
  at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:94)
  at org.apache.spark.util.SparkFileUtils.deleteRecursively(SparkFileUtils.scala:121)
  ...
```

#49347 aimed to fix the cleanup of symbolic links by moving the operation to read `BasicFileAttributes` before `!file.exists` to add a check for broken symbolic links. However, in Spark, there is a logic that first cleans up the potentially existing destination path before overwriting it. The target path being cleaned up may itself be a non-existent inode, such as:

https://github.com/apache/spark/blob/91f3fdd25852b43095dd5273358fc394ffd11b66/core/src/main/scala/org/apache/spark/SparkContext.scala#L1879-L1888

Therefore, additional protection is needed for this scenario to maintain compatibility with the old behavior.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass Github Actions
- Pass MacOs 15 & Java 21 Github Actions: https://github.com/LuciferYang/spark/runs/35170478542
- Pass Macos PySpark Github Actions: https://github.com/LuciferYang/spark/runs/35178442154

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #49357 from LuciferYang/SPARK-50716-FOLLOWUP.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
  • Loading branch information
LuciferYang authored and dongjoon-hyun committed Jan 6, 2025
1 parent 47825e5 commit fbb4502
Showing 1 changed file with 16 additions and 4 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -126,10 +126,10 @@ public static void deleteRecursively(File file, FilenameFilter filter) throws IO
private static void deleteRecursivelyUsingJavaIO(
File file,
FilenameFilter filter) throws IOException {
BasicFileAttributes fileAttributes =
Files.readAttributes(file.toPath(), BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
// SPARK-50716: If the file does not exist and not a broken symbolic link, return directly.
if (!file.exists() && !fileAttributes.isSymbolicLink()) return;
BasicFileAttributes fileAttributes = readFileAttributes(file);
// SPARK-50716: If the file attributes are null, that is, the file attributes cannot be read,
// or if the file does not exist and is not a broken symbolic link, then return directly.
if (fileAttributes == null || (!file.exists() && !fileAttributes.isSymbolicLink())) return;
if (fileAttributes.isDirectory()) {
IOException savedIOException = null;
for (File child : listFilesSafely(file, filter)) {
Expand All @@ -156,6 +156,18 @@ private static void deleteRecursivelyUsingJavaIO(
}
}

/**
* Reads basic attributes of a given file, of return null if an I/O error occurs.
*/
private static BasicFileAttributes readFileAttributes(File file) {
try {
return Files.readAttributes(
file.toPath(), BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
} catch (IOException e) {
return null;
}
}

private static void deleteRecursivelyUsingUnixNative(File file) throws IOException {
ProcessBuilder builder = new ProcessBuilder("rm", "-rf", file.getAbsolutePath());
Process process = null;
Expand Down

0 comments on commit fbb4502

Please sign in to comment.