8345431: Detect duplicate entries in jar files with jar --validate #24430
base: master
👋 Welcome back henryjen! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
Thank you for starting the work on this, Henry.
A few initial comments/improvements based on an initial pass of your PR:
- Validate that the Entry names match between the LOC and CEN including the entry order within the headers (ZipOutputStream and most tools will write the LOC/CEN headers in the same order)
- Warn of duplicate entries
- Check that any LOC entry exists in the CEN and any CEN entry exists in the LOC
- Be more specific in the warnings reported such as: Entry XXX found in the LOC but not the CEN
- main.help.opt.main.validate in jar.properties should be updated to indicate additional validation
- jar.md should also be updated for the same reason
- I would use this as an opportunity to add some comments describing what methods such as validate now do, given that the verification they perform has been expanded
It would also be good to validate that the MANIFEST returned by ZipFile and by ZipInputStream match (this could be follow-on work)
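The checks suggested above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the class `CenLocNames` and its method names are hypothetical. It relies on `ZipFile` reporting entries in central directory (CEN) order and on `ZipInputStream` walking the local file headers (LOC) sequentially.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical helper, for illustration only.
class CenLocNames {
    /** Entry names as listed in the central directory (CEN), in CEN order. */
    static List<String> cenNames(Path jar) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zf = new ZipFile(jar.toFile())) {
            zf.stream().forEach(e -> names.add(e.getName()));
        }
        return names;
    }

    /** Entry names as found by reading local file headers (LOC) in file order. */
    static List<String> locNames(Path jar) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(Files.newInputStream(jar))) {
            ZipEntry e;
            while ((e = zis.getNextEntry()) != null) {
                names.add(e.getName());
            }
        }
        return names;
    }

    /** Names that occur more than once in the given list, in first-repeat order. */
    static Set<String> duplicates(List<String> names) {
        Set<String> seen = new HashSet<>();
        Set<String> dups = new LinkedHashSet<>();
        for (String n : names) {
            if (!seen.add(n)) {
                dups.add(n);
            }
        }
        return dups;
    }
}
```

Running `duplicates` separately on the CEN list and the LOC list would let a warning say which structure contains the repeated name.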
@@ -62,20 +62,55 @@ final class Validator {
    private Set<String> concealedPkgs = Collections.emptySet();
    private ModuleDescriptor md;
    private String mdName;
    private final ZipInputStream zis;
    private final Set<String> entryNames = new HashSet<>();
Please rename this to represent the CEN entries.
        return new Validator(main, zf, zis).validate();
    }

    private void checkDuplicates(ZipEntry e) {
Please add a general comment describing the purpose of this method, as it is only used when traversing the ZipFile and walking the CEN.
        }
    }

    private void checkZipInputStream() {
Please add a comment on the purpose of the method
        try {
            ZipEntry e;
            while ((e = zis.getNextEntry()) != null) {
                var entryName = e.getName();
Please rename this to locEntryName.
                }
                if (!entryNames.contains(entryName)) {
                    missingEntryNames.add(entryName);
                }
It looks like you are checking whether each LOC entry is contained in the CEN, but I don't see a check that each CEN entry is contained in the LOC.
Another facet of validation is to compare the ordering of entries between the LOC and the CEN.
Is the ordering required by ZIP or Jar format? We can certainly do that if that's under spec and not an implementation detail.
Since we check that entry names are unique and that the entry counts match, and every LOC entry must be in the CEN, it follows that every CEN entry is also in the LOC.
But if we would like to be specific about the inconsistency, then we will have to do a little more work.
Yes, in a perfect world there will be a 1-to-1 match, but either way we should sanity-check it in case something went wrong.
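To make the "little more work" concrete, here is a minimal sketch of checking both directions and reporting each mismatch specifically. The class `CenLocDiff` and the message wording are assumptions for illustration (the wording follows the "Entry XXX found in the LOC but not the CEN" suggestion from the review), not what the PR implements.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, for illustration only.
class CenLocDiff {
    /**
     * Returns one warning per name that appears in only one of the two
     * structures. Names are sorted so the output is deterministic.
     */
    static List<String> describeMismatches(Collection<String> cen,
                                           Collection<String> loc) {
        Set<String> onlyInCen = new TreeSet<>(cen);
        onlyInCen.removeAll(loc);
        Set<String> onlyInLoc = new TreeSet<>(loc);
        onlyInLoc.removeAll(cen);

        List<String> warnings = new ArrayList<>();
        for (String n : onlyInCen) {
            warnings.add("Entry " + n + " found in the CEN but not the LOC");
        }
        for (String n : onlyInLoc) {
            warnings.add("Entry " + n + " found in the LOC but not the CEN");
        }
        return warnings;
    }
}
```

When names are unique and the counts match, an empty "LOC but not CEN" result does imply an empty "CEN but not LOC" result; computing both anyway costs little and keeps the report specific if one of those assumptions fails.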
> Is the ordering required by ZIP or Jar format? We can certainly do that if that's under spec and not an implementation detail.
The Zip Spec states the following:
4.3.2 Each file placed into a ZIP file MUST be preceded by a "local
file header" record for that file. Each "local file header" MUST be
accompanied by a corresponding "central directory header" record within
the central directory section of the ZIP file.
That being said, I am not aware of any implementation where the order differs, given that the LOC headers have to be written before the CEN and the End of CEN record.
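If an ordering check is wanted, one simple form is to find the first position at which the two name sequences disagree. The sketch below assumes the CEN and LOC names have already been collected into lists in their respective orders; `EntryOrder` and `firstOrderMismatch` are hypothetical names.

```java
import java.util.List;

// Hypothetical helper, for illustration only.
class EntryOrder {
    /**
     * Returns the index of the first position where the CEN and LOC name
     * sequences differ, or -1 if they are identical. If one list is a strict
     * prefix of the other, the length of the shorter list is returned.
     */
    static int firstOrderMismatch(List<String> cen, List<String> loc) {
        int common = Math.min(cen.size(), loc.size());
        for (int i = 0; i < common; i++) {
            if (!cen.get(i).equals(loc.get(i))) {
                return i;
            }
        }
        return cen.size() == loc.size() ? -1 : common;
    }
}
```

Since the quoted spec text requires matching headers but does not mention order, a result other than -1 would probably be better reported as a warning than as an error.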
@@ -143,6 +143,10 @@ warn.validator.concealed.public.class=\
        Warning: entry {0} is a public class\n\
        in a concealed package, placing this jar on the class path will result\n\
        in incompatible public interfaces
warn.validator.duplicate.entry=\
        Warning: More than one copy of {0} is detected
How do we know if the duplicate entry is in the CEN or LOC?
I can add a more specific message if that's preferred. I am not expecting users or developers to know about file format details.
It is useful to know where the error is for future analysis.
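One way to satisfy both concerns is to keep the wording readable while naming the structure in which the duplicate was found. The sketch below uses `java.text.MessageFormat` with the same `{0}` placeholder style as jar.properties; the second placeholder, the pattern text, and the class `DuplicateWarning` are hypothetical and are not the messages in the PR.

```java
import java.text.MessageFormat;

// Hypothetical helper, for illustration only.
class DuplicateWarning {
    // Assumed pattern: {0} is the entry name, {1} names the structure.
    static final String PATTERN =
            "Warning: More than one copy of {0} is detected in the {1}";

    static String inCen(String entryName) {
        return MessageFormat.format(PATTERN, entryName, "central directory");
    }

    static String inLoc(String entryName) {
        return MessageFormat.format(PATTERN, entryName, "local file headers");
    }
}
```

For example, `DuplicateWarning.inCen("META-INF/MANIFEST.MF")` produces a message that tells a maintainer where to look without requiring the user to understand the abbreviations CEN and LOC.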
warn.validator.duplicate.entry=\
        Warning: More than one copy of {0} is detected
warn.validator.inconsistent.content=\
        Warning: The list of entries does not match the content
This message could be more specific to the type of error found
@@ -23,7 +23,7 @@

 /*
  * @test
- * @bug 8335912
+ * @bug 8335912 8345431
I would suggest moving the validation of multiple entries and LOC/CEN mismatches into a separate test.
This PR checks the jar file to ensure that entries are consistent between the central directory and the local file headers. It also checks that there are no duplicate entry names that could accidentally override the desired content.
Progress
Issue
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24430/head:pull/24430
$ git checkout pull/24430
Update a local copy of the PR:
$ git checkout pull/24430
$ git pull https://git.openjdk.org/jdk.git pull/24430/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 24430
View PR using the GUI difftool:
$ git pr show -t 24430
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24430.diff
Using Webrev
Link to Webrev Comment