Incorrect checksum check in get_object when using Range="0-" for an object with composite checksum #3346
Labels: bug (confirmed bug), p1 (high priority), potential-regression (to be checked by a team member), s3
Describe the bug
Making a get_object request to the S3 API with and without Range="0-" for an object that was uploaded in multiple parts returns different checksums:
Without Range, the checksum is: UymZPQ==-7
With Range="0-", the checksum is: UymZPQ==
When Range="0-" is specified, we don't hit the case here, so we treat the checksum as a "full object" checksum and try to verify the integrity of the downloaded content against it. It clearly isn't a "full object" checksum, so the check fails incorrectly.
The code linked above is copied here for convenience:
This is a regression in v1.36.0 due to boto/boto3#4392.
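To illustrate why validating the full body against a composite checksum cannot succeed, here is a minimal sketch. It assumes CRC32 as the algorithm (the 4-byte value in this report is consistent with that, but it is an assumption) and follows the documented S3 scheme for multipart uploads, where the composite value is the checksum of the concatenated raw part checksums followed by "-<number of parts>". The helper names are hypothetical and this is not botocore code.

```python
import base64
import zlib
from typing import List


def crc32_b64(data: bytes) -> str:
    """Base64 of the big-endian CRC32 digest of `data`."""
    return base64.b64encode(zlib.crc32(data).to_bytes(4, "big")).decode("ascii")


def composite_crc32(parts: List[bytes]) -> str:
    """Checksum of checksums: CRC32 over the concatenated raw part digests,
    followed by '-<part count>'."""
    digests = b"".join(zlib.crc32(part).to_bytes(4, "big") for part in parts)
    return f"{crc32_b64(digests)}-{len(parts)}"


# Three made-up parts standing in for a multipart upload.
parts = [b"a" * 10, b"b" * 10, b"c" * 10]
body = b"".join(parts)

full = crc32_b64(body)               # what a full-object check computes from the body
composite = composite_crc32(parts)   # what a multipart object carries, e.g. "....==-3"
stripped = composite.rsplit("-", 1)[0]  # same value with the "-3" suffix missing

print("full object:", full)
print("composite:  ", composite)
print("stripped:   ", stripped, "(matches full object?", stripped == full, ")")
```

If the "-N" suffix is missing from the header, as reported above for Range="0-", the value looks like valid Base64 and gets compared against the CRC32 of the downloaded bytes, which it was never derived from.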
Regression Issue
Expected Behavior
The checksum check should always pass for a file that was downloaded with no corruption.
Current Behavior
The checksum check fails when calling get_object with Range="0-", with stack trace:
Reproduction Steps
Possible Solution
It may be possible to use the x-amz-checksum-type header to determine if the checksum is COMPOSITE and ignore it if so. However, I'm not sure if this is a good idea, because it will remove checksum verification for get_object requests that match upload parts and therefore do have a valid checksum.
Additional Information/Context
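For reference, the idea floated under Possible Solution could look roughly like the sketch below. It is not botocore's implementation: should_validate_checksum is a hypothetical helper, the headers are assumed to be a plain mapping with lowercase keys, and the check assumes S3 reports COMPOSITE in x-amz-checksum-type for this kind of object.

```python
from typing import Mapping


def should_validate_checksum(headers: Mapping[str, str], algorithm: str = "crc32") -> bool:
    """Decide whether the response body should be verified against the
    x-amz-checksum-<algorithm> header (hypothetical helper)."""
    value = headers.get(f"x-amz-checksum-{algorithm}")
    if value is None:
        # No checksum header for this algorithm, nothing to validate.
        return False
    if "-" in value:
        # A "-N" suffix marks a checksum of part checksums, which is not valid Base64.
        return False
    # Proposed extra check: skip validation when S3 labels the checksum as composite,
    # even if the "-N" suffix is missing from the value.
    checksum_type = headers.get("x-amz-checksum-type", "")
    return checksum_type.upper() != "COMPOSITE"


# Header values modelled on this report; the checksum-type values are assumed.
print(should_validate_checksum({"x-amz-checksum-crc32": "UymZPQ==-7"}))
print(should_validate_checksum(
    {"x-amz-checksum-crc32": "UymZPQ==", "x-amz-checksum-type": "COMPOSITE"}
))
print(should_validate_checksum(
    {"x-amz-checksum-crc32": "UymZPQ==", "x-amz-checksum-type": "FULL_OBJECT"}
))
```

The caveat from Possible Solution still applies: a blanket skip on COMPOSITE would also drop verification for requests that line up with a single uploaded part, where the returned checksum may be valid for the returned bytes.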
We're calling this API via smart-open and so don't have direct control over the get_object parameters. I can file an issue there if this is a misuse of boto3, but it seems to me like they're making a valid API call that should be fixed.
SDK version used
1.36.3
Environment details (OS name and version, etc.)
Ubuntu 22.04