Make Hadoop 2/3 compatible with Java SDK for IAM #7634
No linked issues found. Please add the corresponding issues in the pull request description.
Thanks!
This one is hard to test; I suppose the real test will be that a system test works with lakeFS.
* GetCallerIdentityV4Presigner is that knows how to generate a presigned URL for the GetCallerIdentity API. The presigned URL is signed using SigV4.
* This class is extending AWS4Signer of AWS SDK version 1.7.4 and copies some functions from https://github.com/aws/aws-sdk-java/blob/1.7.4/src/main/java/com/amazonaws/auth/AWS4Signer.java
* The reason we copy some functions is that we need to support aws-hadoop-2 which depends on aws sdk 1.7.4 while aws-hadoop-3 depends on aws sdk 1.11.375.
* Everything that is copied starts with "overridden" prefix, a reasonable alternative would be to use @Override but, AWS made those functions final.
Nit:
- * GetCallerIdentityV4Presigner is that knows how to generate a presigned URL for the GetCallerIdentity API. The presigned URL is signed using SigV4.
- * This class is extending AWS4Signer of AWS SDK version 1.7.4 and copies some functions from https://github.com/aws/aws-sdk-java/blob/1.7.4/src/main/java/com/amazonaws/auth/AWS4Signer.java
- * The reason we copy some functions is that we need to support aws-hadoop-2 which depends on aws sdk 1.7.4 while aws-hadoop-3 depends on aws sdk 1.11.375.
- * Everything that is copied starts with "overridden" prefix, a reasonable alternative would be to use @Override but, AWS made those functions final.
+ * GetCallerIdentityV4Presigner is generates a presigned URL for the GetCallerIdentity API, signed using SigV4.
+ * This class extends AWS4Signer of AWS SDK version 1.7.4 and copies some functions from https://github.com/aws/aws-sdk-java/blob/1.7.4/src/main/java/com/amazonaws/auth/AWS4Signer.java
+ * The copied functions exist in AWS SDK 1.7.4 but not AWS SDK 1.11.375, so they are
+   not available on Hadoop AWS 3.
+ * Copied code has an "overridden" prefix: cannot use @Override as those functions are final.
updated in the new implementation
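For readers following the thread, here is a minimal sketch of the "overridden" prefix idea discussed above. `BaseSigner`, `Presigner`, `scope` and `overriddenScope` are hypothetical stand-ins, not the real AWS SDK or lakeFS classes; the sketch only illustrates why a renamed copy is used when a base-class method is final.

```java
public class FinalMethodCopyExample {
    // Hypothetical stand-in for an SDK base class whose helper is declared final.
    public static class BaseSigner {
        protected final String scope(String date) {
            return date + "/us-east-1/sts/aws4_request";
        }
    }

    public static class Presigner extends BaseSigner {
        // Redeclaring scope(String) here would not compile, because the base method is final.
        // A renamed copy does not touch the base method, so it also compiles
        // against a base class that no longer ships the helper at all.
        public String overriddenScope(String date, String region) {
            return date + "/" + region + "/sts/aws4_request";
        }
    }

    public static void main(String[] args) {
        System.out.println(new Presigner().overriddenScope("20240101", "us-east-1"));
    }
}
```

The trade-off is duplicated code that has to be kept in sync by hand, which is presumably why the comment records the exact SDK version it was copied from.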
@@ -108,15 +107,39 @@ public Request<GeneratePresignGetCallerIdentityRequest> newPresignedRequest() th
    public String newPresignedGetCallerIdentityToken() throws Exception {
        Request<GeneratePresignGetCallerIdentityRequest> signedRequest = this.newPresignedRequest();
        Map<String, ?> rawQueryParams = signedRequest.getParameters();
        Map<String, String> params = new HashMap<>();
        // check if the value is an array and join it with commas depends on the AWS SDK version
Not sure what "depends on the AWS SDK version" means here.
deleted
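For context, one plausible reading of the deleted comment is that the parameter values may be plain strings under one AWS SDK version and lists of strings under another. That is an assumption, not something this thread confirms. Under that assumption, a minimal sketch of normalizing a `Map<String, ?>` into a `Map<String, String>` could look like this (`QueryParamFlattener` and the sample header name are hypothetical):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class QueryParamFlattener {
    // Converts each value to a single string; list values are joined with commas.
    static Map<String, String> flatten(Map<String, ?> rawParams) {
        Map<String, String> params = new HashMap<>();
        for (Map.Entry<String, ?> entry : rawParams.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof List) {
                List<String> parts = new ArrayList<>();
                for (Object item : (List<?>) value) {
                    parts.add(String.valueOf(item));
                }
                params.put(entry.getKey(), String.join(",", parts));
            } else {
                params.put(entry.getKey(), String.valueOf(value));
            }
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, Object> raw = new HashMap<>();
        raw.put("Action", "GetCallerIdentity");
        // Hypothetical multi-valued parameter, used only for illustration.
        raw.put("X-Example-Values", Arrays.asList("a", "b"));
        System.out.println(flatten(raw));
    }
}
```

Checking the runtime type keeps a single code path that compiles regardless of which value type the SDK declares.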
clients/hadoopfs/src/main/java/io/lakefs/auth/AWSLakeFSTokenProvider.java
StringBuilder pattern = new StringBuilder();

pattern
    .append(Pattern.quote("+"))
    .append("|")
    .append(Pattern.quote("*"))
    .append("|")
    .append(Pattern.quote("%7E"))
    .append("|")
    .append(Pattern.quote("%2F"));
Consider:
- StringBuilder pattern = new StringBuilder();
- pattern
-     .append(Pattern.quote("+"))
-     .append("|")
-     .append(Pattern.quote("*"))
-     .append("|")
-     .append(Pattern.quote("%7E"))
-     .append("|")
-     .append(Pattern.quote("%2F"));
+ StringBuilder pattern = new StringBuilder()
+     .append(Pattern.quote("+"))
+     .append("|")
+     .append(Pattern.quote("*"))
+     .append("|")
+     .append(Pattern.quote("%7E"))
+     .append("|")
+     .append(Pattern.quote("%2F"));
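To show what a pattern like this is typically used for, here is a hedged sketch of SigV4-style URL encoding: `URLEncoder` output is post-processed so that `+`, `*`, `%7E` and (for paths) `%2F` match the encoding SigV4 expects. `SigV4UrlEncoder` and `urlEncode` are illustrative names, and the sketch assumes the PR's pattern serves this purpose; it is not the PR's actual code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SigV4UrlEncoder {
    // Alternation of the literal sequences that URLEncoder emits differently from SigV4.
    private static final Pattern ENCODED_CHARACTERS = Pattern.compile(
            new StringBuilder()
                    .append(Pattern.quote("+")).append("|")
                    .append(Pattern.quote("*")).append("|")
                    .append(Pattern.quote("%7E")).append("|")
                    .append(Pattern.quote("%2F"))
                    .toString());

    static String urlEncode(String value, boolean path) {
        String encoded;
        try {
            encoded = URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
        Matcher matcher = ENCODED_CHARACTERS.matcher(encoded);
        StringBuffer buffer = new StringBuffer(encoded.length());
        while (matcher.find()) {
            String match = matcher.group();
            String replacement;
            if ("+".equals(match)) {
                replacement = "%20";               // space
            } else if ("*".equals(match)) {
                replacement = "%2A";               // asterisk must be escaped
            } else if ("%7E".equals(match)) {
                replacement = "~";                 // tilde is unreserved
            } else {
                replacement = path ? "/" : match;  // keep slashes only in paths
            }
            matcher.appendReplacement(buffer, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(buffer);
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(urlEncode("a b*c~d/e", false));
        System.out.println(urlEncode("a b*c~d/e", true));
    }
}
```

`Pattern.quote` matters here because `+` and `*` are regex metacharacters; without quoting, the alternation would not compile as intended.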
clients/hadoopfs/src/main/java/io/lakefs/auth/STSGetCallerIdentityPresigner.java
@@ -28,7 +28,7 @@ public TemporaryAWSCredentialsLakeFSTokenProvider(String scheme, Configuration c
  }
  AWSCredentialsProvider awsProvider = new AWSCredentialsProvider() {
  @Override
- public AWSCredentials getCredentials() {
+ public AWSCredentials getCredentials() {
nit:
- public AWSCredentials getCredentials() {
+ public AWSCredentials getCredentials() {
@Test
public void name() {
}
Why do we need this?
Hey @arielshaqed, thanks for the review so far. Sorry in advance for this, but after you mentioned that CI failed I noticed a big flaw in what I did, so I had to rethink the issue and do something else. Specifically regarding |
Yay, this is good stuff!
Please avoid "copy from v2", and prefer instead to specify the specific version of the AWS SDK from which code was copied, or the specific version of Hadoop if that is relevant.
* GetCallerIdentityV4Presigner is that knows how to generate a presigned URL for the GetCallerIdentity API.
* The presigned URL is signed using SigV4.
* TODO: when we move to AWS SDK v2, we can use the AWS SDK's implementation of this (depends on hadoop-aws upgrading their own AWS SDK dependency).
* * GetCallerIdentityV4Presigner is generates a presigned URL for the GetCallerIdentity API, signed using SigV4.
- * * GetCallerIdentityV4Presigner is generates a presigned URL for the GetCallerIdentity API, signed using SigV4.
+ * * GetCallerIdentityV4Presigner generates a presigned URL for the GetCallerIdentity API, signed using SigV4.
@@ -17,14 +18,22 @@ public void testProviderIdentityTokenSerde() throws Exception {
  conf.set("fs.lakefs." + Constants.TOKEN_AWS_CREDENTIALS_PROVIDER_SESSION_TOKEN_KEY_SUFFIX, "sessionToken");
  conf.set("fs.lakefs." + Constants.TOKEN_AWS_STS_ENDPOINT, "https://sts.amazonaws.com");

- AWSLakeFSTokenProvider provider = (AWSLakeFSTokenProvider)LakeFSTokenProviderFactory.newLakeFSTokenProvider(Constants.DEFAULT_SCHEME, conf);
+ AWSLakeFSTokenProvider provider = (AWSLakeFSTokenProvider) LakeFSTokenProviderFactory.newLakeFSTokenProvider(Constants.DEFAULT_SCHEME, conf);
Not sure we need to cast here, it seems like an upcast that is usually implicit.
I need it here because newLakeFSTokenProvider() returns LakeFSTokenProvider, while I'm testing code specific to AWSLakeFSTokenProvider (AWS) and I need its newPresignedGetCallerIdentityToken method.
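To make the distinction concrete: going from the factory's declared return type to a more specific implementation is a downcast, which Java requires to be explicit (an upcast is the implicit direction). A minimal sketch with hypothetical stand-in types (`TokenProvider`, `AwsTokenProvider`, `newTokenProvider`), not the real lakeFS classes:

```java
public class DowncastExample {
    // Hypothetical stand-in for the general provider type returned by a factory.
    public interface TokenProvider {
        String getToken();
    }

    // Hypothetical stand-in for a provider with an extra, implementation-specific method.
    public static class AwsTokenProvider implements TokenProvider {
        @Override
        public String getToken() {
            return "token";
        }

        public String newPresignedToken() {
            return "presigned-token";
        }
    }

    // The factory only promises the general type.
    public static TokenProvider newTokenProvider() {
        return new AwsTokenProvider();
    }

    public static void main(String[] args) {
        TokenProvider general = newTokenProvider();
        // general.newPresignedToken() would not compile: the method is not on the interface.
        AwsTokenProvider aws = (AwsTokenProvider) general; // explicit downcast
        System.out.println(aws.newPresignedToken());
    }
}
```

The cast throws `ClassCastException` at runtime if the factory returns a different implementation, which is acceptable in a test that configures the provider type itself.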
Our hadoop-lakeFS client supports both the Hadoop 2 and Hadoop 3 contracts, which bring in different AWS SDK versions: aws-hadoop-2 depends on AWS SDK 1.7.4, while aws-hadoop-3 depends on AWS SDK 1.11.375.
These AWS SDK versions have breaking changes between them.
In practice, code written against one AWS SDK version may not compile against the other.
To solve this, I copied the minimum code required to make it work on both versions.