
Instructions Don't Mention Need for AWS Credentials #7

Closed
bh2smith opened this issue Apr 22, 2024 · 4 comments · Fixed by #9

Comments

@bh2smith
Collaborator

bh2smith commented Apr 22, 2024

It's not clear from the instructions here or within this project what these AWS credentials are supposed to be. The code suggests they are credentials for the NEAR Lake configuration.

lake_config.aws_access_key_id = os.getenv("AWS_ACCESS_KEY_ID")
lake_config.aws_secret_key = os.getenv("AWS_SECRET_ACCESS_KEY")

Where does one acquire these credentials?

@anthony-near
Collaborator

@bh2smith try this and if it works I'll update the README with instructions.

Instructions for creating AWS credentials for s3 read access for the indexer (related link):

Step 1: Sign in to the AWS Management Console
Navigate to the AWS Management Console (https://aws.amazon.com/console/).
Sign in using your AWS account credentials.

Step 2: Navigate to IAM Dashboard
In the search bar at the top of the console, type "IAM" and select "IAM - Manage access to AWS resources" from the dropdown suggestions.

Step 3: Create a New IAM User
In the IAM dashboard, select “Users” from the navigation pane on the left.
Click the “Add user” button.
Enter a user name for the new user.
Select “Programmatic access” as the access type. This enables an access key ID and secret access key for the AWS API, CLI, SDK, and other development tools.
Click “Next: Permissions”.

Step 4: Set Permissions
Click on “Attach existing policies directly”.
In the search bar, type “AmazonS3ReadOnlyAccess” to find the policy that grants read-only access to S3.
Check the box next to the “AmazonS3ReadOnlyAccess” policy to select it.
Click “Next: Tags” (optional: you can add metadata to the user by attaching tags).
Click “Next: Review”.

Step 5: Review and Create User
Review the user details and the permissions summary to ensure everything is correct.
Click “Create user”.
On the success page, you will see the user’s access key ID and secret access key. Click “Download .csv” to save these credentials. Important: This is the only time you can download or view the secret access key, so make sure to keep it secure.

Step 6: (Optional) Restrict User Access to Specific S3 Buckets
If you need the user to access specific S3 buckets instead of having read access to all S3 buckets, you will need to create a custom policy with restricted permissions and attach it to the user instead of using the AmazonS3ReadOnlyAccess policy.
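As an illustration, a custom policy scoped to a single bucket might look like the following. The bucket name `near-lake-data-mainnet` is an assumption here (the NEAR Lake mainnet data bucket); substitute whichever bucket your indexer reads from:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::near-lake-data-mainnet",
        "arn:aws:s3:::near-lake-data-mainnet/*"
      ]
    }
  ]
}
```

Attach this policy to the user in place of AmazonS3ReadOnlyAccess.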

Step 7: Load AWS Credentials into Environment Variables
To use the AWS credentials securely in your development environment without hardcoding them, you can load the Access Key ID and Secret Access Key into environment variables. This approach enhances security and makes it easier to manage credentials across different environments. Here's how you can set these environment variables on various operating systems:

For macOS or Linux:
Open the Terminal.
Use the export command to set the credentials as environment variables, replacing <Your-Access-Key-ID> and <Your-Secret-Access-Key> with your actual credentials.

export AWS_ACCESS_KEY_ID=<Your-Access-Key-ID>
export AWS_SECRET_ACCESS_KEY=<Your-Secret-Access-Key>

To make these variables persist across sessions, you can add the above commands to your shell's profile script (e.g., ~/.bash_profile, ~/.bashrc, ~/.zshrc, etc.).
Edit the profile script with a text editor and append the export commands.
Save the changes and restart your terminal or source the profile script (e.g., source ~/.bash_profile).
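Once exported, the indexer can read the credentials the same way the snippet at the top of this issue does. A minimal sketch for failing fast when they are missing (the helper name is mine, not part of the project):

```python
import os


def load_aws_credentials():
    """Return (access_key, secret_key) from the environment, or raise."""
    access_key = os.getenv("AWS_ACCESS_KEY_ID")
    secret_key = os.getenv("AWS_SECRET_ACCESS_KEY")
    if not access_key or not secret_key:
        raise RuntimeError(
            "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set "
            "before starting the indexer"
        )
    return access_key, secret_key
```

Calling this before constructing the lake config gives a clear error instead of a confusing S3 failure later.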

@bh2smith
Collaborator Author

bh2smith commented Apr 23, 2024

Ok, so I got myself set up with the AWS credentials. However, this service is now producing a flood of botocore error logs, hundreds of these every second:

Traceback (most recent call last):
  File ".venv/lib/python3.12/site-packages/near_lake_framework/s3_fetchers.py", line 59, in fetch_shard_or_retry
    response = await s3_client.get_object(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/aiobotocore/client.py", line 408, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.

This was already happening on the commit before any of my PRs, and it still happens now.

I tracked it down to near-lake-framework and the key appears to be this:

response = await s3_client.get_object(
                Bucket=s3_bucket_name,
                Key="{:012d}/shard_{}.json".format(block_height, shard_id),
                RequestPayer="requester",
            )
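For reference, the key the framework requests is the block height zero-padded to 12 digits followed by the shard id. A quick sketch of that format (the function name is mine):

```python
def shard_key(block_height: int, shard_id: int) -> str:
    # Same format string the framework uses to build the S3 object key.
    return "{:012d}/shard_{}.json".format(block_height, shard_id)


print(shard_key(113000000, 0))  # 000113000000/shard_0.json
```

So a NoSuchKey error means that particular shard file simply isn't present in the bucket at that height.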

Does this have something to do with my AWS setup?

Update: This appears to be a poorly handled exception in near-lake-framework. I reported an issue here and am still trying to figure out how to import this botocore error class so we can handle missing shards differently.

frolvanya/near-lake-framework-py#11

The idea would be to handle the missing shard without spewing a bunch of dirty traces.

        except botocore.errorfactory.NoSuchKey:
            logging.warning("Failed to fetch shard {}".format(shard_key))
        except Exception:
            traceback.print_exc()

Just need to figure out how to import botocore.errorfactory.NoSuchKey.

Update 2. Made a PR, but it doesn't seem like this project has active maintenance.

If we don't hear anything in the next day or two there are two options:

  1. Fork this near lake project and redistribute and update.
  2. Migrate this repo to Rust and use a real lake framework: https://github.com/near/near-lake-framework-rs.

I am in favour of option 2 (since it's still a very small project).

@anthony-near
Collaborator

@bh2smith if you don't hear back on the near-lake-framework-py PR soon, then we can migrate the repo to Rust, since the https://github.com/near/near-lake-framework-rs repo is well maintained and Rust is the default for NEAR projects. FYI I might not have the bandwidth, but if I do I can take on the rewrite.

@frolvanya

@bh2smith @anthony-near Hi, the near-lake-framework-py maintainer here. I've merged all PRs and updated the package on PyPI. Sorry it took longer than expected.

bh2smith added a commit that referenced this issue May 29, 2024
1. Drop CI python version to 3.11 (near lake framework doesn't work with 3.12)
2. Load .env in Makefile
3. [Minor] Fix enumeration in Readme
4. 🔑 Adapt main file to use latest near-lake-framework
5. Use >= on requirements file (also version bump near-lake-framework to 0.0.8)
6. Add AWS env details to readme (closes #7)