[request]: S3 get folder (directory) size #464

Closed
ZelCloud opened this issue Feb 21, 2022 · 2 comments
Labels
feature-request A feature should be added or improved. high-level-library p2 This is a standard priority issue

Comments


ZelCloud commented Feb 21, 2022

A note for the community

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue, please leave a comment

Tell us about your request

I'd like for the s3 package to also include a way to get the total size for a folder (directory).

Something similar to

aws s3 ls --summarize --human-readable --recursive s3://bucket/folder

or from boto3

import boto3

def get_folder_size(bucket, prefix):
    total_size = 0
    for obj in boto3.resource('s3').Bucket(bucket).objects.filter(Prefix=prefix):
        total_size += obj.size
    return total_size

For example:
bucket_name/folder - 4 GB (total size, including subdirectories)

Tell us about the problem you're trying to solve.

We limit how much data can be uploaded to a folder, so knowing the total size, and by extension being able to show it to the user, is important. Also, if a folder is very large (e.g. 100 GB), we'd like to prevent the user from downloading everything in one shot, as opposed to a folder of 30 MB and a few dozen files.
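
To illustrate the two uses described above (showing the size to a user and gating large downloads), here is a minimal sketch in Rust. Both function names and the output format are hypothetical, chosen only for illustration; neither is part of aws-sdk-rust, and the byte total is assumed to come from a separate listing step.

```rust
/// Format a byte count with binary units (KiB, MiB, GiB, TiB).
/// The output format is an assumption made for this sketch; it is not
/// guaranteed to match `aws s3 ls --human-readable`.
fn human_readable(bytes: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KiB", "MiB", "GiB", "TiB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    // Divide by 1024 until the value fits the current unit.
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        format!("{} B", bytes)
    } else {
        format!("{:.1} {}", value, UNITS[unit])
    }
}

/// Hypothetical policy check: allow a one-shot download only when the
/// folder's total size does not exceed the configured threshold.
fn allow_bulk_download(folder_bytes: u64, max_bytes: u64) -> bool {
    folder_bytes <= max_bytes
}
```

With a 1 GiB threshold, a 30 MiB folder would pass the check while a 100 GiB folder would not.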

Are you currently working around this issue?

We're still in the process of migrating some services to Rust, and by extension aws-sdk-rust, so we aren't working around it, but it is blocking us from moving over fully.

Additional context

No response

@ZelCloud ZelCloud added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Feb 21, 2022
@rcoh rcoh added high-level-library and removed needs-triage This issue or PR still needs to be triaged. labels Feb 21, 2022
Contributor

rcoh commented Feb 21, 2022

Hello, and thanks for the feature request! The S3 client contains only code generated from the service models, but this would certainly be good functionality for a high-level S3 library. For the time being, I'd suggest porting the boto code to Rust using list_objects_v2: https://docs.rs/aws-sdk-s3/0.6.0/aws_sdk_s3/client/struct.Client.html#method.list_objects_v2

Be sure to use into_paginator() so that you read every page of results.

use std::error::Error;
use tokio_stream::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // Load region and credentials from the environment.
    let shared_config = aws_config::load_from_env().await;
    let s3 = aws_sdk_s3::Client::new(&shared_config);
    let prefix = "prefix";
    let bucket = "bucket";
    // The paginator follows continuation tokens, yielding one page at a time.
    let mut pages = s3
        .list_objects_v2()
        .bucket(bucket)
        .prefix(prefix)
        .into_paginator()
        .send();
    // Running total of object sizes, in bytes.
    let mut total: i64 = 0;
    while let Some(page) = pages.next().await {
        // `page?` propagates any listing error; a page with no objects adds 0.
        total += page?
            .contents()
            .unwrap_or_default()
            .iter()
            .map(|obj| obj.size)
            .sum::<i64>();
    }
    println!("total size: {}", total);
    Ok(())
}
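
One detail to watch when treating a prefix as a folder: S3 matches prefixes as plain strings, so the prefix "folder" also matches keys such as "folder-old/a.txt". A minimal sketch of a helper that normalizes the prefix before listing is shown here; `folder_prefix` is a hypothetical name and is not part of aws-sdk-rust.

```rust
/// Hypothetical helper: make a "folder" prefix end with exactly one '/',
/// so that "folder" does not also match keys under "folder-old/".
/// An empty input stays empty, which lists the whole bucket.
fn folder_prefix(folder: &str) -> String {
    // Drop any trailing slashes, then add a single one back.
    let trimmed = folder.trim_end_matches('/');
    if trimmed.is_empty() {
        String::new()
    } else {
        format!("{}/", trimmed)
    }
}
```

The result could then be passed to `.prefix(...)` in the listing call above.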

@jmklix jmklix added the p2 This is a standard priority issue label Nov 28, 2022
@jmklix jmklix closed this as completed Mar 1, 2024

github-actions bot commented Mar 1, 2024

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.
