Open Questions:
Auth: Do we want to target only servers that have pre-configured write permissions to a bucket, or do we need a way to pass auth credentials to the server? If so, what would that look like?
Path: Do we need additional information beyond the bucket name, such as a file prefix (e.g., to support a "folder" within the bucket that incorporates a timestamp) or a service provider (to support cases where the server writes to a bucket hosted by a different cloud vendor)?
Completion: Should we require that the output manifest file be written to the bucket last, so it can serve as an event to trigger follow-up actions (e.g., a de-id step or a db load)? Or would we expect clients to use job polling to determine that all files have been written?
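If we go the polling route, the client side might look like this sketch. The status fields and manifest shape here are assumptions for illustration, not a settled API:

```python
import time

def poll_for_manifest(fetch_status, interval_s=5.0, max_attempts=60):
    """Poll a hypothetical job-status endpoint until the export completes.

    `fetch_status` stands in for an HTTP GET against the job's status URL;
    the real request/response shape is one of the open questions above.
    """
    for _ in range(max_attempts):
        status = fetch_status()
        if status.get("state") == "complete":
            # The manifest lists every file written to the bucket, so
            # follow-up actions (de-id, db load) can start safely here.
            return status["manifest"]
        time.sleep(interval_s)
    raise TimeoutError("export job did not complete within the polling budget")

# Usage with a stubbed status endpoint:
responses = iter([
    {"state": "in-progress"},
    {"state": "in-progress"},
    {"state": "complete", "manifest": ["obs.ndjson", "patients.ndjson"]},
])
manifest = poll_for_manifest(lambda: next(responses), interval_s=0)
print(manifest)
```

Writing the manifest last would let bucket-notification events replace this loop entirely, which is part of what the question is weighing.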
Re: auth and paths, I've been super impressed with the open-source rclone project, which has thought carefully and comprehensively about authorization for bucket access.
They have a JSON file describing their schema for cloud storage services, including provider types (e.g., s3) and providers that offer endpoints (e.g., AWS, DigitalOcean, or Wasabi, all of which offer S3-compatible APIs). So I might have a remote configured like:
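A sketch of such a remote in rclone's INI-style config; the remote name and credential values are placeholders, not real settings:

```ini
[my-remote]
# "type" is the protocol family; "provider" is the concrete vendor
# offering an s3-compatible endpoint.
type = s3
provider = Wasabi
env_auth = false
access_key_id = PLACEHOLDER_KEY_ID
secret_access_key = PLACEHOLDER_SECRET
endpoint = s3.wasabisys.com
```

Separating the protocol type from the provider is what lets one auth/path schema cover buckets hosted by different cloud vendors, which speaks to both the Auth and Path questions above.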