Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

htsget download endpoints #695

Merged
merged 30 commits into from
Mar 15, 2024
Merged

htsget download endpoints #695

merged 30 commits into from
Mar 15, 2024

Conversation

MalinAhlberg
Copy link
Contributor

@MalinAhlberg MalinAhlberg commented Feb 27, 2024

Related issue(s) and PR(s)
This PR closes NBISweden/LocalEGA-SE-Deployment#725 and neicnordic/sda-download#372.

It is based on the work in neicnordic/sda-download#327.

Description

  • Adds endpoint /s3-encrypted which is similar to /s3 but returns an encrypted file.
  • Adds support for returning parts of encrypted files, either with url param endCoordinate or - for encrypted files - with the http header Range: bytes=0-1000. The later overrides the first, and is strictly interpreted, while endCoordinate will be used to find the nearest crypt4gh data block boundary and cut the file there.
  • Minor modifications to work with htsget-rs (see https://github.com/GenomicDataInfrastructure/starter-kit-htsget/tree/feature/rust-htsget)

How to test

  • get a token: token=$(curl -s -k https://localhost:8080/tokens | jq -r '.[0]')
  • to download the complete file: curl 'http://localhost:8443/s3-encrypted/DATASET0001/htsnexus_test_NA12878.bam' -H "Authorization: Bearer $token" -o test.bam.c4gh
  • to download the first 3000 bytes: curl 'http://localhost:8443/s3-encrypted/DATASET0001/htsnexus_test_NA12878.bam' -H "Authorization: Bearer $token" -o test3000.bam.c4gh -H "Range: bytes=0-3000
  • to download the first 3000 bytes, and adding extra bytes to align to the data package boundaries : curl 'http://localhost:8443/s3-encrypted/DATASET0001/htsnexus_test_NA12878.bam?endCoordinate=3000' -H "Authorization: Bearer $token" -o test3000ok.bam.c4gh
  • to get the file info curl --head -i 'http://localhost:8443/s3-encrypted/DATASET0001/htsnexus_test_NA12878.bam' -H "Authorization: Bearer $token"

Further comments

@MalinAhlberg MalinAhlberg force-pushed the feat/htsget-downoad-endpoints branch 5 times, most recently from af7a735 to 8a27350 Compare March 4, 2024 15:14
@MalinAhlberg MalinAhlberg changed the title htsget downoad endpoints htsget download endpoints Mar 4, 2024
@MalinAhlberg MalinAhlberg marked this pull request as ready for review March 4, 2024 15:20
@MalinAhlberg MalinAhlberg requested a review from a team March 4, 2024 15:20
Copy link
Contributor

@blankdots blankdots left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small things

sda-download/api/api.md Show resolved Hide resolved
sda-download/internal/config/config.go Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
@MalinAhlberg MalinAhlberg force-pushed the feat/htsget-downoad-endpoints branch from faf9d4c to c49f140 Compare March 5, 2024 13:14
sda-download/api/s3/s3.go Outdated Show resolved Hide resolved
@pontus
Copy link
Contributor

pontus commented Mar 5, 2024

I'm not sure what's in scope for this PR, should header reencryption be included or not? Some things may be premature without that.

@MalinAhlberg
Copy link
Contributor Author

MalinAhlberg commented Mar 5, 2024

I'm not sure what's in scope for this PR, should header reencryption be included or not? Some things may be premature without that.

Valid points! The scope of this PR is to be able to communicate and do requests with htsget + sda-download. Some future work is mentioned under Further comments in the PR body.
We will use these headers in communication with htsget, for the header encryption.

So, no, all things are not in place yet, but maybe good as a first step?

@blankdots
Copy link
Contributor

blankdots commented Mar 6, 2024

I would say, if we can keep the changes in the scope of htsget download endpoints that is ok, but removing the c4gh.password and adding info and options about re-encryption e.g. Client-public-key or Server-public-key maybe we should do the changes with that, as it will be easier to trace.

the reason i am mentioning this is because is while for htsget re-encryption might work in a certain way and that is ok, for serving decrypted filles we might need to generate fresh key pair every time (see changes in sda.go from https://github.com/neicnordic/sda-download/pull/330/files) and those changes need to come together.

sda-download/api/api.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@jbygdell jbygdell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some tests and error handling that's missing

sda-download/api/api.go Show resolved Hide resolved
sda-download/api/sda/sda.go Outdated Show resolved Hide resolved
sda-download/api/sda/sda.go Outdated Show resolved Hide resolved
sda-download/api/sda/sda.go Outdated Show resolved Hide resolved
sda-download/api/sda/sda_test.go Outdated Show resolved Hide resolved
sda-download/api/sda/sda.go Outdated Show resolved Hide resolved
sda-download/api/api.md Outdated Show resolved Hide resolved
sda-download/api/api.md Outdated Show resolved Hide resolved
sda-download/api/api.md Outdated Show resolved Hide resolved
sda-download/api/api.md Outdated Show resolved Hide resolved
sda-download/api/api.md Outdated Show resolved Hide resolved
sda-download/api/api.md Outdated Show resolved Hide resolved
sda-download/api/sda/sda.go Outdated Show resolved Hide resolved
only set the content-length for HEAD requests. The crypt4gh header
should be added to the size.
@MalinAhlberg MalinAhlberg force-pushed the feat/htsget-downoad-endpoints branch from 2784d99 to 7994e9f Compare March 14, 2024 10:30
Copy link
Contributor

@aaperis aaperis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@dbampalikis dbampalikis dismissed jbygdell’s stale review March 15, 2024 07:25

Requesting new review

@jbygdell jbygdell added this pull request to the merge queue Mar 15, 2024
Merged via the queue into main with commit 7b3df33 Mar 15, 2024
26 checks passed
@jbygdell jbygdell deleted the feat/htsget-downoad-endpoints branch March 15, 2024 07:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants