Design doc: multiple manifests per array #713

Open
wants to merge 4 commits into
base: main
from

Conversation

dcherian (Contributor)

Ready for 30% feedback. I haven't thought about implementation yet; I'll look at that by tomorrow lunchtime.

dcherian requested a review from paraseba on February 10, 2025 21:33

kthyng commented Feb 11, 2025

@dcherian Is this change related to subchunking within a reference file? I've been trying this out in kerchunk and it has been useful. Also, is there a possibility of having multiple representations of the same model output within the same reference file? That is, be able to access the same reference file but use one part of it if accessing a time series and another if accessing a single time step. Thanks for any info!

dcherian (Contributor, Author) commented Feb 11, 2025

> Is this change related to subchunking within a reference file?

No, it's about splitting up very large manifests. This "subchunking" is already supported.

> That is, be able to access the same reference file but use one part of it if accessing a time series and another if accessing a single time step.

Presumably these are separated as different arrays or arrays with the same name but in different groups. So that's already supported.


kthyng commented Feb 11, 2025

> it's about splitting up very large manifests

I guess I mean being both User 1 and User 2 from your text, at different times. So if I want to read in a time series at one depth I would access one reference file (User 1), and another if I want to read all of a few time steps (User 2). Is that the idea?

dcherian (Contributor, Author) commented Feb 13, 2025

> I guess I mean being both User 1 and User 2 from your text, at different times. So if I want to read in a time series at one depth I would access one reference file (User 1), and another if I want to read all of a few time steps (User 2). Is that the idea?

Ah, this is the "multiply chunked arrays" idea. No, that is not the focus here. It is an idea I like a lot, though.


3 participants