open_consolidated raises ValueError when attempting to read a specific subgroup #2820
When you run

    root = zarr.open_consolidated(
        'example.zarr',
        mode='r',
        path=""
    )

then it should work.
Thanks @d-v-b for your comment. But what if I only want the '/c' node? Do we need to open the whole group and then extract only the wanted node?
That's the current design, yes. Being able to directly open that group via the
The only thing giving me pause here is the potential for consolidated metadata at different levels of the hierarchy to have different (potentially inconsistent) consolidated metadata. And I'd be uneasy about
Performance-wise, there won't really be a difference, unless your consolidated metadata file is huge. So IMO, this isn't worth trying to support.
I'll just chime in here and say that I think that consolidated metadata is a bad idea (even though I helped implement it!) and that we should be looking for other more robust solutions to performance and consistency. We have a blog post coming out next week showing how Zarr 3 async concurrency allows us to quickly list large hierarchies without consolidated metadata.
Yeah, the (lack of) consistency issues make it hard to reason about. I do think that having ways to do in-memory operations on Group and Array structures (like
Me too! But doing that requires some notion of caching, which requires some understanding of when you can safely cache. When you follow this reasoning to its end, you reach Icechunk. 🙃
My preferred approach would be to have a very clear distinction between types / data structures that can do IO, and those that cannot. The fact that we want zarr groups to be both IO-free models (via metadata consolidation and attribute caching) and objects that can do IO via array / group creation is a source of a lot of problems. There are ways of separating these two concerns, but it would be a big departure from what users expect.
Thanks @d-v-b, @TomAugspurger, and @rabernat for your comments and thoughts. I’ve refactored the code to first read the root metadata and then extract the required node, as shown here. Please let me know if you have any further comments or suggestions. Otherwise, I’ll proceed to close this issue.
Zarr version: v3.0.1
Numcodecs version: v0.15.0
Python Version: 3.12
Operating System: Linux
Installation: conda
Description
Hi everyone,
I am working on fixing incompatibilities between Zarr-Python V3 and xarray DataTrees (PR10020), and I found that opening a consolidated group raises the following error:
Steps to reproduce
When using zarr.open_consolidated, I got the following error.
Apparently, the open_consolidated function does not detect the consolidated metadata at each nested node.
Additional output
No response