Mismatch of Array.store_path.store.list_prefix() and Array._iter_chunk_keys() in chunks_initialized() for an array in a group #2861

Open
relativityhd opened this issue Feb 25, 2025 · 2 comments · May be fixed by #2862
Labels
bug Potential issues with the zarr-python library

Comments

@relativityhd

Zarr version

v3.0.2

Numcodecs version

v0.15.1

Python Version

3.11

Operating System

Linux

Installation

uv add zarr

Description

Currently, it is not possible to view how many chunks are initialized in an array inside a group.

I was able to track it down to a mismatch of the outputs of the functions z.store_path.store.list_prefix(prefix=z.store_path.path) and z._iter_chunk_keys().

Steps to reproduce

import zarr
from zarr.core.array import chunks_initialized

# Create a group with two arrays
root = zarr.group("reproduction.zarr")
root.create("x", shape=(1000, 1000), chunks=(100, 100), dtype="f4")
root.create("y", shape=(1000, 1000), chunks=(100, 100), dtype="uint8")
root.tree()

# Now check which chunks are initialized (should be an empty tuple)
await chunks_initialized(root["x"])
> ()
# Now fill two chunks
root["x"][0:100, 0:100] = 1.0
root["x"][100:200, 0:100] = 2.0

# Now check which chunks are initialized (should be a tuple with two chunks)
await chunks_initialized(root["x"])
> ()  # <-- This shouldn't be empty
# Further investigate what happens in "chunks_initialized()"
store_contents = [x async for x in root["x"].store_path.store.list_prefix(prefix=root["x"].store_path.path)]
store_contents
> ['x/zarr.json', 'x/c/1/0', 'x/c/0/0']
tuple(chunk_key for chunk_key in root["x"]._iter_chunk_keys())

> ('c/0/0',
 'c/0/1',
 'c/0/2',
 'c/0/3',
 'c/0/4',
 'c/0/5',
 'c/0/6',
 'c/0/7',
 'c/0/8',
 'c/0/9',
 'c/1/0',
 'c/1/1',
 'c/1/2',
 'c/1/3',
 'c/1/4',
...
 'c/9/5',
 'c/9/6',
 'c/9/7',
 'c/9/8',
 'c/9/9')
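
The two outputs above cannot match as they stand: the store listing is prefixed with the array's path (`x/`), while the chunk keys are relative to the array. Below is a minimal sketch in plain Python, using values copied from the outputs above. It assumes that `chunks_initialized()` keeps only the chunk keys that also appear in the store listing; that is inferred from this report, not checked against the zarr source.

```python
# Values copied from the outputs above (chunk keys truncated to the first few).
store_contents = ["x/zarr.json", "x/c/1/0", "x/c/0/0"]
chunk_keys = ("c/0/0", "c/0/1", "c/1/0", "c/1/1")

# Assumed membership check: keep chunk keys that are present in the store listing.
initialized = tuple(key for key in chunk_keys if key in store_contents)
print(initialized)  # ()
```

Because no relative key such as `c/0/0` equals a prefixed key such as `x/c/0/0`, the result is empty even though two chunks were written.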

Additional output

No response

@relativityhd relativityhd added the bug Potential issues with the zarr-python library label Feb 25, 2025
@d-v-b
Contributor

d-v-b commented Feb 25, 2025

Good report, and sorry for the bug. Seems like we need to relativize the output of list_prefix, or absolutify the values in iter_chunk_keys. This is so obviously wrong that I'm wondering how our tests did not catch it.
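
Either option mentioned in this comment can be sketched in plain Python. The helpers `relativize` and `absolutify` below are hypothetical names made up for illustration, not zarr functions, and the sketch makes no claim about how the linked pull request actually fixes the bug.

```python
def _prefix(array_path: str) -> str:
    """Return the array path with a single trailing slash, or '' for a root array."""
    return array_path.rstrip("/") + "/" if array_path else ""


def relativize(store_keys, array_path):
    """Option 1 (hypothetical helper): strip the array path from store keys."""
    prefix = _prefix(array_path)
    return [key.removeprefix(prefix) for key in store_keys]


def absolutify(chunk_keys, array_path):
    """Option 2 (hypothetical helper): prepend the array path to chunk keys."""
    prefix = _prefix(array_path)
    return [prefix + key for key in chunk_keys]


# Values taken from the report; chunk keys truncated to a few entries.
store_contents = ["x/zarr.json", "x/c/1/0", "x/c/0/0"]
chunk_keys = ["c/0/0", "c/0/1", "c/1/0"]

# Option 1: compare relative chunk keys against relativized store keys.
relative_store = set(relativize(store_contents, "x"))
via_relative = [key for key in chunk_keys if key in relative_store]

# Option 2: compare absolutified chunk keys against the raw store listing.
absolute_store = set(store_contents)
via_absolute = [
    key
    for key, absolute in zip(chunk_keys, absolutify(chunk_keys, "x"))
    if absolute in absolute_store
]

print(via_relative)  # ['c/0/0', 'c/1/0']
print(via_absolute)  # ['c/0/0', 'c/1/0']
```

Both variants report the two written chunks. The point is only that the two sides of the comparison need to use the same key convention.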

@relativityhd
Author

No problem, I am glad I could help. :)

@d-v-b d-v-b linked a pull request Feb 25, 2025 that will close this issue