we've had a few bugs and regressions recently related to how we resolve the names of arrays and groups in storage: see #2830, #2826, #2765, #2716, and maybe more.
here we should discuss how we want to prevent these bugs / regressions, and what our approach to node name resolution (the process of mapping strings onto stored arrays / groups) should be. Whatever we come up with should be properly documented so that the results are not surprising to users.
for reference, here's what we had in v2:

- `group.__getitem__` would normalize inputs, so that different strings would resolve to the same path.
- `group.__getitem__` supported both absolute and relative paths -- `a['']` uses a local path to resolve to `a` (self-reference), but `a['/']` uses an absolute path to resolve to the root group. This means a group could overwrite its own parent, e.g. turning its parent into an array (and thereby self-annihilating).
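To make the v2-style semantics concrete, here is a minimal pure-Python sketch of that kind of resolution. The names `normalize` and `resolve_v2_style` are hypothetical and this is not zarr's actual implementation; it only assumes that normalization collapses repeated slashes and strips leading and trailing ones, and that a key starting with `/` is treated as absolute.

```python
def normalize(path: str) -> str:
    """Hypothetical helper: collapse repeated slashes and strip
    leading/trailing ones, so many strings map to one path."""
    return "/".join(segment for segment in path.split("/") if segment)


def resolve_v2_style(group_path: str, key: str) -> str:
    """Hypothetical sketch of v2-style lookup (an assumption, not zarr's code).

    A key starting with "/" is absolute (relative to the root, written "");
    any other key is resolved relative to ``group_path``.
    """
    if key.startswith("/"):
        return normalize(key)
    return normalize(f"{group_path}/{key}")


# The group "a" can reach itself and the root (""):
print(repr(resolve_v2_style("a", "")))   # self-reference
print(repr(resolve_v2_style("a", "/")))  # root group
# Different strings resolve to the same node:
print(resolve_v2_style("a", "b") == resolve_v2_style("a", "b////"))
```

Under these assumptions the same node is reachable through many different strings, and a group can address both itself and its ancestors.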
We are not very consistent with these semantics in zarr right now. Specifically, we don't support absolute paths from group operations, so `a.create_group('/b')` will make a group at `a//b` instead of `b`. I think creating a group at `a//b` is a bug we should fix, but I'm not so sure that we want group methods to handle absolute paths.
For simplicity, I would suggest that all paths be relative, and also that we take necessary steps to prevent groups from self-annihilating (e.g., preventing groups from getting a self-reference via `group['']`). The main motivation is simplicity: keeping name resolution easy to understand, and preventing situations where our objects can self-destruct.
I would also prefer an outcome where, for a given zarr group, there is a 1:1 mapping between names and nodes, so that `group['a/////']` would raise a `KeyError` instead of us silently coercing the input to `a`. But I would like to hear from people who rely on the v2-style behavior to know how disruptive this would be, and to hear any other ideas people might have.
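As a rough illustration of the stricter proposal, the sketch below treats every key as relative and rejects any key that is not already in canonical form. `resolve_strict` is a hypothetical name, and the exact rejection rules (empty keys, leading or trailing slashes, repeated slashes) are assumptions about what a 1:1 mapping would require, not a settled design.

```python
def resolve_strict(group_path: str, key: str) -> str:
    """Hypothetical strict resolver: relative keys only, no normalization.

    Any empty segment (an empty key, a leading or trailing "/", or a
    repeated "/") raises KeyError, so each node has exactly one name.
    """
    if any(segment == "" for segment in key.split("/")):
        raise KeyError(key)
    # The root group is written "" here; its children have no prefix.
    return f"{group_path}/{key}" if group_path else key


print(resolve_strict("a", "b"))  # resolves to the child "a/b"

for bad_key in ["a/////", "", "/b"]:
    try:
        resolve_strict("a", bad_key)
    except KeyError:
        print(f"rejected {bad_key!r}")
```

With rules like these a group cannot obtain a reference to itself or to its parent, at the cost of breaking callers that rely on inputs being coerced.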