DOC: create_array(..., data=,...) #2809
Labels: documentation, help wanted
Describe the issue linked to the documentation
I am very confused about the `data` argument in `create_array`. A common use case is to simply serialize an in-memory array, in which case I tend to pass it as the `data=in_memory_array` argument. However, I cannot find the `data` argument in the documentation.

Using IPython, on the other hand, `zarr.create_array` clearly has the argument, while `zarr.Group.create_array` doesn't seem to expose the interface. I am quite confused about the discrepancy. If this is intentional, please document it.

An LLM also suggests that passing the data at creation time is more efficient than writing it into the array afterwards. I have no idea whether this is true or not. `zarr.create_array(..., data=in_memory_data)` might indeed be more efficient, as it seems to be written asynchronously. But the documentation seems to be quite lacking on what the best practice is.

This might be a bit out of scope for this issue, so please tell me if it is. From the documentation, I don't really see how to leverage the asynchronous nature of the `zarr` implementation. A common pattern I encounter is that data is generated in parallel using multiprocessing (as it is CPU-bound) and persisted using `zarr` (probably disk-bound). Is there a preferred pattern for using `zarr` as an asynchronous sink for the generated data? If so, it would be great to include it in the docs.

Suggested fix for documentation
No response
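
The "parallel producers, single writer" pattern asked about above can be sketched with the standard library alone. This is not zarr's own asynchronous API and not a documented best practice. It is one possible arrangement in which workers generate blocks and only the calling process writes them. `make_block` and `fill_sink` are hypothetical names, and `sink` stands for any object that supports slice assignment, which a zarr array does.

```python
from concurrent.futures import Executor, as_completed


def make_block(index: int, block_len: int) -> list[float]:
    """Hypothetical CPU-bound producer: generate one block of values."""
    start = index * block_len
    return [float(i) ** 0.5 for i in range(start, start + block_len)]


def fill_sink(sink, n_blocks: int, block_len: int, executor: Executor) -> None:
    """Generate blocks in workers and write each one from the calling process."""
    # Map each future back to the block index it will produce.
    futures = {
        executor.submit(make_block, i, block_len): i for i in range(n_blocks)
    }
    # Write blocks in completion order; only this process touches the sink.
    for future in as_completed(futures):
        i = futures[future]
        sink[i * block_len:(i + 1) * block_len] = future.result()
```

For CPU-bound generation, the executor would typically be a `concurrent.futures.ProcessPoolExecutor` created under an `if __name__ == "__main__":` guard. If `sink` is a zarr array, choosing `block_len` as a multiple of the chunk length along that axis means each write covers whole chunks, which should avoid rewriting partially filled chunks.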