pr feedback
normanrz committed Jan 4, 2025
1 parent e96be80 commit a8809b7
Showing 2 changed files with 6 additions and 3 deletions.
5 changes: 3 additions & 2 deletions docs/user-guide/arrays.rst
@@ -581,8 +581,8 @@ With Zarr format 3, a new sharding feature has been added to address this issue.
With sharding, multiple chunks can be stored in a single storage object (e.g. a file).
Within a shard, chunks are compressed and serialized separately.
This allows individual chunks to be read independently.
-However, when writing data, a full shard must be written in one go for optimal performance and to
-avoid concurrency issues.
+However, when writing data, a full shard must be written in one go for optimal
+performance and to avoid concurrency issues.
That means that shards are the units of writing and chunks are the units of reading.
Users need to configure the chunk and shard shapes accordingly.

@@ -608,6 +608,7 @@ Sharded arrays can be created by providing the ``shards`` parameter to :func:`za

In this example a shard shape of (1000, 1000) and a chunk shape of (100, 100) is used.
This means that 10*10 chunks are stored in each shard, and there are 10*10 shards in total.
+Without the ``shards`` argument, there would be 10,000 chunks stored as individual files.
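
Review note (not part of the commit): the counts quoted in this hunk can be checked with a little arithmetic. The sketch below assumes a two-dimensional array of shape (10000, 10000), which is inferred from "10*10 shards" of shape (1000, 1000) and is not shown in the diff. The helper ``grid_counts`` is a hypothetical name used for illustration only and is not a Zarr API.

```python
import math


def grid_counts(array_shape, shard_shape, chunk_shape):
    """Return (chunks_per_shard, shards_total, chunks_total) for a regular grid."""

    def ceil_div(a, b):
        # Integer ceiling division, so partial edge pieces are counted too.
        return (a + b - 1) // b

    chunks_per_shard = math.prod(ceil_div(s, c) for s, c in zip(shard_shape, chunk_shape))
    shards_total = math.prod(ceil_div(a, s) for a, s in zip(array_shape, shard_shape))
    chunks_total = math.prod(ceil_div(a, c) for a, c in zip(array_shape, chunk_shape))
    return chunks_per_shard, shards_total, chunks_total


# Array shape (10000, 10000) is an assumption inferred from the text above.
counts = grid_counts((10000, 10000), (1000, 1000), (100, 100))
print(counts)  # (100, 100, 10000)
```

With sharding, the 10,000 chunks are grouped into 100 storage objects of 100 chunks each, which matches the figures in the added sentence.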

Missing features in 3.0
-----------------------
4 changes: 3 additions & 1 deletion docs/user-guide/performance.rst
@@ -69,7 +69,9 @@ Sharding
If you have large arrays but need small chunks to efficiently access the data, you can
use sharding. Sharding provides a mechanism to store multiple chunks in a single
storage object or file. This can be useful because traditional file systems and object
-storage systems may have issues storing and accessing many files.
+storage systems may have performance issues storing and accessing many files.
+Additionally, small files can be inefficient to store if they are smaller than the
+block size of the file system.
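
Review note (not part of the commit): the block-size point can be illustrated with a rough estimate. The sketch below assumes a file system that allocates space in whole 4096-byte blocks and a hypothetical chunk size of 1000 bytes after compression. Both numbers are illustrative assumptions. It also ignores file system metadata, any index stored inside a shard, and file systems that pack small files together.

```python
def allocated_bytes(file_size, block_size=4096):
    """Bytes occupied on disk when space is allocated in whole blocks."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


n_chunks = 10_000   # number of chunks, as in the arrays.rst example
chunk_size = 1_000  # hypothetical compressed chunk size in bytes

# One file per chunk: every small file still occupies a full block.
separate = n_chunks * allocated_bytes(chunk_size)

# All chunks packed into one object: only the last block is partly empty.
combined = allocated_bytes(n_chunks * chunk_size)

print(separate, combined)  # 40960000 10002432
```

Under these assumptions, storing each chunk as its own file occupies roughly four times the space of the packed layout.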

Picking a good combination of chunk shape and shard shape is important for performance.
The chunk shape determines what unit of your data can be read independently, while the
