From 65bc69fe139da34dc8cc9591cf7fbd3335b3e5d5 Mon Sep 17 00:00:00 2001
From: Josh Moore
Date: Wed, 12 Feb 2025 09:32:18 +0100
Subject: [PATCH] Merge Davis proposal with ZEP0009

Remaining text blocks are likely to be re-used under the more
general "Extension points" section.

see: https://github.com/zarr-developers/zarr-specs/pull/312
---
 docs/v3/codecs.rst    |  4 ++-
 docs/v3/core/v3.0.rst | 65 ++++++++++++++++++++++---------------------
 2 files changed, 37 insertions(+), 32 deletions(-)

diff --git a/docs/v3/codecs.rst b/docs/v3/codecs.rst
index 0bb25363..e1f34bec 100644
--- a/docs/v3/codecs.rst
+++ b/docs/v3/codecs.rst
@@ -2,7 +2,9 @@
 Codecs
 ======
 
-Under construction.
+The following documents specify codecs which are defined by the maintainers
+of the Zarr specification. Being listed below does not imply that a codec is
+required to be implemented by implementations.
 
 .. toctree::
    :glob:
diff --git a/docs/v3/core/v3.0.rst b/docs/v3/core/v3.0.rst
index 92c227d8..89fcb1ca 100644
--- a/docs/v3/core/v3.0.rst
+++ b/docs/v3/core/v3.0.rst
@@ -295,7 +295,7 @@ The following figure illustrates the first part of the terminology:
 
 *Codec*
 
-    The list of *codecs* specified for an array_ determine the encoded byte
+    The list of *codecs* specified for an array_ determines the encoded byte
     representation of each chunk in the store_.
 
 .. _metadata document:
@@ -632,12 +632,9 @@ mandatory names:
 ^^^^^^^^^^
 
     Specifies a list of codecs to be used for encoding and decoding chunks. The
-    value must be an array of objects, each object containing a member with
-    ``name`` whose value is a string referring to a v3 codec specification. The
-    codec object may also contain a ``configuration`` object which consists of
-    the parameter names and values as defined by the corresponding codec
-    specification. Since an ``array -> bytes`` codec must be specified, the
-    list cannot be empty.
+    value MUST be an array of extension definitions as defined under TODO.
+    Because ``codecs`` MUST contain an ``array
+    -> bytes`` codec, the list cannot be empty (See :ref:`codecs `).
 
 The following members are optional:
 
@@ -1204,19 +1201,37 @@ the following procedure:
 
 4. The chunk array ``A`` is equal to ``EC[0]``.
 
-Specifying codecs
------------------
+.. _codec-specification:
 
-To allow for flexibility to define and implement new codecs, this
-specification does not define any codecs, nor restrict the set of
-codecs that may be used. Each codec must be defined via a separate
-specification. In order to refer to codecs in array metadata
-documents, each codec must have a unique identifier, which is a URI
-that dereferences to a human-readable specification of the codec. A
-codec specification must declare the codec identifier, and describe
+Core codecs
+-----------
+
+This spec defines a set of well-known codecs ("core codecs") which all Zarr implementations SHOULD implement in
+order to ensure a minimal level of interoperability between Zarr implementations.
+The list of core codecs is part of the Zarr v3 specification.
+Changes to the list of core codecs MUST be made via the same protocol used for
+changing the Zarr v3 specification. Changes to the list of core codecs SHOULD be made
+in close collaboration with extant Zarr v3 implementations. A new core codec SHOULD be added to the
+list when a sufficient number of Zarr implementations support or intend to support that codec.
+An existing core codec SHOULD be removed from the list when a sufficient number of implementation
+developers and Zarr users deem the codec worth removing, e.g. because of a technical flaw in the
+algorithm underlying the codec.
+
+Extension codecs
+----------------
+
+To allow for flexibility to define and implement new codecs, the
+list of codecs defined for an array MAY contain codecs which are
+defined in separate specifications.
In order to refer to codecs in array metadata
+documents, each codec must have a unique identifier, which is either
+a known "raw name" or a URI as defined under :ref:`extensions_section`.
+For ease of discovery, it is
+recommended that codec specifications are contributed to the
+registry of extensions (TODO).
+
+A codec specification must declare the codec identifier, and describe
 (or cite documents that describe) the encoding and decoding algorithms
 and the format of the encoded data.
-
 A codec may have configuration parameters which modify the behaviour of
 the codec in some way. For example, a compression codec may have a
 compression level parameter, which is an integer that affects the
@@ -1224,20 +1239,8 @@ resulting compression ratio of the data. Configuration parameters must
 be declared in the codec specification, including a definition of how
 configuration parameters are represented as JSON.
 
-The Zarr core development team maintains a repository of codec
-specifications, which are hosted alongside this specification in the
-`zarr-specs GitHub repository`_, and which are
-published on the `zarr-specs documentation Web site
-`_. For ease of discovery, it is
-recommended that codec specifications are contributed to the
-zarr-specs GitHub repository. However, codec specifications may be
-maintained by any group or organisation and published in any location
-on the Web. For further details of the process for contributing a
-codec specification to the zarr-specs GitHub repository, see
-`ZEP 0 `_ which describes
-the process for Zarr specification changes.
-
-Further details of how codecs are configured for an array are given in the `Array metadata`_ section.
+Further details of how codecs are configured for an array are given in the
+`Array metadata`_ section.
 
 Stores
 ======
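
Reviewer sketch (not part of the patch): the ``codecs`` wording removed above describes a non-empty array of objects, each with a string ``name`` and an optional ``configuration`` object. The following is a minimal illustration of that shape check. The function name ``validate_codecs`` is hypothetical, the codec names and parameters in the usage note are only illustrative, and the "extension definitions" format that the patch defers to (still marked TODO) may end up differing from this.

```python
from typing import Any


def validate_codecs(codecs: Any) -> list[dict[str, Any]]:
    """Check the basic shape of a ``codecs`` metadata value.

    Hypothetical helper: it follows the pre-patch wording (objects with
    ``name`` and optional ``configuration``) and does NOT check that an
    ``array -> bytes`` codec is present, since that requires knowing
    each codec's kind.
    """
    if not isinstance(codecs, list):
        raise TypeError("codecs must be a JSON array (a Python list)")
    if not codecs:
        # The text says the list cannot be empty.
        raise ValueError("codecs cannot be empty")
    for index, codec in enumerate(codecs):
        if not isinstance(codec, dict):
            raise TypeError(f"codecs[{index}] must be a JSON object")
        if not isinstance(codec.get("name"), str):
            raise ValueError(f"codecs[{index}] needs a string 'name'")
        # ``configuration`` is optional, but must be an object if given.
        if not isinstance(codec.get("configuration", {}), dict):
            raise TypeError(f"codecs[{index}]['configuration'] must be an object")
    return codecs
```

For example, ``validate_codecs([{"name": "bytes", "configuration": {"endian": "little"}}, {"name": "gzip", "configuration": {"level": 5}}])`` returns the list unchanged, while ``validate_codecs([])`` raises ``ValueError``.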