From 8a9201753f36c64b45b6fa91b56864e1ce103bb2 Mon Sep 17 00:00:00 2001
From: Jeff Peck
Date: Thu, 21 Dec 2023 14:33:29 -0500
Subject: [PATCH 1/3] Update tutorial.rst to include section about accessing Zip Files on S3

Per discussion here, add information about accessing zip files on s3:
https://github.com/zarr-developers/zarr-python/discussions/1613
---
 docs/tutorial.rst | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/docs/tutorial.rst b/docs/tutorial.rst
index 4099bac1c8..63967b13c1 100644
--- a/docs/tutorial.rst
+++ b/docs/tutorial.rst
@@ -1000,6 +1000,32 @@ separately from Zarr.
 
 .. _tutorial_copy:
 
+Accessing Zip Files on S3
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The built-in `ZipStore` will only work with paths on the local file-system, however
+it is also possible to access ``.zarr.zip`` data on the cloud. Here is an example of
+accessing a ``.zarr.zip`` file on s3:
+
+    >>> s3_path = "s3://path/to/my.zarr.zip"
+    >>>
+    >>> s3 = s3fs.S3FileSystem()
+    >>> f = s3.open(s3_path)
+    >>> fs = ZipFileSystem(f, mode="r")
+    >>> store = FSMap("", fs, check=False)
+    >>>
+    >>> # cache is optional, but may be a good idea depending on the situation
+    >>> cache = zarr.storage.LRUStoreCache(store, max_size=2**28)
+    >>> z = zarr.group(store=cache)
+
+This store can also be generated with ``fsspec``'s handler chaining, like so:
+
+    >>> store = zarr.storage.FSStore(url=f"zip::{s3_path}", mode="r")
+
+Note that this is intended for a read-only data source. However, this can be
+especially useful if you have a very large ``.zarr.zip`` file on s3 and only need
+to access a small portion of it.
+
 Consolidating metadata
 ~~~~~~~~~~~~~~~~~~~~~~
 

From 6d8b2f7ab5e35a9733292785febafea7706bbc80 Mon Sep 17 00:00:00 2001
From: Jeff Peck
Date: Mon, 25 Dec 2023 12:25:49 -0500
Subject: [PATCH 2/3] Update release.rst

---
 docs/release.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/release.rst b/docs/release.rst
index c18e0b8c20..a6f85224ba 100644
--- a/docs/release.rst
+++ b/docs/release.rst
@@ -45,6 +45,8 @@ Docs
 
 * Minor tweak to advanced indexing tutorial examples.
   By :user:`Ross Barnowski ` :issue:`1550`.
+* Added section about accessing zip files that are on s3.
+  By :user:`Jeff Peck ` :issue:`1613`.
 
 Maintenance
 ~~~~~~~~~~~

From 95715a8562b6a6e87780f5a2db165f206b24942d Mon Sep 17 00:00:00 2001
From: Josh Moore
Date: Tue, 16 Jan 2024 12:58:40 +0100
Subject: [PATCH 3/3] Implement d-v-b's suggestions

---
 docs/tutorial.rst | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/docs/tutorial.rst b/docs/tutorial.rst
index 63967b13c1..351eef064a 100644
--- a/docs/tutorial.rst
+++ b/docs/tutorial.rst
@@ -1005,7 +1005,7 @@ Accessing Zip Files on S3
 
 The built-in `ZipStore` will only work with paths on the local file-system, however
 it is also possible to access ``.zarr.zip`` data on the cloud. Here is an example of
-accessing a ``.zarr.zip`` file on s3:
+accessing a zipped Zarr file on s3:
 
     >>> s3_path = "s3://path/to/my.zarr.zip"
     >>>
@@ -1022,9 +1022,8 @@ This store can also be generated with ``fsspec``'s handler chaining, like so:
 
     >>> store = zarr.storage.FSStore(url=f"zip::{s3_path}", mode="r")
 
-Note that this is intended for a read-only data source. However, this can be
-especially useful if you have a very large ``.zarr.zip`` file on s3 and only need
-to access a small portion of it.
+This can be especially useful if you have a very large ``.zarr.zip`` file on s3
+and only need to access a small portion of it.
 
 Consolidating metadata
 ~~~~~~~~~~~~~~~~~~~~~~
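
The tutorial text says this approach helps when only a small portion of a large ``.zarr.zip`` file is needed. The sketch below illustrates why that can hold, using only the standard library: a zip reader over a seekable file object reads the directory records at the end of the archive and then the requested member, not the whole file. It is a local stand-in, not the S3 code path. ``CountingFile``, ``build_archive``, and the ``arr/<n>`` member names are hypothetical and are not part of zarr, fsspec, or s3fs.

```python
import io
import zipfile


class CountingFile:
    """Seekable file-like wrapper that counts the bytes handed to the reader.

    Hypothetical stand-in for a remote file handle such as the one returned
    by ``s3.open(...)``; here it only wraps an in-memory buffer.
    """

    def __init__(self, data):
        self._buf = io.BytesIO(data)
        self.bytes_read = 0

    def seekable(self):
        return True

    def seek(self, offset, whence=io.SEEK_SET):
        return self._buf.seek(offset, whence)

    def tell(self):
        return self._buf.tell()

    def read(self, size=-1):
        chunk = self._buf.read(size)
        self.bytes_read += len(chunk)
        return chunk


def build_archive(n_chunks=64, chunk_size=64 * 1024):
    """Build an uncompressed zip whose members mimic chunk keys (made-up names)."""
    raw = io.BytesIO()
    with zipfile.ZipFile(raw, mode="w", compression=zipfile.ZIP_STORED) as zf:
        for i in range(n_chunks):
            zf.writestr(f"arr/{i}", bytes([i % 256]) * chunk_size)
    return raw.getvalue()


archive = build_archive()
remote = CountingFile(archive)

# Opening the archive reads the end-of-archive records and central directory;
# reading one member then fetches only that member's header and data.
with zipfile.ZipFile(remote, mode="r") as zf:
    chunk = zf.read("arr/7")

print(f"archive size: {len(archive)} bytes")
print(f"bytes read:   {remote.bytes_read} bytes")
```

With a real remote file, the volume actually transferred also depends on the file object's own read-ahead and caching behaviour, which this sketch does not model.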