Release prep for 0.2.2 #771

Merged 5 commits on Feb 24, 2025
4 changes: 2 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default.

13 changes: 13 additions & 0 deletions Changelog.python.md
@@ -1,5 +1,18 @@
# Changelog

## Python Icechunk Library 0.2.2

### Features

- Added the ability to check out a session `as_of` a specific time. This is useful for reconstructing the state of the repo at a given point in time.
- Support for refreshable Google Cloud Storage credentials.
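
The `as_of` behaviour described above can be pictured as "pick the newest snapshot written at or before a given time". The following is a minimal sketch of that idea in plain Python, not Icechunk's implementation: the function name `snapshot_as_of`, the history layout, the snapshot IDs, and the inclusive cutoff are all assumptions made for illustration.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def snapshot_as_of(history, as_of):
    """Return the ID of the newest snapshot written at or before `as_of`.

    `history` is a hypothetical list of (written_at, snapshot_id) pairs,
    sorted from oldest to newest.
    """
    times = [written_at for written_at, _ in history]
    # bisect_right counts the snapshots whose timestamp is <= as_of.
    idx = bisect_right(times, as_of)
    if idx == 0:
        raise LookupError("no snapshot at or before the requested time")
    return history[idx - 1][1]


# Made-up history with placeholder snapshot IDs.
history = [
    (datetime(2025, 2, 1, tzinfo=timezone.utc), "SNAP_A"),
    (datetime(2025, 2, 10, tzinfo=timezone.utc), "SNAP_B"),
    (datetime(2025, 2, 20, tzinfo=timezone.utc), "SNAP_C"),
]

print(snapshot_as_of(history, datetime(2025, 2, 15, tzinfo=timezone.utc)))  # SNAP_B
```

Asking for a time earlier than the first snapshot raises `LookupError` in this sketch; how the library itself reports that case is not stated in the changelog.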

### Fixes

- Fix a bug where the clean prefix detection was hiding other errors when creating repositories.
- API now correctly uses `snapshot_id` instead of `snapshot` consistently.
- Only write `content-type` to metadata files if the target object store supports it.

## Python Icechunk Library 0.2.1

### Features
Binary file added docs/docs/assets/storage/tigris-region-set.png
74 changes: 6 additions & 68 deletions docs/docs/icechunk-python/version-control.md
@@ -270,20 +270,16 @@ session2 = repo.writable_session("main")

root1 = zarr.group(session1.store)
root2 = zarr.group(session2.store)
```

First, we'll modify the attributes of the root group from both sessions.

```python
root1.attrs["foo"] = "bar"
root2.attrs["foo"] = "baz"
root1["data"][0,0] = 1
root2["data"][0,:] = 2
```

and then try to commit the changes.

```python
session1.commit(message="Update foo attribute on root group")
session2.commit(message="Update foo attribute on root group")
session1.commit(message="Update first element of data array")
session2.commit(message="Update first row of data array")

# AE9XS2ZWXT861KD2JGHG
# ---------------------------------------------------------------------------
@@ -327,65 +323,7 @@ session2.rebase(icechunk.ConflictDetector())
# RebaseFailedError: Rebase failed on snapshot AE9XS2ZWXT861KD2JGHG: 1 conflicts found
```

This however fails because both sessions modified metadata. We can use the `ConflictError` to get more information about the conflict.

```python
try:
    session2.rebase(icechunk.ConflictDetector())
except icechunk.RebaseFailedError as e:
    print(e.conflicts)

# [Conflict(ZarrMetadataDoubleUpdate, path=/)]
```

This tells us that the conflict is caused by the two sessions modifying the metadata attributes of the root group (`/`). In this case we have decided that the second session set the `foo` attribute to the correct value, so we can now try to rebase by instructing the `rebase` method to use the second session's changes with the [`BasicConflictSolver`](../reference/#icechunk.BasicConflictSolver).

```python
session2.rebase(icechunk.BasicConflictSolver())
```

Success! We can now try and commit the changes again.

```python
session2.commit(message="Update foo attribute on root group")

# 'SY4WRE8A9TVYMTJPEAHG'
```

This same process can be used to resolve conflicts with arrays. Let's try to modify the `data` array from both sessions.

```python
session1 = repo.writable_session("main")
session2 = repo.writable_session("main")

root1 = zarr.group(session1.store)
root2 = zarr.group(session2.store)

root1["data"][0,0] = 1
root2["data"][0,:] = 2
```

We have now created a conflict, because the first session modified the first element of the `data` array, and the second session modified the first row of the `data` array. Let's commit the changes from the second session first, then see what conflicts are reported when we try to commit the changes from the first session.

```python
print(session2.commit(message="Update first row of data array"))
print(session1.commit(message="Update first element of data array"))

# ---------------------------------------------------------------------------
# ConflictError Traceback (most recent call last)
# Cell In[15], line 2
# 1 print(session2.commit(message="Update first row of data array"))
# ----> 2 print(session1.commit(message="Update first element of data array"))

# File ~/Developer/icechunk/icechunk-python/python/icechunk/session.py:224, in Session.commit(self, message, metadata)
# 222 return self._session.commit(message, metadata)
# 223 except PyConflictError as e:
# --> 224 raise ConflictError(e) from None

# ConflictError: Failed to commit, expected parent: Some("SY4WRE8A9TVYMTJPEAHG"), actual parent: Some("5XRDGZPSG747AMMRTWT0")
```
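
The `ConflictError` above reports an expected parent that differs from the actual parent, which is the usual signature of an optimistic-concurrency check: a commit is accepted only if the branch tip is still the snapshot the session started from. A toy sketch of that check follows. The classes `ToyBranch`, `ToySession`, and `ParentMismatch`, and the snapshot IDs, are hypothetical and exist only for this illustration; they are not part of Icechunk.

```python
class ParentMismatch(Exception):
    """Raised when the branch tip moved after the session was opened."""


class ToyBranch:
    def __init__(self, tip):
        self.tip = tip  # ID of the latest snapshot on the branch


class ToySession:
    def __init__(self, branch):
        self.branch = branch
        self.parent = branch.tip  # snapshot this session is based on

    def commit(self, new_snapshot_id):
        # Accept the commit only if nobody else advanced the branch.
        if self.branch.tip != self.parent:
            raise ParentMismatch(
                f"expected parent: {self.parent}, actual parent: {self.branch.tip}"
            )
        self.branch.tip = new_snapshot_id
        self.parent = new_snapshot_id
        return new_snapshot_id


branch = ToyBranch("S0")
session1 = ToySession(branch)
session2 = ToySession(branch)

session2.commit("S1")  # succeeds: the tip was still S0
try:
    session1.commit("S2")  # fails: the tip is now S1, not S0
except ParentMismatch as err:
    print(err)  # expected parent: S0, actual parent: S1
```

In this toy model the losing session has to bring its parent up to date before retrying, which is the role `rebase` plays in the walkthrough.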

Okay! We have a conflict. Let's see what conflicts are reported.
This however fails because both sessions modified metadata. We can use the `RebaseFailedError` to get more information about the conflict.

```python
try:
@@ -470,4 +408,4 @@ root["data"][:,:]

#### Limitations

At the moment, the rebase functionality is limited to resolving conflicts with attributes on arrays and groups, and conflicts with chunks in arrays. Other types of conflicts are not able to be resolved by icechunk yet and must be resolved manually.
At the moment, the rebase functionality is limited to resolving conflicts with chunks in arrays. Other types of conflicts are not able to be resolved by icechunk yet and must be resolved manually.
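
To make "conflicts with chunks in arrays" concrete, the sketch below works out which chunks two writes touch and intersects the results, using the two writes from the walkthrough (`data[0,0]` and `data[0,:]`). It is an illustration only: the helper `touched_chunks`, the array shape `(10, 10)`, and the chunk shape `(5, 5)` are assumptions, and Icechunk's own detection logic may differ.

```python
from itertools import product


def touched_chunks(selection, shape, chunk_shape):
    """Return the set of chunk coordinates covered by a tuple of slices."""
    per_axis = []
    for sl, dim, chunk in zip(selection, shape, chunk_shape):
        start, stop, step = sl.indices(dim)
        # Map every selected index on this axis to its chunk index.
        per_axis.append(sorted({i // chunk for i in range(start, stop, step)}))
    return set(product(*per_axis))


# Hypothetical layout: a 10x10 array stored as 5x5 chunks.
shape, chunk_shape = (10, 10), (5, 5)

# Session 1 wrote data[0, 0]; session 2 wrote data[0, :].
session1_chunks = touched_chunks((slice(0, 1), slice(0, 1)), shape, chunk_shape)
session2_chunks = touched_chunks((slice(0, 1), slice(None)), shape, chunk_shape)

# Chunks written by both sessions are the ones that need resolving.
conflicts = session1_chunks & session2_chunks
print(sorted(conflicts))  # [(0, 0)]
```

Under these assumptions only chunk `(0, 0)` is contested; chunk `(0, 1)` was written by the second session alone and needs no resolution.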
4 changes: 2 additions & 2 deletions icechunk-python/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "icechunk-python"
version = "0.2.1"
version = "0.2.2"
description = "Transactional storage engine for Zarr designed for use on cloud object storage"
readme = "../README.md"
repository = "https://github.com/earth-mover/icechunk"
@@ -21,7 +21,7 @@ crate-type = ["cdylib"]
bytes = "1.9.0"
chrono = { version = "0.4.39" }
futures = "0.3.31"
icechunk = { path = "../icechunk", version = "0.2.1", features = ["logs"] }
icechunk = { path = "../icechunk", version = "0.2.2", features = ["logs"] }
itertools = "0.14.0"
pyo3 = { version = "0.23", features = [
"chrono",
2 changes: 1 addition & 1 deletion icechunk/Cargo.toml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
[package]
name = "icechunk"
version = "0.2.1"
version = "0.2.2"
description = "Transactional storage engine for Zarr designed for use on cloud object storage"
readme = "../README.md"
repository = "https://github.com/earth-mover/icechunk"