
Commit fbbfc29

bcdurak, AlexejPenner, avishniakov, actions-user, schustmi
authored
Follow-up on the run_metadata changes (#3193)
* Initial commit, nuking all metadata responses and seeing what breaks * Removed last remnant of LazyLoader * Reintroducing the lazy loaders. * Add LazyRunMetadataResponse to EntrypointFunctionDefinition * Test for lazy loaders works now * Fixed tests, reformatted * Use updated template * Auto-update of Starter template * Updated more templates * Fixed failing test * Fixed step run schemas * Auto-update of E2E template * Auto-update of NLP template * Fixed tests, removed additional .value access * Further fixing * Fixed linting issues * Reformatted * Linted, formatted and tested again * Typing * Maybe fix everything * Apply some feedback * new operation * new log_metadata function * changes to the base filters * new filters * adding log_metadata to __all__ * checkpoint with float casting * adding tests * final touches and formatting * formatting * moved the utils * modified log metadata function * checkpoint * deprecating the old functions * linting and final fixes * better error message * fixing the client method * better error message * consistent creation\ * adjusting tests * linting * changes for step metadata * more test adjustments * testing unit tests * linting * fixing more tests * fixing more tests * more test fixes * fixing the test * fixing per comments * added validation, constant error message * linting * new changes * second checkpoint * fixing revisions * adding overlap to remove warnings * complete docs changes * adding a parameter to control the related entity behaviour * fixing the toc * fixed the description * docstring * spellcheck * metadata creation during artifact version creation * allowing artifact metadata with name for external artifact * update the template versions * Auto-update of LLM Finetuning template * Auto-update of Starter template * Auto-update of E2E template * Auto-update of NLP template * fixing the migration script * formatting * redirects * minor fixes * working pipelines again * small fix * working checkpoint * fixes, 
linting, docstrings * fixing unit tests * docs updates 1 * docs update 2 * fixing integration tests * spellcheck * formatting * Auto-update of E2E template * docs changes * review comments * added the batch rbac call * added a validator to check the name of the keys * small adjustments * base schema added * formatting * new functionalities * breaking circular imports * spellchecker * other minor fixes * covering the uncovered case * adjusting tests * fixing the quickstart again * minor change * going back to publisher step id * updating github refs * Auto-update of LLM Finetuning template * Auto-update of Starter template * fixing tests * updated docs * Auto-update of E2E template * Auto-update of NLP template * formatting * review comments * adding some tests in * review comments * Update src/zenml/zen_stores/migrations/versions/cc269488e5a9_separate_run_metadata.py Co-authored-by: Michael Schuster <schustmi@users.noreply.github.com> * Update src/zenml/zen_stores/migrations/versions/cc269488e5a9_separate_run_metadata.py Co-authored-by: Michael Schuster <schustmi@users.noreply.github.com> * Update src/zenml/zen_stores/migrations/versions/cc269488e5a9_separate_run_metadata.py Co-authored-by: Michael Schuster <schustmi@users.noreply.github.com> * Update src/zenml/zen_stores/migrations/versions/cc269488e5a9_separate_run_metadata.py Co-authored-by: Michael Schuster <schustmi@users.noreply.github.com> * Update src/zenml/zen_stores/migrations/versions/cc269488e5a9_separate_run_metadata.py Co-authored-by: Michael Schuster <schustmi@users.noreply.github.com> * changed assert to value error * fixed the alembic head * changed the interaction with the models * trimmed down * small bugfix * naming recommendations * linting * fixing the test --------- Co-authored-by: AlexejPenner <thealexejpenner@gmail.com> Co-authored-by: Andrei Vishniakov <31008759+avishniakov@users.noreply.github.com> Co-authored-by: GitHub Actions <actions@github.com> Co-authored-by: Michael Schuster 
<michael.schuster.ffb@googlemail.com> Co-authored-by: Michael Schuster <schustmi@users.noreply.github.com>
1 parent 0ccb1fd commit fbbfc29


57 files changed: +1482 -566 lines changed

.gitbook.yaml

+1
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,7 @@ redirects:
1818
how-to/setting-up-a-project-repository/best-practices: how-to/project-setup-and-management/setting-up-a-project-repository/set-up-repository.md
1919
getting-started/zenml-pro/system-architectures: getting-started/system-architectures.md
2020
how-to/build-pipelines/name-your-pipeline-and-runs: how-to/pipeline-development/build-pipelines/name-your-pipeline-runs.md
21+
how-to/model-management-metrics/track-metrics-metadata/attach-metadata-to-steps: how-to/model-management-metrics/track-metrics-metadata/attach-metadata-to-a-step.md
2122

2223
# ZenML Pro
2324
getting-started/zenml-pro/user-management: getting-started/zenml-pro/core-concepts.md

.github/workflows/update-templates-to-examples.yml

+4-4
@@ -46,7 +46,7 @@ jobs:
4646
python-version: ${{ inputs.python-version }}
4747
stack-name: local
4848
ref-zenml: ${{ github.ref }}
49-
ref-template: 2024.11.20 # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
49+
ref-template: 2024.11.28 # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
5050
- name: Clean-up
5151
run: |
5252
rm -rf ./local_checkout
@@ -118,7 +118,7 @@ jobs:
118118
python-version: ${{ inputs.python-version }}
119119
stack-name: local
120120
ref-zenml: ${{ github.ref }}
121-
ref-template: 2024.10.30 # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
121+
ref-template: 2024.11.28 # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
122122
- name: Clean-up
123123
run: |
124124
rm -rf ./local_checkout
@@ -189,7 +189,7 @@ jobs:
189189
python-version: ${{ inputs.python-version }}
190190
stack-name: local
191191
ref-zenml: ${{ github.ref }}
192-
ref-template: 2024.10.30 # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
192+
ref-template: 2024.11.28 # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
193193
- name: Clean-up
194194
run: |
195195
rm -rf ./local_checkout
@@ -261,7 +261,7 @@ jobs:
261261
with:
262262
python-version: ${{ inputs.python-version }}
263263
ref-zenml: ${{ github.ref }}
264-
ref-template: 2024.11.08 # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
264+
ref-template: 2024.11.28 # Make sure it is aligned with ZENML_PROJECT_TEMPLATES from src/zenml/cli/base.py
265265
- name: Clean-up
266266
run: |
267267
rm -rf ./local_checkout

docs/book/how-to/model-management-metrics/track-metrics-metadata/README.md

+38-4
@@ -5,10 +5,44 @@ description: Tracking metrics and metadata
55

66
# Track metrics and metadata
77

8-
Logging metrics and metadata is standardized in ZenML. The most common pattern is to use the `log_xxx` methods, e.g.:
8+
ZenML provides a unified way to log and manage metrics and metadata through
9+
the `log_metadata` function. This versatile function allows you to log
10+
metadata across various entities like models, artifacts, steps, and runs
11+
through a single interface. Additionally, you can adjust if you want to
12+
automatically log the same metadata for the related entities.
913

10-
* Log metadata to a [model](attach-metadata-to-a-model.md): `log_model_metadata`
11-
* Log metadata to an [artifact](attach-metadata-to-an-artifact.md): `log_artifact_metadata`
12-
* Log metadata to a [step](attach-metadata-to-steps.md): `log_step_metadata`
14+
### The most basic use-case
15+
16+
You can use the `log_metadata` function within a step:
17+
18+
```python
19+
from zenml import step, log_metadata
20+
21+
@step
22+
def my_step() -> ...:
23+
log_metadata(metadata={"accuracy": 0.91})
24+
...
25+
```
26+
27+
This will log the `accuracy` for the step, its pipeline run, and if provided
28+
its model version.
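
The fan-out described above can be pictured with a small in-memory model. This is only an illustrative sketch and not ZenML's implementation: `MetadataStore`, `log_from_step`, and the entity labels are hypothetical names introduced here, and the way keys are named on each target is deliberately left out.

```python
from collections import defaultdict
from typing import Any, Dict, Optional


class MetadataStore:
    """Toy in-memory store (hypothetical, not part of ZenML)."""

    def __init__(self) -> None:
        # Maps an entity label such as "step:my_step" to its metadata dict.
        self.entries: Dict[str, Dict[str, Any]] = defaultdict(dict)

    def log_from_step(
        self,
        metadata: Dict[str, Any],
        step: str,
        run: str,
        model_version: Optional[str] = None,
    ) -> None:
        """Attach the same metadata to the step, its run and, if one is
        given, the model version."""
        targets = [f"step:{step}", f"run:{run}"]
        if model_version is not None:
            targets.append(f"model_version:{model_version}")
        for target in targets:
            self.entries[target].update(metadata)


store = MetadataStore()
# No model version configured: only the step and the run receive the value.
store.log_from_step({"accuracy": 0.91}, step="my_step", run="run_1")
# Model version configured: it receives the value as well.
store.log_from_step(
    {"accuracy": 0.93}, step="my_step", run="run_2", model_version="v1"
)
```

The point of the sketch is that a single call reaches several related entities, which is what lets one `log_metadata` call inside a step replace separate per-entity calls.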
29+
30+
### Additional use-cases
31+
32+
The `log_metadata` function also supports various use-cases by allowing you to
33+
specify the target entity (e.g., model, artifact, step, or run) with flexible
34+
parameters. You can learn more about these use-cases in the following pages:
35+
36+
- [Log metadata to a step](attach-metadata-to-a-step.md)
37+
- [Log metadata to a run](attach-metadata-to-a-run.md)
38+
- [Log metadata to an artifact](attach-metadata-to-an-artifact.md)
39+
- [Log metadata to a model](attach-metadata-to-a-model.md)
40+
41+
{% hint style="warning" %}
42+
The older methods for logging metadata to specific entities, such as
43+
`log_model_metadata`, `log_artifact_metadata`, and `log_step_metadata`, are
44+
now deprecated. It is recommended to use `log_metadata` for all future
45+
implementations.
46+
{% endhint %}
1347

1448
<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>
Original file line numberDiff line numberDiff line change
@@ -1,62 +1,93 @@
11
---
2-
description: >-
3-
Attach any metadata as key-value pairs to your models for future reference and
4-
auditability.
2+
description: Learn how to attach metadata to a model.
53
---
64

75
# Attach metadata to a model
86

7+
ZenML allows you to log metadata for models, which provides additional context
8+
that goes beyond individual artifact details. Model metadata can represent
9+
high-level insights, such as evaluation results, deployment information,
10+
or customer-specific details, making it easier to manage and interpret
11+
the model's usage and performance across different versions.
12+
913
## Logging Metadata for Models
1014

11-
While artifact metadata is specific to individual outputs of steps, model metadata encapsulates broader and more general information that spans across multiple artifacts. For example, evaluation results or the name of a customer for whom the model is intended could be logged with the model.
15+
To log metadata for a model, use the `log_metadata` function. This function
16+
lets you attach key-value metadata to a model, which can include metrics and
17+
other JSON-serializable values, such as custom ZenML types like `Uri`,
18+
`Path`, and `StorageSize`.
1219

1320
Here's an example of logging metadata for a model:
1421

1522
```python
16-
from zenml import step, log_model_metadata, ArtifactConfig, get_step_context
1723
from typing import Annotated
24+
1825
import pandas as pd
19-
from sklearn.ensemble import RandomForestClassifier
2026
from sklearn.base import ClassifierMixin
27+
from sklearn.ensemble import RandomForestClassifier
28+
29+
from zenml import step, log_metadata, ArtifactConfig, get_step_context
30+
2131

2232
@step
23-
def train_model(dataset: pd.DataFrame) -> Annotated[ClassifierMixin, ArtifactConfig(name="sklearn_classifier")]:
24-
"""Train a model"""
25-
# Fit the model and compute metrics
33+
def train_model(dataset: pd.DataFrame) -> Annotated[
34+
ClassifierMixin, ArtifactConfig(name="sklearn_classifier")
35+
]:
36+
"""Train a model and log model metadata."""
2637
classifier = RandomForestClassifier().fit(dataset)
2738
accuracy, precision, recall = ...
28-
29-
# Log metadata for the model
30-
# This associates the metadata with the ZenML model, not the artifact
31-
log_model_metadata(
39+
40+
log_metadata(
3241
metadata={
3342
"evaluation_metrics": {
3443
"accuracy": accuracy,
3544
"precision": precision,
3645
"recall": recall
3746
}
3847
},
39-
# Omitted model_name will use the model in the current context
40-
model_name="zenml_model_name",
41-
# Omitted model_version will default to 'latest'
42-
model_version="zenml_model_version",
48+
infer_model=True,
4349
)
50+
4451
return classifier
4552
```
4653

47-
In this example, the metadata is associated with the model rather than the specific classifier artifact. This is particularly useful when the metadata reflects an aggregation or summary of various steps and artifacts in the pipeline.
54+
In this example, the metadata is associated with the model rather than the
55+
specific classifier artifact. This is particularly useful when the metadata
56+
reflects an aggregation or summary of various steps and artifacts in the
57+
pipeline.
58+
59+
60+
### Selecting Models with `log_metadata`
61+
62+
When using `log_metadata`, ZenML provides flexible options for attaching
63+
metadata to model versions:
64+
65+
1. **Using `infer_model`**: If used within a step, ZenML will use the step
66+
context to infer the model it is using and attach the metadata to it.
67+
2. **Model Name and Version Provided**: If both a model name and version are
68+
provided, ZenML will use these to identify and attach metadata to the
69+
specific model version.
70+
3. **Model Version ID Provided**: If a model version ID is directly provided,
71+
ZenML will use it to fetch and attach the metadata to that specific model
72+
version.
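
The three options above can be summarized as a small decision function. This is a minimal sketch under stated assumptions: `describe_model_target` is a hypothetical helper that does not exist in ZenML, and the precedence order (ID first, then name and version, then inference) is assumed for illustration rather than taken from the library.

```python
from typing import Optional
from uuid import UUID, uuid4


def describe_model_target(
    infer_model: bool = False,
    model_name: Optional[str] = None,
    model_version: Optional[str] = None,
    model_version_id: Optional[UUID] = None,
) -> str:
    """Return which of the three selection options a call would use.

    The precedence order below is an assumption made for this sketch.
    """
    if model_version_id is not None:
        return f"model version with ID {model_version_id}"
    if model_name is not None and model_version is not None:
        return f"version '{model_version}' of model '{model_name}'"
    if infer_model:
        return "model version inferred from the step context"
    raise ValueError(
        "Provide a model version ID, a model name and version, "
        "or set infer_model=True inside a step."
    )


print(describe_model_target(infer_model=True))
print(describe_model_target(model_name="my_model", model_version="my_version"))
print(describe_model_target(model_version_id=uuid4()))
```

Passing only a model name without a version raises an error in this sketch, since option 2 requires both values.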
4873

4974
## Fetching logged metadata
5075

51-
Once metadata has been logged in an [artifact](attach-metadata-to-an-artifact.md), model, or [step](attach-metadata-to-steps.md), we can easily fetch the metadata with the ZenML Client:
76+
Once metadata has been attached to a model, it can be retrieved for inspection
77+
or analysis using the ZenML Client.
5278

5379
```python
5480
from zenml.client import Client
5581

5682
client = Client()
5783
model = client.get_model_version("my_model", "my_version")
5884

59-
print(model.run_metadata["metadata_key"].value)
85+
print(model.run_metadata["metadata_key"])
6086
```
6187

88+
{% hint style="info" %}
89+
When you are fetching metadata using a specific key, the returned value will
90+
always reflect the latest entry.
91+
{% endhint %}
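
The "latest entry" behaviour in the hint can be illustrated with a toy store. This is a sketch only: `MetadataHistory` is a hypothetical class, and it assumes that earlier entries are kept while key lookup returns the newest one; how ZenML stores entries internally is not shown in this page.

```python
from collections import defaultdict
from typing import Any, DefaultDict, List


class MetadataHistory:
    """Toy store (hypothetical) that keeps every value logged under a key."""

    def __init__(self) -> None:
        self._entries: DefaultDict[str, List[Any]] = defaultdict(list)

    def log(self, key: str, value: Any) -> None:
        # Append rather than overwrite, so earlier entries are preserved.
        self._entries[key].append(value)

    def __getitem__(self, key: str) -> Any:
        # Key lookup returns the most recently logged value.
        if key not in self._entries:
            raise KeyError(key)
        return self._entries[key][-1]


history = MetadataHistory()
history.log("accuracy", 0.88)
history.log("accuracy", 0.91)
print(history["accuracy"])  # prints 0.91
```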
92+
6293
<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,87 @@
1+
---
2+
description: Learn how to attach metadata to a run.
3+
---
4+
5+
# Attach Metadata to a Run
6+
7+
In ZenML, you can log metadata directly to a pipeline run, either during or
8+
after execution, using the `log_metadata` function. This function allows you
9+
to attach a dictionary of key-value pairs as metadata to a pipeline run,
10+
with values that can be any JSON-serializable data type, including ZenML
11+
custom types like `Uri`, `Path`, `DType`, and `StorageSize`.
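
Because values need to be JSON-serializable, it can help to check a metadata dictionary before logging it. The following is a minimal sketch using only the standard library: `check_json_serializable` is a hypothetical helper, not a ZenML function, and it covers only plain JSON types, not the ZenML custom types listed above.

```python
import json
from typing import Any, Dict


def check_json_serializable(metadata: Dict[str, Any]) -> None:
    """Raise a TypeError naming the first key whose value cannot be
    encoded as JSON."""
    for key, value in metadata.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError) as exc:
            raise TypeError(
                f"Value for key '{key}' is not JSON-serializable: {exc}"
            ) from exc


# Nested dictionaries of numbers pass the check.
check_json_serializable({"run_metrics": {"accuracy": 0.91, "recall": 0.87}})

# A set is not a JSON type, so this call fails with a descriptive error.
try:
    check_json_serializable({"labels": {"cat", "dog"}})
except TypeError as error:
    print(error)
```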
12+
13+
## Logging Metadata Within a Run
14+
15+
If you are logging metadata from within a step that’s part of a pipeline run,
16+
calling `log_metadata` will attach the specified metadata to the current
17+
pipeline run where the metadata key will have the `step_name::metadata_key`
18+
pattern. This allows you to use the same metadata key from different steps
19+
while the run's still executing.
20+
21+
```python
22+
from typing import Annotated
23+
24+
import pandas as pd
25+
from sklearn.base import ClassifierMixin
26+
from sklearn.ensemble import RandomForestClassifier
27+
28+
from zenml import step, log_metadata, ArtifactConfig
29+
30+
31+
@step
32+
def train_model(dataset: pd.DataFrame) -> Annotated[
33+
ClassifierMixin,
34+
ArtifactConfig(name="sklearn_classifier", is_model_artifact=True)
35+
]:
36+
"""Train a model and log run-level metadata."""
37+
classifier = RandomForestClassifier().fit(dataset)
38+
accuracy, precision, recall = ...
39+
40+
# Log metadata at the run level
41+
log_metadata(
42+
metadata={
43+
"run_metrics": {
44+
"accuracy": accuracy,
45+
"precision": precision,
46+
"recall": recall
47+
}
48+
}
49+
)
50+
return classifier
51+
```
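
The `step_name::metadata_key` pattern mentioned above can be sketched in a few lines. This is an illustration rather than ZenML's code: `run_level_key` and the `evaluate_model` step name are hypothetical, and a plain dictionary stands in for the run's metadata.

```python
from typing import Any, Dict


def run_level_key(step_name: str, metadata_key: str) -> str:
    """Build a run-level key following the step_name::metadata_key pattern."""
    return f"{step_name}::{metadata_key}"


run_metadata: Dict[str, Any] = {}
# Two steps log the same key; the step-name prefix keeps them apart.
run_metadata[run_level_key("train_model", "accuracy")] = 0.91
run_metadata[run_level_key("evaluate_model", "accuracy")] = 0.89

print(sorted(run_metadata))
# prints ['evaluate_model::accuracy', 'train_model::accuracy']
```

Prefixing with the step name is what allows different steps to reuse a key such as `accuracy` within the same run without overwriting each other.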
52+
53+
## Manually Logging Metadata to a Pipeline Run
54+
55+
You can also attach metadata to a specific pipeline run without needing a step,
56+
using identifiers like the run ID. This is useful when logging information or
57+
metrics that were calculated post-execution.
58+
59+
```python
60+
from zenml import log_metadata
61+
62+
log_metadata(
63+
metadata={"post_run_info": {"some_metric": 5.0}},
64+
run_id_name_or_prefix="run_id_name_or_prefix"
65+
)
66+
```
67+
68+
## Fetching Logged Metadata
69+
70+
Once metadata has been logged in a pipeline run, you can retrieve it using
71+
the ZenML Client:
72+
73+
```python
74+
from zenml.client import Client
75+
76+
client = Client()
77+
run = client.get_pipeline_run("run_id_name_or_prefix")
78+
79+
print(run.run_metadata["metadata_key"])
80+
```
81+
82+
{% hint style="info" %}
83+
When you are fetching metadata using a specific key, the returned value will
84+
always reflect the latest entry.
85+
{% endhint %}
86+
87+
<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>
