-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Distinguish between the nature of images #1532
Conversation
8fbcc78
to
37b5879
Compare
This is the output after syncing data https://github.com/pulp/pulp_container/pkgs/container/bootc-labeled.
Detail exposes additional fields for |
0a1d1ee
to
cad750c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR it does exactly what it meant to, however, I would definitely rework how you are initializing the metadata because you are reading from file multiple times data you have access to and can just pass it. This is a concern especially with objects storage because you pay for every get operation on the file.
Also this PR code path is tightly related to the artifactless manifest, so Michal will need to rework after you, how init_metadata is done. If you don't want to address my previous point I would suggest then wait until artifactless manifests/config is done.
pulp_container/app/registry.py
Outdated
@@ -460,6 +460,9 @@ async def init_pending_content(self, digest, manifest_data, media_type, artifact | |||
media_type=media_type, | |||
config_blob=config_blob, | |||
) | |||
|
|||
await sync_to_async(manifest.init_metadata)(artifact) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why would you be passing artifact here, that you will need to open and parse again if you already have parsed manifest data on line 447
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to have init_metadata(manifest_data, config_data=None) where you would be passing json loadd data, this way you will not need to open and parse files multiple times
you will need to adjust save_config_blob to return both config_blob, config_blob_data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I like the idea.
pulp_container/app/registry_api.py
Outdated
@@ -1181,6 +1181,11 @@ def put(self, request, path, pk=None): | |||
) | |||
manifest = manifest_list | |||
|
|||
# once relations for listed manifests are established, | |||
# it is possible to initialize the metadata, like labels | |||
manifest.init_metadata(artifact) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here and further down
@@ -35,6 +35,9 @@ | |||
# a repository containing 4 manifest lists and 5 manifests | |||
PULP_FIXTURE_1 = "pulp/test-fixture-1" | |||
|
|||
# a dummy repository containing two manifests with an arbitrary bootc label |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
two manifests(index and image)
pulp_container/app/viewsets.py
Outdated
class ManifestViewSet(ContainerContentQuerySetMixin, ReadOnlyContentViewSet): | ||
class ManifestViewSetDetailMixin: | ||
""" | ||
A mixin that allows differentiating between listing manifests and retrieving a single manifest. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the nutshell this is nice idea, in practice i think users will rely to see same fields for manifests whether on list of get. I.e. they will need to write 2 different parsers(a bit of inconvenience) that will be able to cope with no labels/annotations fields during listing and take them into account when getting a specific instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You do recommend returning annotations
and labels
for the list view too, right? Or, have you considered not exposing such fields at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no no, i mean exposing them for list and get.
@mdellweg any opinion on the approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a hard question that should ideally be answered on the application level. Do we promise our users in general that the objects from list and view calls are equivalent.
I fear that ship has sailed already. We do have similar things for tasks and rpm packages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to use one and only ManifestSerializer
to not do any other gimmicks. Thanks for your inputs!
|
||
def update_manifest(manifest): | ||
with suppress(ObjectDoesNotExist): | ||
artifact = manifest._artifacts.get() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't know if we should keep this as a migration or rather a management command. this is expensive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Let me write a new populate-annotations-labels
command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see 0033_raise_warning_for_repair.py
We reached a consensus on adopting the requested changes and merging this before #1530. Thanks for pointing out the possible cost of each artifact.file.read operation! I will fix that! |
3cc2588
to
7fe828c
Compare
"Manifests containing additional metadata, like image manifest annotations and " | ||
"image configuration labels, need to be updated. Please, run 'pulpcore-manager " | ||
"init-image-nature' to extract and initiliaze the metadata from related artifacts." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update this to a more friendlier manner: If you are interested in initializing annotations and labels, run this django command...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without interest :D , just 'run django command to initialize...'
pulp_container/app/models.py
Outdated
@@ -107,6 +118,73 @@ class Manifest(Content): | |||
through_fields=("image_manifest", "manifest_list"), | |||
) | |||
|
|||
def init_metadata(self, manifest_data=None, manifest_artifact=None, config_artifact=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def init_metadata(self, manifest_data=None, manifest_artifact=None, config_artifact=None): | |
def init_metadata(self, manifest_data=None): |
These arguments are not used anywhere right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good!
@@ -0,0 +1,2 @@ | |||
Incorporated a notion of container images' characteristics. Users can now filter manifests by their | |||
nature using the ``is_flatpak`` or ``is_bootable`` field on the corresponding Manifest endpoint. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please mention that you've exposed labels and annotations
self.update_manifests(manifest_lists) | ||
|
||
def update_manifests(self, manifests_qs): | ||
paginator = Paginator(manifests_qs, PAGE_CHUNK_SIZE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can safely go to 1000
"Manifests containing additional metadata, like image manifest annotations and " | ||
"image configuration labels, need to be updated. Please, run 'pulpcore-manager " | ||
"init-image-nature' to extract and initiliaze the metadata from related artifacts." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without interest :D , just 'run django command to initialize...'
@@ -674,3 +675,8 @@ def _post_save(self, batch): | |||
BlobManifest.objects.bulk_create(blob_manifests, ignore_conflicts=True) | |||
if manifest_list_manifests: | |||
ManifestListManifest.objects.bulk_create(manifest_list_manifests, ignore_conflicts=True) | |||
|
|||
# the nature of manifest lists can be determined after creating the listed manifest relation | |||
for ml in manifest_lists: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this operations is not really related to relation creation, can you rename the variable from manifest_lits to something more explicit, like manifests_lists_nature_to_init
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to say in the comment that we can run this initialization only after creating the relations between listed manifests and manifest lists. It appears that the combination of the comment and variable name causes confusion. 😞 Let me remove the comment and rename the variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the comment is helpful. you can leave it + rename the var ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I decided to rewrite the comment instead:
# after creating the relation between listed manifests and manifest lists,
# it is possible to initialize the nature of the corresponding manifest lists
How do you like it now? The suggested explicit name is hard to read and understand. We iterate over manifest lists and thus it does make sense to call the list "manifest_lists".
7fe828c
to
5b3c4f2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some non-blocking comments.
def print_warning_for_initializing_image_nature(apps, schema_editor): | ||
warnings.warn( | ||
"Run 'pulpcore-manager init-image-nature' to initialize metadata (i.e., annotations " | ||
"and labels) for all manifests. The metadata are stored in associated artifacts and " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
metadata is singular, hence is stored
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove last sentence, what kind of important info it brings? "their contents should be accessible through the Manifest model now" what contents? all contents? or just selected ones.
Users are not interested in tech.details of how a manifests is composed, imo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a shorter concise version: "Run 'pulpcore-manager init-image-nature' to initialize and expose metadata (i.e., annotations and labels) for the manifests"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to be explicit when it comes to reading data from the storage back-end. If administrators run this migration, they might be a bit surprised by the the rise in expenses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the migration itself does not read from disk anything, it is the django management command that it does
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant the django-management command migration. We are getting lost in this terminology because we always talk about migrating data, xixi.
5b3c4f2
to
fa1c4c4
Compare
Merging because the failures are unrelated to the PR. |
closes #1437