Lzdownload service - download all packages of a given channel #9679
Open: waterflow80 wants to merge 34 commits into uyuni-project:master from waterflow80:download-all-service
lzreposync will be a spacewalk-repo-sync replacement written in Python. It uses a src layout and a pyproject.toml. The target Python version is 3.11; compatibility with older Python versions is explicitly not a goal.
Added the remote_path column, which holds the remote path/URL of a given package. This information helps locate the package on the remote repository later and download it.
Added a boolean argument that controls whether header.hdr.fullFilelist() is called. We added this argument so that the header.hdr.fullFilelist() call can be disabled for the lzreposync service only.
The inspect.getargspec() function is deprecated in Python 3. It can be replaced by inspect.getfullargspec().
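As a small illustration of the replacement, the sketch below inspects a made-up function (`sync` and its parameters are hypothetical, not taken from this PR). inspect.getargspec() was removed in Python 3.11, which is the target version here, and inspect.getfullargspec() additionally reports keyword-only parameters.

```python
import inspect


def sync(channel, batch_size=20, *, import_signatures=True):
    """Hypothetical function, used only to have a signature to inspect."""


spec = inspect.getfullargspec(sync)

print(spec.args)            # positional-or-keyword parameter names
print(spec.defaults)        # defaults of the trailing positional parameters
print(spec.kwonlyargs)      # keyword-only names, which getargspec() could not report
print(spec.kwonlydefaults)  # defaults of the keyword-only parameters
```

Code that only needs the positional names can keep reading `spec.args`, so the switch is usually a one-line change at the call site.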
import_signatures is a boolean argument that specifies whether the _import_signatures() method should be executed. We added this parameter to disable the _import_signatures() method for the lzreposync service.
Parsing the rpm repository's primary.xml package metadata file using the pulldom XML parser, a memory-efficient parsing library. Note that some attributes in the returned parsed object are faked and may be filled in elsewhere. They are faked because the importer service requires them.
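A minimal sketch of the pulldom approach, assuming a heavily simplified, non-namespaced sample (a real primary.xml is namespaced and much richer). `iter_packages` and the sample data are illustrative names, not the parser added in this PR. The point is that only one package subtree is expanded into a DOM at any time.

```python
import io
from xml.dom import pulldom

# Made-up two-package sample, far simpler than a real primary.xml.
SAMPLE = b"""<metadata>
  <package type="rpm"><name>foo</name><arch>x86_64</arch></package>
  <package type="rpm"><name>bar</name><arch>noarch</arch></package>
</metadata>"""


def iter_packages(stream):
    """Yield one dict per <package> element, expanding one subtree at a time."""
    events = pulldom.parse(stream)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "package":
            events.expandNode(node)  # build the DOM for this package only
            yield {
                child.tagName: child.firstChild.data
                for child in node.childNodes
                if child.nodeType == child.ELEMENT_NODE and child.firstChild
            }


packages = list(iter_packages(io.BytesIO(SAMPLE)))
print(packages)
```

Because `iter_packages` is a generator, a caller that consumes it package by package never holds the whole document in memory.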
Parsing the rpm repository's filelists.xml metadata file using the pulldom XML parser, a memory-efficient parsing library. The parser parses the given filelists.xml file (normally in gz format) and caches the filelist information of each package in a separate file in the cache directory, using the package's hash as the filename, with no file extension.
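The caching scheme described above could look roughly like the following sketch. It assumes the package hash is available as the `pkgid` attribute of each `<package>` element and stores one path per line; `cache_filelists`, the sample data, and the on-disk format are assumptions for illustration, not the format used by this PR.

```python
import gzip
import io
import tempfile
from pathlib import Path
from xml.dom import pulldom

# Made-up, gzip-compressed sample standing in for filelists.xml.gz.
SAMPLE = gzip.compress(
    b"""<filelists>
  <package pkgid="abc123" name="foo"><file>/usr/bin/foo</file><file>/etc/foo.conf</file></package>
  <package pkgid="def456" name="bar"><file>/usr/bin/bar</file></package>
</filelists>"""
)


def cache_filelists(gz_stream, cache_dir):
    """Write each package's file list to cache_dir/<pkgid>, without extension."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    with gzip.open(gz_stream, "rb") as xml_stream:
        events = pulldom.parse(xml_stream)
        for event, node in events:
            if event == pulldom.START_ELEMENT and node.tagName == "package":
                events.expandNode(node)  # only this package is held in memory
                files = [f.firstChild.data for f in node.getElementsByTagName("file")]
                (cache_dir / node.getAttribute("pkgid")).write_text("\n".join(files))


cache_dir = Path(tempfile.mkdtemp())
cache_filelists(io.BytesIO(SAMPLE), cache_dir)
print(sorted(p.name for p in cache_dir.iterdir()))
```

Keying the cache by hash lets a later step look up the file list of a single package without re-reading the whole filelists.xml.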
Using both primary_parser and filelists_parser, return the full package metadata, package by package, using lazy parsing. Note that some attributes are faked because we can't fetch them at this point and the package importer requires them later. However, we can fake them more efficiently, using less memory.
Parsed the update-info.xml file and imported the parsed patches/updates to the database. We used pretty much the same code from the old Reposync class.
Import the parsed rpm and debian packages into the database in batches, and associate each package with the corresponding channel.
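The batching idea can be sketched independently of the database layer. In the example below, `import_batch` is a hypothetical callback standing in for the real importer, and the batch size is arbitrary. The helper is hand-rolled because itertools.batched() only exists from Python 3.12, while the target version is 3.11.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items, consuming the input lazily."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch


def import_in_batches(packages, channel_label, import_batch, batch_size=20):
    """Pass packages to import_batch in chunks; return how many were handled."""
    total = 0
    for batch in batched(packages, batch_size):
        import_batch(batch, channel_label)  # hypothetical importer callback
        total += len(batch)
    return total


# Demo with a fake importer that only records what it was given.
calls = []
imported = import_in_batches(
    (f"pkg-{i}" for i in range(5)),
    "test-channel",
    lambda batch, label: calls.append((label, batch)),
    batch_size=2,
)
print(imported, calls)
```

Since the input is consumed through a generator, at most one batch of parsed packages needs to be in memory at a time.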
Parsed the debian Packages metadata file lazily, yielding the metadata of each package separately.
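A rough sketch of lazy stanza parsing, assuming the usual layout of a Packages file: `Key: value` fields, continuation lines that start with whitespace, and blank lines between packages. `iter_stanzas` and the sample are illustrative only and do not reproduce the parser in this PR.

```python
import io

# Made-up sample with two stanzas.
SAMPLE = """\
Package: foo
Version: 1.0-1
Description: short summary
 continued on a second line

Package: bar
Version: 2.3
"""


def iter_stanzas(lines):
    """Yield one dict per package stanza, reading the input line by line."""
    stanza, key = {}, None
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():  # a blank line ends the current stanza
            if stanza:
                yield stanza
            stanza, key = {}, None
        elif line[0] in " \t" and key:  # continuation of the previous field
            stanza[key] += "\n" + line.strip()
        else:
            key, _, value = line.partition(":")
            stanza[key] = value.strip()
    if stanza:  # the file may not end with a blank line
        yield stanza


stanzas = list(iter_stanzas(io.StringIO(SAMPLE)))
print(stanzas)
```

The same function works on any iterable of lines, for example a file opened with gzip.open(path, "rt") if the metadata is compressed.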
Parsed the debian Translation file, which contains the full descriptions of packages grouped by description-md5, and cached the parsed descriptions in a cache directory.
Using both packages_parser and translation_parser, return the full package metadata, package by package, using lazy parsing. Also set the debian repository's information in a DebRepo class.
Given the channel label, fetch the important repository information from the database and store it in a temporary RepoDTO object.
Added the necessary command line arguments. Identify the target repositories, prepare the data structures, and execute the lazy synchronization of repositories/packages.
Added a new dependency python-gnupg used to verify repo signature.
Ignored two linting complaints about raising exceptions, following the approach in the old reposync. We could improve the code instead of doing this, though.
This commit completes almost all the logic and use cases of the new lazy reposync. **Note** that this commit will be restructured and possibly divided into smaller and more convenient commits. This commit is for review purposes.
Seemingly this error happened because we reached the maximum number of unclosed db connections. We thought that this might be because the close() method in the Database class was not implemented, so rhnSQL.closeDB() was not closing any connection. However, we are still unsure whether this is the root cause of the problem, because the old (current) reposync was using it without any error.
This is the latest and almost the final version of the lzreposync service. (gpg sig check not complete) It contains pretty much all the necessary tests, including the ones for updates/patches import. Some of the remaining 'todos' are either for code enhancements or some unclear concepts that will be discussed with the team. Of course, this commit will be split into smaller ones later after rebase.
- Removed some todos.
- Replaced some sql queries with equivalent ones using JOIN...ON.
- Some other minor cleanup.
Optimized some code by replacing classes and methods with free functions in some places. Consolidated the debian repo parsing.
Completed the gpg signature check for rpm repositories, mainly for the repomd.xml file. This is done by downloading the signature file from the remote rpm repo and executing 'gpg verify' to verify the repomd.xml file against its signature, using the gpg keys already added on the filesystem. So, if you haven't already added the required gpg keyring on your system, you will not be able to verify the repo. You should ideally run this version directly on the uyuni-server, because the gpg keyring will probably be present there.
makedirs() in uyuni.common.fileutils now accepts relative paths that consist of only a directory name or paths with trailing slashes.
Completed the gpg signature check for debian repositories. If you haven't already added the required gpg keyring on your system, you will not be able to verify the repo, and you will normally get a GeneralRepoException. You should ideally run this version directly on the uyuni-server, because the gpg keyring will probably be present there.
Mocked the SPACEWALK_GPG_HOMEDIR value to `~/.gnupg/`, which is the default directory for gpg, in order to execute the gpg tests outside the uyuni-server
Made the lzreposync service continuously loop over the existing channels and synchronize the corresponding repositories. Added a status column in the rhnchannel table to indicate the sync status of a given channel. Also added some helper arguments to the service that allow us to perform test operations, such as creating a test channel and associating repositories with it.
Implemented a first, minimal, working version of the download service, using the download-all strategy: for a given channel, we download all the packages that are linked to that channel. The download directory is hard-coded for now; its location still needs to be discussed.
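To make the download-all strategy concrete, here is a hedged sketch that assumes each package's stored remote path is relative to the repository URL. `download_all`, its parameters, and the example URL are hypothetical and not the service's actual interface. The fetch function is injectable, so the demo runs without network access.

```python
import tempfile
import urllib.request
from pathlib import Path
from urllib.parse import urljoin


def download_all(repo_url, remote_paths, download_dir, fetch=urllib.request.urlretrieve):
    """Fetch every package path of a channel into download_dir.

    Assumes each remote path is relative to repo_url (an assumption of this sketch).
    """
    download_dir = Path(download_dir)
    base = repo_url.rstrip("/") + "/"
    downloaded = []
    for remote_path in remote_paths:
        target = download_dir / remote_path
        if not target.exists():  # skip packages that are already on disk
            target.parent.mkdir(parents=True, exist_ok=True)
            fetch(urljoin(base, remote_path), str(target))
        downloaded.append(target)
    return downloaded


# Demo with a fake fetcher that records the URL and creates an empty file.
requested = []


def fake_fetch(url, filename):
    requested.append(url)
    Path(filename).write_bytes(b"")


paths = download_all(
    "https://example.invalid/repo",
    ["Packages/f/foo-1.0.rpm"],
    tempfile.mkdtemp(),
    fetch=fake_fetch,
)
print(requested, paths)
```

A production version would also need to validate the remote paths and verify checksums after download, which this sketch leaves out.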
72d910d to f504fca
What does this PR change?
The `lzdownload` service will be in charge of using the cached packages' metadata to download the actual binaries (or source rpms). In this PR, we implemented a minimal version of the `download all` strategy, by downloading all the packages of a given channel using their cached metadata.
Usage
And the packages will be downloaded to the specified location in the filesystem:
Independence
We have separated `lzdownload` from `lzreposync` so that each service can run independently of the other. This will help with scaling and with the separation of tasks. We may consider putting some functions used by both services in a common location.
GUI diff
No difference.
Documentation
No documentation needed: only internal and user invisible changes
DONE
Test coverage
ℹ️ If a major new functionality is added, it is strongly recommended that tests for the new functionality are added to the Cucumber test suite
No tests: Unit tests will be added on the fly.
DONE
Links
Issue(s): #
Port(s): # add downstream PR(s), if any
Changelogs
Make sure the changelog entries you are adding comply with https://github.com/uyuni-project/uyuni/wiki/Contributing#changelogs and https://github.com/uyuni-project/uyuni/wiki/Contributing#uyuni-projectuyuni-repository
If you don't need a changelog check, please mark this checkbox:
If you uncheck the checkbox after the PR is created, you will need to re-run `changelog_test` (see below)
Re-run a test
If you need to re-run a test, please mark the related checkbox, it will be unchecked automatically once it has re-run:
Before you merge
Check How to branch and merge properly!