chore: Updates for Metrics release 1.3.0 #118
Merged
* chore: fix the incorrect name in sample settings.yaml
* chore: improve documentation example
chore: Fixes #99 incorrect filename in docs
chore: removed UI elements under the opendata module web interface that allowed setting the page size and searching in results as these are now supported in the API
…on to central server Refs: OPMONDEV-181
Refs: OPMONDEV-181
Refs: OPMONDEV-181
…g tests Refs: OPMONDEV-181
feat: Ability to disable certificate verification when connecting to the central server (CS). docs: update README and collector docs. fix: fix failing anonymizer tests. Refs: OPMONDEV-181
Currently the check `handler is WatchedFileHandler` always fails, so a new handler is added every time _setup_logger is called. For example, corrector logs every line three times. A fix for opendata was added with pull request #14. This commit makes the same change in all the other services.
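The following is a minimal sketch of the kind of check that avoids the duplicate handlers, not the project's actual `_setup_logger`. The function name `setup_logger`, its parameters, and the log format are assumptions made for illustration. The point is that `is` compares a handler instance with the class object, which is never true, whereas `isinstance` tests the handler's type.

```python
import logging
from logging.handlers import WatchedFileHandler


def setup_logger(name: str, log_path: str) -> logging.Logger:
    """Return a logger that has at most one WatchedFileHandler (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # `handler is WatchedFileHandler` compares an instance with the class
    # object and is always False; isinstance() checks the handler's type.
    already_attached = any(
        isinstance(handler, WatchedFileHandler) for handler in logger.handlers
    )
    if not already_attached:
        handler = WatchedFileHandler(log_path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

Calling this helper repeatedly with the same logger name leaves a single file handler attached, so each message is written once.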
Fix duplicate logging
Performing sanitise_document and correct_structure in worker threads instead of the main thread. Computations in the main thread can use only one CPU core and therefore negatively impact corrector throughput.
Deleting duplicates in threads instead of returning to_remove_queue and slowly processing it in the main thread.
Deleting code that was broken and no longer used after corrector started matching documents by xRequestId:
* Removing the check whether a document marked as duplicate exists in clean_data, because duplicates get deleted in any case
* Removing special handling for duplicate documents without requestInTs
Removed the addition of the deleted raw documents count to the total number of documents processed, as these were already part of the processed batch.
After corrector started matching documents by xRequestId, it no longer checked whether a processed document already exists in clean_data. Adding duplicate detection.
Using multiprocessing for updating status of orphans that reached timeout. Fetching only document ids instead of full documents.
Using multiprocessing for processing of documents without xRequestId. Adding documents without xRequestId to total number of documents processed.
Corrector is currently using bleach, which is slow and deprecated (mozilla/bleach#698). Since X-Road metrics should not contain HTML, it is more beneficial to use the Python translate method and replace potentially dangerous HTML characters. Translate does not parse HTML and is estimated to be 100 times faster than bleach. Using the translate method instead of bleach.clean. Renaming sanitise -> sanitize to be consistent with the rest of the code.
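As an illustration of the approach described above, here is a sketch of sanitizing strings with `str.translate`. The set of characters, the choice to replace them with HTML entities, and the helper names `sanitize_string` and `sanitize_value` are assumptions; the actual corrector code may use a different table or replacement.

```python
# Assumed translation table: characters with special meaning in HTML are
# mapped to entities. The real corrector may replace them differently.
_HTML_CHARS = str.maketrans({
    "<": "&lt;",
    ">": "&gt;",
    "&": "&amp;",
    '"': "&quot;",
    "'": "&#x27;",
})


def sanitize_string(value: str) -> str:
    """Replace potentially dangerous HTML characters without parsing HTML."""
    return value.translate(_HTML_CHARS)


def sanitize_value(value):
    """Recursively sanitize strings inside dicts and lists (hypothetical helper)."""
    if isinstance(value, str):
        return sanitize_string(value)
    if isinstance(value, dict):
        return {key: sanitize_value(item) for key, item in value.items()}
    if isinstance(value, list):
        return [sanitize_value(item) for item in value]
    return value
```

Because `translate` maps each input character exactly once, the `&` in an emitted entity is not escaped a second time.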
Corrector optimization
* MongoDB sometimes selects an incorrect index for a query
* A hint helps to avoid unnecessarily slow queries
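A sketch of how an index hint can be attached to a query, assuming a PyMongo-style collection whose `find()` returns a cursor with a `hint()` method. The field name `client.subsystemCode`, the index name `client_subsystem_idx`, and the function `find_by_client` are hypothetical and are not taken from the reports module.

```python
def find_by_client(collection, subsystem_code, index_name="client_subsystem_idx"):
    """Query documents by client subsystem and force a specific index.

    `collection` is expected to behave like a pymongo Collection.
    The field and index names are illustrative assumptions.
    """
    query = {"client.subsystemCode": subsystem_code}
    # hint() tells MongoDB which index to use instead of letting the
    # query planner choose one.
    return collection.find(query).hint(index_name)
```

The hinted index has to exist on the collection; otherwise the server rejects the query.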
As X-Road usually has more service clients than producers, using client subsystem code based index returns fewer rows and performs better
Avoiding unnecessary DB requests to find the list of documents where client and service are the same subsystem. This information can be computed from the document data. Additionally, fixing invalid duplicate detection when client and producer requestInTs are in different report periods: "get_faulty_documents" did not find duplicates in that case.
Reports query optimization using index hints
Some users want to process report data, but the generated PDF is not machine-readable. The CSV format can be easily imported into spreadsheets. Adding an optional configuration parameter for CSV generation. Adding the REPORT_NAME_NO_EXT variable for the email template.
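A minimal sketch of writing report rows as CSV with the standard library. The column names, the sample file name, and the helpers `write_report_csv` and `report_name_no_ext` are assumptions for illustration; the actual report layout and the way REPORT_NAME_NO_EXT is derived are not shown in this pull request text.

```python
import csv
import os


def write_report_csv(rows, fieldnames, stream):
    """Write a header line and one CSV line per report row (hypothetical helper)."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


def report_name_no_ext(report_name: str) -> str:
    """Strip the file extension, e.g. for use in an email template (assumed behavior)."""
    return os.path.splitext(report_name)[0]
```

When writing to a real file, the `csv` module documentation recommends opening it with `newline=""` so that line endings are not translated twice.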
Adding support for CSV reports
Refs: OPMONDEV-182
Refs: OPMONDEV-182
…and ignore report from git Refs: OPMONDEV-182
Refs: OPMONDEV-182
Refs: OPMONDEV-182
…e/github actions) Refs: OPMONDEV-182
…p production modules) Refs: OPMONDEV-182
Refs: OPMONDEV-182
Refs: OPMONDEV-182
…onsistency and readability Refs: OPMONDEV-182
Refs: OPMONDEV-182
Refs: OPMONDEV-182
Refs: OPMONDEV-182
Refs: OPMONDEV-182
chore: Update dependencies, apply fixes (code, and tox), and update docs
Bumps the actions-update group in /.github/workflows with 2 updates: [actions/checkout](https://github.com/actions/checkout) and [actions/setup-python](https://github.com/actions/setup-python).

Updates `actions/checkout` from 3 to 4
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v3...v4)

Updates `actions/setup-python` from 4 to 5
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions-update
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions-update
...

Signed-off-by: dependabot[bot] <support@github.com>

Bumps the python-minor-patch group with 1 update in the /corrector_module directory: [freezegun](https://github.com/spulec/freezegun).

Updates `freezegun` from 1.0.0 to 1.5.1
- [Release notes](https://github.com/spulec/freezegun/releases)
- [Changelog](https://github.com/spulec/freezegun/blob/master/CHANGELOG)
- [Commits](spulec/freezegun@1.0.0...1.5.1)

---
updated-dependencies:
- dependency-name: freezegun
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: python-minor-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Refs: OPMONDEV-185
…r_module/python-minor-patch-b8c63cc412
…ns/dot-github/workflows/actions-update-f039b2dc45
chore: bump to version 1.3.0
Refs: OPMONDEV-185
Refs: OPMONDEV-185
doc: fix typos and broken links
raits approved these changes on Jun 28, 2024
Looks good to me.
Refs: OPMONDEV-185