[NVIDIA GPU] Introduce Monitoring Integration #11931
The documentation for the deprecation of fields indicates the following correspondences:

| old | new |
| --- | --- |
| is_synthetic_quarantine_disposition | pattern_disposition* to identify quarantined files |
| has_script_or_module_ioc | ioc_context |
| ioc_values | ioc_value |

However, there is no other information about how these correspond with each other. By inspection of documents from an alerts stream, we can see that pattern_disposition_details contains a quarantine_file boolean. This, with the text in the deprecation notice, hints that we can use this field to get the is_synthetic_quarantine_disposition. The ioc_context field contains an array of objects with a type property which, in the examples I have available, includes (only) "module", hinting that this can be used to detect the state corresponding to has_script_or_module_ioc. Finally, ioc_value fields are sprinkled around the documents, so collect them into ioc_values.

The test case is derived from the first case, but with deprecated fields removed.
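The mapping described above could be sketched roughly as follows. This is an illustration in Python rather than the ingest pipeline code from the commit; the function names are hypothetical, and treating a `"script"` type the same way as `"module"` is an assumption, since only `"module"` appears in the available examples.

```python
def collect_ioc_values(node, found=None):
    """Walk a nested document and gather every `ioc_value` string, in order, without duplicates."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "ioc_value" and isinstance(value, str):
                if value not in found:
                    found.append(value)
            else:
                collect_ioc_values(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_ioc_values(item, found)
    return found


def derive_deprecated_fields(alert):
    """Rebuild the three deprecated fields from their documented replacements."""
    details = alert.get("pattern_disposition_details") or {}
    contexts = alert.get("ioc_context") or []
    return {
        # quarantine_file is the boolean observed in pattern_disposition_details.
        "is_synthetic_quarantine_disposition": bool(details.get("quarantine_file", False)),
        # Assumption: "script" is accepted alongside the observed "module" type.
        "has_script_or_module_ioc": any(
            isinstance(c, dict) and c.get("type") in ("script", "module")
            for c in contexts
        ),
        "ioc_values": collect_ioc_values(alert),
    }
```

An alert with no `pattern_disposition_details` or `ioc_context` yields `False` for both booleans and an empty `ioc_values` list.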
…fields from debug data (elastic#11396)
…ls to index (elastic#11372) The event.action field is an implementation detail that has an unfortunate name that could mislead users; the values held for entities do not relate to security details, but only to internal accounting. So remove them.
…stic#11397) * Release aws package with (and add missing data) * Update changelog PR link
…tic#11400)

* Bump github.com/elastic/elastic-package from 0.104.0 to 0.105.0

Bumps [github.com/elastic/elastic-package](https://github.com/elastic/elastic-package) from 0.104.0 to 0.105.0.
- [Release notes](https://github.com/elastic/elastic-package/releases)
- [Changelog](https://github.com/elastic/elastic-package/blob/main/.goreleaser.yml)
- [Commits](elastic/elastic-package@v0.104.0...v0.105.0)

---
updated-dependencies:
- dependency-name: github.com/elastic/elastic-package
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Remove white line in vsphere README
* Added deployment_mode and properties
* Added owner info in Elastic Connector
* review fixes
* remove duplicated changelog entries and update missed team handle
* [elastic_connectors] Update manifest version

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jaime Soriano Pastor <jaime.soriano@elastic.co>
Co-authored-by: Sean Rathier <sean.rathier@elastic.co>
Co-authored-by: Maxim Kholod <maxim.kholod@elastic.co>
Co-authored-by: Mario Rodriguez Molins <mario.rodriguez@elastic.co>
…line errors (elastic#11112) Also fix instances of incorrect yaml for script processors.
…logo (elastic#11407)

* add a link to the onboarding flow, fix the package logo

Signed-off-by: Tetiana Kravchenko <tetiana.kravchenko@elastic.co>

* fix pr link

Signed-off-by: Tetiana Kravchenko <tetiana.kravchenko@elastic.co>

* dashboard: add filter to statefulset vizualisation

Signed-off-by: Tetiana Kravchenko <tetiana.kravchenko@elastic.co>

---------

Signed-off-by: Tetiana Kravchenko <tetiana.kravchenko@elastic.co>
* add extended space metrics
* update changelog
* update readme
* address review comments.
* address review comments
* update dashboards

---------

Co-authored-by: Niraj Rathod <niraj.rathod@crestdatasys.com>
* reverting session_data toggle
* updating PR changelog
* fixing change type
* reverting kibana changes
Remove the handlebars templating into the CEL code and celfmt.
* Fix AWS Bedrock documentation and dashboard issues
Bumps [github.com/cli/go-gh/v2](https://github.com/cli/go-gh) from 2.10.0 to 2.11.0.
- [Release notes](https://github.com/cli/go-gh/releases)
- [Commits](cli/go-gh@v2.10.0...v2.11.0)

---
updated-dependencies:
- dependency-name: github.com/cli/go-gh/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Move pricing info to the top of the table
* Update kinesis
* Update apigateway
* Update billing
* Add reference to pricing info
* Update aws package
* Update aws-bedrock package
* Update changelog for AWS package
* Update changelog for AWS_Bedrock package
* Fix broken link
* Update manifest
* Update packages/aws/manifest.yml

Co-authored-by: Ishleen Kaur <102962586+ishleenk17@users.noreply.github.com>

* Fix manifest version
* Update manifest after conflicts fix

---------

Co-authored-by: Ishleen Kaur <102962586+ishleenk17@users.noreply.github.com>
* update readme * update manifest and changelog
…et handling (elastic#11422) We cannot guarantee the shape of body.message and body.message_detail, so just include the known type resp.Body which should be short in the case of a non-200 response. The next_offset value is documented to be an array of two elements which must be used to construct the parameter by concatenation with a separating comma[1]. [1] https://duo.com/docs/adminapi#authentication-logs
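As a small illustration of the `next_offset` handling described above, the query parameter could be built like this. The sketch is in Python rather than the CEL used by the integration, and `next_offset_param` is a hypothetical helper name.

```python
def next_offset_param(next_offset):
    """Join the two-element next_offset array into the comma-separated query parameter."""
    if not isinstance(next_offset, (list, tuple)) or len(next_offset) != 2:
        raise ValueError("next_offset is expected to be an array of two elements")
    # Elements are converted with str() in case a numeric timestamp is returned.
    return ",".join(str(part) for part in next_offset)
```

Rejecting any other shape up front keeps a malformed response from silently producing a bad request on the next page.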
…tic#11439)

Bumps [github.com/elastic/elastic-package](https://github.com/elastic/elastic-package) from 0.105.0 to 0.106.0.
- [Release notes](https://github.com/elastic/elastic-package/releases)
- [Changelog](https://github.com/elastic/elastic-package/blob/main/.goreleaser.yml)
- [Commits](elastic/elastic-package@v0.105.0...v0.106.0)

---
updated-dependencies:
- dependency-name: github.com/elastic/elastic-package
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…1435) * Update package version * Update package version and changelog
In elastic#11240 we attempted to fix preparation of endTime where it could be more than 24 hours after the startTime parameter, a case that is invalid for the API. To do this, we just always added 24 hours to the startTime on the basis that the API would crop at the present moment and, notably, under the assumption that the returned document would be used to construct the next startTime. Clearly this is not the case, since we use the actual endTime parameter to construct the next startTime; the API does not provide an HTTPJSON-easy way to get the last timestamp. So set the endTime query parameter to the earliest of now and 24 hours after startTime.

Queries with an end time of now may not get all documents up to now. Since we are using the query parameter to define the end of the range of documents that have definitively been retrieved, this means that we will never try to get these missed documents. We could use the maximum timestamp of the collected documents to define this range, but that is less easy to do here, so just set the end of the time range to be some short period before now. Ten seconds is chosen as a conservative value (the report claimed milliseconds of documents were lost).
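The end-time rule described above can be sketched as follows. This is a Python illustration, not the HTTPJSON configuration from the commit; `compute_end_time` and the constant names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_WINDOW = timedelta(hours=24)        # the API rejects ranges longer than 24 hours
SAFETY_MARGIN = timedelta(seconds=10)   # stay slightly behind "now" to avoid missing late documents


def compute_end_time(start_time, now=None):
    """Return the earlier of (now - 10s) and (start_time + 24h)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return min(now - SAFETY_MARGIN, start_time + MAX_WINDOW)
```

Because the returned value is also used as the next startTime, keeping it ten seconds behind the present means documents that arrive slightly late still fall inside a later query window.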
The Duo Admin API has rate limiting. It doesn't return rate limit headers, but it does enforce limits with HTTP 429 responses. For some endpoints, the API documentation specifies "a rate limit of 50 calls per minute". This has also been observed on the authentication logs endpoint. This change sets a limit of 0.5 calls/second, or 30 calls per minute, for all data streams and inputs. HTTP 429 responses continue to be treated as errors.

API documentation: https://duo.com/docs/adminapi

---------

Co-authored-by: Dan Kortschak <dan.kortschak@elastic.co>
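To make the arithmetic concrete: 0.5 calls/second means at least two seconds between requests, which gives 30 calls per minute and stays under the documented 50. The sketch below is a generic client-side limiter in Python, not the rate-limit setting the integration actually uses; the `MinIntervalLimiter` class and its parameters are hypothetical.

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum spacing between calls, e.g. 0.5 calls/second -> 2 s apart."""

    def __init__(self, calls_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / calls_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no call has been made yet

    def wait(self):
        """Block until the next call is allowed; return the number of seconds waited."""
        now = self._clock()
        waited = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            waited = self._next_allowed - now
            self._sleep(waited)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return waited
```

Calling `limiter.wait()` before each request is enough; the clock and sleep functions are injectable only so the behavior can be checked without real delays.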
…>=8.16.0 (elastic#11413)

* make Asset Inventory compatible going forward
* update manifest and changelog
* Update changelog.yml
* Update manifest.yml
* Bump up version * update changelog * update manifest version
…ic#11437) * lowercase host.name for cloud_secrity_posture * add PR link to changelog
* remove_events * updating size
In the auth CEL program, the error `type conversion error from 'string' to 'int'` seems to have been happening because of `cursor.last_published` being set to a value of the form `1532951895000,af0ba235-0b33-23c8-bc23-a31aa0231de8`, which can't be parsed as an int. Now `cursor.last_published` is replaced with `cursor.last_timestamp_ms`, which is taken from the last result, and so will be available if the last page of a sequence has results but no value in `response.metadata.next_offset`. Also, the date is no longer shared across requests, the request building is simplified, and redundant overrides of state are removed.

Related documentation: https://duo.com/docs/adminapi#authentication-logs
* Add required permissions for AWS custom logs * Update changelog and manifest
…stic#11922) Made with ❤️️ by updatecli Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…stic#11918) * Added support for enterprise audit logs in the audit log data stream.
* Fix broken link to sql module example * Update changelog and manifest
* Replace 8.15 with 8.13 * Replace 8.15 with 8.13 * Update changelog and manifest
The HTTP Headers field (`panw.panos.http_headers`) of the incoming data is incorrectly escaped. This will be fixed if necessary before CSV parsing. Map the file name value in the URL/Filename (`panw.panos.misc`) field for the `wildfire` and `wildfire-virus` sub-types.
Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as
Hi! This PR has been stale for a while and we're going to close it as part of our cleanup procedure. We appreciate your contribution and would like to apologize if we have not been able to review it, due to the current heavy load of the team. Feel free to re-open this PR if you think it should stay open and is worth rebasing. Thank you for your contribution!
💔 Build Failed
Replaced by #12768
Proposed commit message
Introduce NVIDIA GPU Monitoring Integration
Checklist
changelog.yml file.
Author's Checklist
How to test this PR locally
Deploy NVIDIA DCGM on a device with an NVIDIA GPU to get a Prometheus metrics endpoint that you can provide to the integration.
If you have Docker, this just requires:
Configure the integration to point at the host running the container and GPU:
http://nvidiahost:9400/metrics
Some metrics are not enabled by default with the container; enabling all metrics requires some extra steps.
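Before configuring the integration, it can help to confirm that the endpoint actually serves metrics. The following is a minimal sketch in Python that fetches the URL above and parses the Prometheus text format with a simplified regular expression; `parse_metrics` and `fetch_metrics` are hypothetical helpers, and the parser handles only plain `name{labels} value` sample lines, not the full exposition format.

```python
import re
from urllib.request import urlopen

# Matches `name{label="value",...} value` or `name value`; anything after the value is ignored.
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from Prometheus text output."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE comments
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url="http://nvidiahost:9400/metrics", timeout=5):
    """Fetch and parse the metrics endpoint; the default URL is the one from this PR."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Printing the distinct metric names from `fetch_metrics()` gives a quick view of which metrics the container exposes by default.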
Related issues
Fixes #11930
Screenshots
WIP:
[Image: Screenshot 2024-11-30 at 3 35 33 PM]
[Image: Screenshot 2024-11-30 at 3 35 44 PM]
[Image: Screenshot 2024-11-30 at 3 35 56 PM]
[Image: Screenshot 2024-11-30 at 3 36 03 PM]