specs - profiling integration: Make host.id in registration message o… #900
Merged
Commits:
543b18d specs - profiling integration: Make host.id in registration message o… (florianl)
db9108d Update specs/agents/universal-profiling-integration.md (florianl)
64941db Update specs/agents/universal-profiling-integration.md (florianl)
02432a6 Update specs/agents/universal-profiling-integration.md (florianl)
@@ -179,8 +179,8 @@ All messages have the following layout:

 ## Profiler Registration Message

-Whenever the profiling host agent starts communicating for the first time with a process running an APM Agent, it MUST send this message.
-This message is used to let the APM-agent know that a profiler is actually active on the current host. Note that an APM-agent may receive this message zero, one or several times: this may happen if no host agent is active, if one is active or if a host agent is restarted during the lifetime of the APM-agent respectively.
+Whenever the profiling agent starts communicating for the first time with a process running an APM Agent, it MUST send this message.
+This message is used to let the APM-agent know that a profiler is actually active on the current host. Note that an APM-agent may receive this message zero, one or several times: this may happen if no profiling agent is active, if one is active or if a profiling agent is restarted during the lifetime of the APM-agent respectively.

 The *message-type* is `2` and the current *minor-version* is `1`.

@@ -190,8 +190,8 @@ Name | Data type
 samples-delay-ms | uint32
 host-id | utf8-str

-* *samples-delay-ms*: A sane upper bound of the usual time taken in milliseconds by the profiling host agent between the collection of a stacktrace and it being written to the apm-agent via the [messaging socket](#cpu-profiler-trace-correlation-message). The APM-agent will assume that all profiling data related to a span has been written to the socket if a span ended at least the provided duration ago. Note that this value doesn't need to be a hard a guarantee, but it should be the 99% case so that profiling data isn't distorted in the expected case.
-* *host-id*: The [`host.id` resource attribute](https://opentelemetry.io/docs/specs/semconv/attributes-registry/host/) used for the profiling data by this profiling host agent. If an APM-agent is already sending a `host.id` it MUST print a warning if the `host.id` is different and otherwise ignore the value received by the host agent. A mismatch will lead to certain correlation features (e.g. cost and CO2 consumption) not working. If an agent does not collect the `host.id` by itself, it MUST start sending the `host.id` after receiving it from the profiler host agent to ensure aforementioned correlation features work correctly.
+* *samples-delay-ms*: A sane upper bound of the usual time taken in milliseconds by the profiling agent between the collection of a stacktrace and it being written to the apm-agent via the [messaging socket](#cpu-profiler-trace-correlation-message). The APM-agent will assume that all profiling data related to a span has been written to the socket if a span ended at least the provided duration ago. Note that this value doesn't need to be a hard a guarantee, but it should be the 99% case so that profiling data isn't distorted in the expected case.
florianl marked this conversation as resolved.
+* *host-id*: The [`host.id` resource attribute](https://opentelemetry.io/docs/specs/semconv/attributes-registry/host/) is an optional argument (the string may have a length of zero) used to correlate profiling data by the profiling agent. If an APM-agent is already sending a `host.id` it MUST print a warning if the `host.id` is different and otherwise ignore the value received by the profiling agent. A mismatch will lead to certain correlation features (e.g. cost and CO2 consumption) not working. If an APM-agent does not collect the `host.id` by itself, it MUST start sending the `host.id` after receiving a non-empty `host.id` from the profiling agent to ensure aforementioned correlation features work correctly.
Suggested change
Not including this change for the same reason as with https://github.com/elastic/apm/pull/900/files#r1901586158.


 ## CPU Profiler Trace Correlation Message

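The `host-id` rules in the hunk above can be sketched as a small decision function. This is a minimal illustration, not the agents' actual implementation: the function name `resolve_host_id`, the logger name, and the choice to treat an empty `host-id` as "not provided" (and therefore not to warn about it) are assumptions made for this example. Decoding the message from the socket is left out, because the message layout is not part of this diff.

```python
import logging
from typing import Optional

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("apm.profiling_integration")


def resolve_host_id(own_host_id: Optional[str], received_host_id: str) -> Optional[str]:
    """Return the host.id the APM agent should send after a registration message.

    own_host_id: the host.id the agent already collects itself, or None.
    received_host_id: the host-id field of the registration message (may be empty).
    """
    if own_host_id:
        # The agent already sends a host.id: keep it and warn on a mismatch.
        # Assumption: an empty received value means "not provided", so no warning.
        if received_host_id and received_host_id != own_host_id:
            logger.warning(
                "host.id mismatch: agent uses %r, profiling agent reported %r",
                own_host_id,
                received_host_id,
            )
        return own_host_id
    # The agent collects no host.id itself: adopt a non-empty value from the profiler.
    return received_host_id or None
```

For example, an agent without its own `host.id` that receives `"abc"` would start sending `"abc"`, while an agent that already sends `"abc"` and receives `"xyz"` would log a warning and keep `"abc"`.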
@@ -236,7 +236,7 @@ For example, if for a single transaction the following correlation messages are

 the resulting transaction MUST have the OpenTelemetry attribute `elastic.profiler_stack_trace_ids` with a value of (elements in any order) `[YLQguzhR2dR6y5M9vnA5mw, YLQguzhR2dR6y5M9vnA5mw, TJMmu5gF-o-FiCwS6uckzg, YLQguzhR2dR6y5M9vnA5mw]`.

-Note that the [correlation messages](#cpu-profiler-trace-correlation-message) will arrive delayed relative to when they were sampled due to the processing delay of the profiling host agent and the transfer over the domain socket. APM agents therefore MUST defer sending ended transactions until they are relatively confident that all correlation messages for the transaction have arrived.
+Note that the [correlation messages](#cpu-profiler-trace-correlation-message) will arrive delayed relative to when they were sampled due to the processing delay of the profiling agent and the transfer over the domain socket. APM agents therefore MUST defer sending ended transactions until they are relatively confident that all correlation messages for the transaction have arrived.

 * When a [profiler registration message](#profiler-registration-message) has been received, APM agents SHOULD use the duration from that message as delay for transactions
 * If no [profiler registration message](#profiler-registration-message) has been received yet, APM agents SHOULD use a default of one second as reasonable default delay.

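The two delay rules in the hunk above could be expressed roughly as follows. The names `DEFAULT_DELAY_MS`, `correlation_delay_ms`, and `ready_to_send` are hypothetical and exist only for this sketch; the one-second default and the use of `samples-delay-ms` come from the spec text.

```python
from typing import Optional

# Default delay of one second, used until a registration message has been received.
DEFAULT_DELAY_MS = 1000


def correlation_delay_ms(samples_delay_ms: Optional[int]) -> int:
    """Pick the delay to wait after a transaction ends.

    samples_delay_ms is the value from the latest profiler registration
    message, or None if no such message has been received yet.
    """
    if samples_delay_ms is None:
        return DEFAULT_DELAY_MS
    return samples_delay_ms


def ready_to_send(end_time_ms: int, now_ms: int, samples_delay_ms: Optional[int]) -> bool:
    """True once the transaction ended at least the chosen delay ago."""
    return now_ms - end_time_ms >= correlation_delay_ms(samples_delay_ms)
```

With no registration message, a transaction that ended at `t = 0` would be held until `t = 1000` ms; after a registration message with `samples-delay-ms = 250`, it would be held only until `t = 250` ms.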
@@ -262,4 +262,4 @@ OpenTelemetry based agents SHOULD use the following configuration options:

 * `ELASTIC_OTEL_UNIVERSAL_PROFILING_INTEGRATION_BUFFER_SIZE`

-The size of the FIFO queue [used to buffer transactions](#correlation-attribute) until all correlation data has arrived. Should have a reasonable default to sustain typical transaction per second rates while not occupying too much memory in edge cases (e.g. 8096).
+The size of the FIFO queue [used to buffer transactions](#correlation-attribute) until all correlation data has arrived. Should have a reasonable default to sustain typical transaction per second rates while not occupying too much memory in edge cases (e.g. 8096).
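
A bounded FIFO buffer of the kind this option configures might look like the sketch below. The class `TransactionBuffer` and its methods are hypothetical, and the behaviour when the buffer is full (the caller is told the transaction was not buffered and would presumably send it without waiting) is an assumption, since this diff does not say what agents do in that case. The capacity default of 8096 is the example value from the spec text.

```python
from collections import deque
from typing import Deque, List, Tuple


class TransactionBuffer:
    """Holds ended transactions in FIFO order until their delay has elapsed."""

    def __init__(self, capacity: int = 8096, delay_ms: int = 1000) -> None:
        self._capacity = capacity
        self._delay_ms = delay_ms
        # Each entry is (end time in ms, transaction id), oldest first.
        self._queue: Deque[Tuple[int, str]] = deque()

    def offer(self, end_time_ms: int, transaction_id: str) -> bool:
        """Buffer a transaction; return False if the buffer is full.

        Assumption: on False the caller sends the transaction immediately.
        """
        if len(self._queue) >= self._capacity:
            return False
        self._queue.append((end_time_ms, transaction_id))
        return True

    def drain_ready(self, now_ms: int) -> List[str]:
        """Remove and return the ids of transactions whose delay has elapsed."""
        ready: List[str] = []
        while self._queue and now_ms - self._queue[0][0] >= self._delay_ms:
            ready.append(self._queue.popleft()[1])
        return ready
```

Because transactions end in roughly chronological order, checking only the head of the queue is enough to find everything that is ready, which keeps each drain proportional to the number of transactions released.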
For consistency, alternatively feel free to use APM-Agent everywhere.

The specification uses a mix of `apm-agent`, `APM agent` and `APM-agent`. Streamlining the wording of the whole specification just because part of it changed in this PR does not feel right, so I will keep the current mix of wording rather than fold that cleanup into this change.