-
Notifications
You must be signed in to change notification settings - Fork 439
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[azure logs] enable azure-eventhub input v2 #12802
base: main
Are you sure you want to change the base?
Conversation
🚀 Benchmarks reportTo see the full report comment with |
Pinging @elastic/security-service-integrations (Team:Security-Service Integrations) |
@@ -480,6 +501,47 @@ Examples: | |||
|
|||
This setting can also be used to define your own endpoints, like for hybrid cloud models. | |||
|
|||
### Input v2 settings (advanced) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is "input v2" and "integration v2" used in this document. Are they referring to the same thing? If so, I'd suggest deciding on one and using it consistently. If not, add some text briefly but clearly explaining their relationship.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm glad you asked because I need a perspective from someone unfamiliar with these things.
"input v2" and "integration v2" are two different things.
Let me share more; probably, you can help me with this naming problem.
- Integration v2: is a new integration w/ routing within the Azure Logs package; we designed it to replace all other integrations with only one input per event hub instead of multiple ones.
- Input v2: is the new version of the Filebeat azure-eventhub input that uses the new Event Hub SDK.
The azure-eventhub input supports the legacy (v1) and the modern (v2) SDK and can switch from using a config option. See https://github.com/elastic/beats/blob/cd883f511c3cc5664a15f0fd66fe6a83ae655c27/x-pack/filebeat/input/azureeventhub/input.go#L70-L77 for more.
Even if some users know the internals, we usually do not mention inputs in integration docs. However, I would love to convey that they can stay with the old v1 'engine' or pick the new v2.
Instead of 'input,' I considered using the name 'processor.' The legacy and modern SDKs have the concept of 'processor,' so it makes sense to use it.
Let me know which name you think is clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that makes sense.
Not a direct answer, but when I am talking to others who don't deal with the internals, I use the term "collector" to describe inputs because it explains what the function of the entity is. This term is currently unladen in the context of beats and so is reasonably safe (though this may change due to OTel, and adding the "data" prefix is probably a good idea). To maintain this agnosticism, you'd try to avoid putting it in a setting where it gains a life, so for the heading something like "New data collector settings (advanced)" or "Version 2 data collector sett…" or "Azure SDK v2 data collec…" (versus "Legacy data coll…").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uhm, "data collector" is a good name; I regret not considering it when we added the new Event Hub SDK in Beats.
We added the "processor" concept to the azure-eventhub input in Beats 8.15 to have both legacy v1 and modern v2 side by side. I feel this "processor" has become part of the API, and we may need to stick to it.
Having a versioned input component should stay an advanced thing. I anticipate the processor v2 graduating as the default processor in 6-9 months as we add more v2-only features like Entra ID authentication options (service principal, workload identity, etc).
Is "processor" a better component name than "input," and is it good enough for an advanced scope setting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern about "processor" is that it's overloaded with these. But, having said that, I have no answers, only questions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposal. I removed the 'input' concept from the docs and replaced it with 'processor'.
I removed 'input' because technically, we have one input, with a configurable internal component called 'processor.'
Both legacy and modern SDKs identify the 'processor' as the internal component responsible for processing messages fetched by event hub partition consumers.
Co-authored-by: Dan Kortschak <dan.kortschak@elastic.co>
Co-authored-by: Dan Kortschak <dan.kortschak@elastic.co>
@@ -20,9 +20,33 @@ storage_account: {{storage_account}} | |||
{{#if storage_account_key}} | |||
storage_account_key: {{storage_account_key}} | |||
{{/if}} | |||
{{#if storage_account_connection_string}} | |||
: {{storage_account_connection_string}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we not processing the storage_account_connection_string
through user input?
If this values doesn't come as input from the user, Then we could remove the IF
condition. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense.
We can change the UI elements once when we'll add the Entra ID authentication options for the processor v2.
packages/azure/docs/events.md
Outdated
|
||
The input v2 is in preview. Input v1 is still the default and is recommended for typical use cases. | ||
|
||
See the "Settings" section for more details about enabling the input v2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should remove the "(advanced)" from the title, and only leave "input/processor v2" only settings.
packages/azure/docs/events.md
Outdated
_boolean_ | ||
(v2 only) Flag to control if the input should perform the checkpoint information migration from v1 to v2 at startup. The checkpoint migration converts the checkpoint information from the v1 format to the v2 format. | ||
|
||
Default is `false`, which means the input will not perform the checkpoint migration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
We offer no storage_account_connection_string setting option in the UI yet. We'll revise the UI once for the upcoming Entra ID auth settings.
Introduces the concept of processor as a configurable option available starting from stack 8.15.1 and integration 1.23.0. I removed 'input' because technically, we have one input, with a a configurable internal component called 'processor.' Both legacy and modern SDKs identify the 'processor' as the internal component responsible for processing messages fetched by event hub partition consumers.
💚 Build Succeeded
History
cc @zmoog |
|
Proposed commit message
The PR adds the new
processor_version
configuration option (an related ones) to enable the azure-eventhub input v2 in the Azure Logs integration v2 preview.storage_account_connection_string
config option (required for input v2); the policy template builds a default value using thestorage_account_key
but also offers an override option.The input v2 uses the modern Event Hub SDK.
Notes for reviewers
In this PR, I am adding the input v2 settings to the advanced section of the integration v2 only (
events
data stream).The goal is to enable only input v2 for the data stream that can avoid contention among inputs.
However, nothing can stop users from enabling v2 AND one or more v1 integrations. So, adding the input v2 settings to the global scope is better for simplicity.
I would love to hear your thoughts about placing the input v2 settings: data stream vs. global level.
Checklist
changelog.yml
file.Author's Checklist
How to test this PR locally
elastic-package
(min stack version 8.15.1)processor_version
tov2
Related issues
Screenshots