-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OTAGENT-254 Add support for enhanced RBAC permissions for otel-agent #1693
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good to me, I think you have a typo, but that's about it.
charts/datadog/ci/agent-otel-collector-with-rbac-custom-rules-values.yaml
Outdated
Show resolved
Hide resolved
@@ -0,0 +1,40 @@ | |||
{{- if and .Values.agents.rbac.create (eq (include "should-enable-otel-agent" .) "true") .Values.datadog.otelCollector.rbac.create -}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to reviewers: should-enable-otel-agent
defined here
c9ff915
to
a9632ef
Compare
63669c7
to
8338179
Compare
8338179
to
c27e362
Compare
539855a
to
54c9632
Compare
ef4cdae
to
cc84e27
Compare
de36f4c
to
72d38b7
Compare
/merge |
View all feedbacks in Devflow UI.
The median merge time in
|
What this PR does / why we need it:
Add support for additional RBAC permissions required by
k8sattributes
processor.RBAC Permission Issues in otel-agent
The OpenTelemetry Collector’s k8sattributes processor requires specific RBAC permissions to connect to the Kubernetes API server in order to enrich telemetry data with Kubernetes metadata. However, the default Service Account created by the Helm chart doesn't have these permissions: node agent (deployed as DaemonSet) retrieves pod information from the Kubelet
/pod
endpoints. This approach avoids overloading the Kubernetes API server in large clusters, but leads to errors in otel-agent when k8sattributes enabled: (Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User <serviceAccount> cannot list resource "pods" in API group "" at the cluster scope
)Proposed Solution
Create the necessary RBAC ClusterRole and ClusterRoleBinding for the
k8sattributes
processor. The creation logic will be tied to a new boolean parameter –datadog.otelCollector.rbac.create
, which, when set totrue
, enables the chart to inspect the OTel configuration. If anyk8sattributes
processor is set topassthrough: false
(i.e., it needs to call the Kubernetes API), the Helm chart will generate the required RBAC resources.Implementation Details
datadog.otelCollector.rbac.create
(true
|false
).agents.rbac.create: true
anddatadog.otelCollector.rbac.create: true
, the Helm chart checks the OTel Collector configuration for anyk8sattributes
processors configured withpassthrough
option disabled (either by omitting it or by explicitly setting itpassthrough: false
).datadog.otelCollector.rbac.rules: []
agents.rbac.create:true
anddatadog.otelCollector.rbac.create:true
, anddatadog.otelCollector.rbac.rules
provided, these rules are added to theotel-agent
ClusterRole.Special notes for your reviewer:
Why Cluster Role for additional RBAC permissions?
With just a role binding, the
k8sattributes
processor cannot query metadata such as labels and annotations from k8snodes
andnamespaces
which are cluster-scoped objects. This also means that the processor cannot set the value fork8s.cluster.uid
attribute if enabled, since thek8s.cluster.uid
attribute is set to the uid of the namespacekube-system
which is not queryable with namespaced rbac.Why introduce a separate
datadog.otelCollector.rbac
parameter vs reusingagents.rbac
?Note: technically, any ClusterRole and ClusterRoleBinding introduced under
datadog.otelCollector.rbac
will apply to the entire pod housing the Datadog Agent with Embedded OTel Collector. While this may not perfectly map to the container-based naming convention, it provides a convenient and explicit toggle for customers. Existing precedent – datadog.secretBackend.roles parameter: scoped to Secret Backend feature; changes affect the default Service Account, and thus additional permissions available to all containers in the node agent pod.Checklist
.github/helm-docs.sh
)CHANGELOG.md
has been updatedREADME.md