Enable system-probe on GKE Autopilot #1453

Merged
merged 31 commits into from
Feb 28, 2025
31 commits
0affc11
Changes for system-probe on GKE Autopilot
hmahmood Jul 11, 2024
f57c180
Merge remote-tracking branch 'origin/main' into hasan.mahmood/system-…
hmahmood Jul 16, 2024
64f9b2f
Fix os-release mounts
hmahmood Jul 16, 2024
83c3f9e
Update version and changelog
hmahmood Jul 16, 2024
9e2d995
Update README
hmahmood Jul 17, 2024
560e258
Merge remote-tracking branch 'origin/main' into hasan.mahmood/system-…
hmahmood Jul 17, 2024
25caa09
Merge remote-tracking branch 'origin/main' into hasan.mahmood/system-…
hmahmood Sep 3, 2024
2bbe0fb
Bump chart to 3.71
hmahmood Sep 4, 2024
055cbb8
Merge branch 'main' into hasan.mahmood/system-probe-autopilot
fanny-jiang Jan 29, 2025
b746a4e
Minor fixes for WorkloadAllowlist (#1677)
fanny-jiang Jan 30, 2025
b0dc898
Fix
hmahmood Jan 30, 2025
4a8cf81
Merge remote-tracking branch 'origin/main' into hasan.mahmood/system-…
hmahmood Jan 30, 2025
e1c96c2
Enable apparmor profile on gke autopilot for system-probe
hmahmood Feb 6, 2025
90b0c78
Enable securityContext on gke autopilot
hmahmood Feb 6, 2025
eead5ac
Merge remote-tracking branch 'origin/main' into hasan.mahmood/system-…
hmahmood Feb 6, 2025
700f22f
Fix npm for autopilot and GDC (#1679)
fanny-jiang Feb 7, 2025
50f0ebe
Fix
hmahmood Feb 7, 2025
0836fe5
fix changelog and add note about required GKE version
fanny-jiang Feb 10, 2025
d687f18
Merge branch 'main' into hasan.mahmood/system-probe-autopilot
fanny-jiang Feb 10, 2025
099687d
update baselines
fanny-jiang Feb 10, 2025
b6875e3
fix changelog
fanny-jiang Feb 10, 2025
ecbb147
Merge remote-tracking branch 'origin/main' into hasan.mahmood/system-…
hmahmood Feb 24, 2025
021c926
Revert unnecessary changes
hmahmood Feb 24, 2025
6ac2dd7
Update helm docs
hmahmood Feb 24, 2025
f4f4bbc
Update min version
hmahmood Feb 25, 2025
ee3c93b
Fix tests
hmahmood Feb 25, 2025
12e19ca
Merge branch 'main' into hasan.mahmood/system-probe-autopilot
fanny-jiang Feb 27, 2025
1259c72
Merge branch 'main' into hasan.mahmood/system-probe-autopilot
fanny-jiang Feb 27, 2025
64df0bc
Handle older GKE versions (#1720)
fanny-jiang Feb 28, 2025
3ff2fa1
Merge branch 'main' into hasan.mahmood/system-probe-autopilot
fanny-jiang Feb 28, 2025
caf9e65
bump chart version
fanny-jiang Feb 28, 2025
4 changes: 4 additions & 0 deletions charts/datadog/CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,9 @@
# Datadog changelog

## 3.100.0

* Enable `system-probe` container on GKE Autopilot (requires GKE 1.32.1-gke.1729000 or later).

## 3.99.0

* Upgrade default Agent version to `7.63.2`.
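
The 3.100.0 entry ties `system-probe` on GKE Autopilot to a minimum GKE version. As a rough illustration of what "1.32.1-gke.1729000 or later" means when comparing version strings, here is a minimal Go sketch. The names `gkeVersion`, `parseGKEVersion`, and `atLeast` are hypothetical and not part of the chart; the chart itself gates the feature through the `gke-autopilot-workloadallowlists-enabled` helper rather than by parsing versions like this.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// gkeVersion holds the numeric parts of a string such as "1.32.1-gke.1729000".
// Hypothetical type, for illustration only.
type gkeVersion struct {
	major, minor, patch, build int
}

// parseGKEVersion accepts an optional leading "v" and requires the "-gke." suffix.
func parseGKEVersion(s string) (gkeVersion, error) {
	var v gkeVersion
	s = strings.TrimPrefix(s, "v")
	core, build, found := strings.Cut(s, "-gke.")
	if !found {
		return v, fmt.Errorf("missing -gke. suffix in %q", s)
	}
	parts := strings.Split(core, ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("expected major.minor.patch in %q", s)
	}
	nums := make([]int, 0, 4)
	for _, p := range append(parts, build) {
		n, err := strconv.Atoi(p)
		if err != nil {
			return v, fmt.Errorf("non-numeric component %q in %q", p, s)
		}
		nums = append(nums, n)
	}
	return gkeVersion{nums[0], nums[1], nums[2], nums[3]}, nil
}

// atLeast reports whether v is the same as or newer than minimum,
// comparing major, minor, patch, and GKE build number in that order.
func (v gkeVersion) atLeast(minimum gkeVersion) bool {
	a := [4]int{v.major, v.minor, v.patch, v.build}
	b := [4]int{minimum.major, minimum.minor, minimum.patch, minimum.build}
	for i := range a {
		if a[i] != b[i] {
			return a[i] > b[i]
		}
	}
	return true
}

func main() {
	minimum, err := parseGKEVersion("1.32.1-gke.1729000")
	if err != nil {
		panic(err)
	}
	for _, s := range []string{"v1.32.1-gke.1729000", "1.31.5-gke.1000000", "1.33.0-gke.100"} {
		v, err := parseGKEVersion(s)
		if err != nil {
			fmt.Println(s, "error:", err)
			continue
		}
		fmt.Println(s, "meets minimum:", v.atLeast(minimum))
	}
}
```

The comparison is purely numeric and component-wise, so a newer minor version counts as "later" even when its GKE build number is smaller.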
2 changes: 1 addition & 1 deletion charts/datadog/Chart.yaml
@@ -1,7 +1,7 @@
---
apiVersion: v1
name: datadog
version: 3.99.0
version: 3.100.0
appVersion: "7"
description: Datadog Agent
keywords:
2 changes: 1 addition & 1 deletion charts/datadog/README.md
@@ -1,6 +1,6 @@
# Datadog

![Version: 3.99.0](https://img.shields.io/badge/Version-3.99.0-informational?style=flat-square) ![AppVersion: 7](https://img.shields.io/badge/AppVersion-7-informational?style=flat-square)
![Version: 3.100.0](https://img.shields.io/badge/Version-3.100.0-informational?style=flat-square) ![AppVersion: 7](https://img.shields.io/badge/AppVersion-7-informational?style=flat-square)

[Datadog](https://www.datadoghq.com/) is a hosted infrastructure monitoring platform. This chart adds the Datadog Agent to all nodes in your cluster via a DaemonSet. It also optionally depends on the [kube-state-metrics chart](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-state-metrics). For more information about monitoring Kubernetes with Datadog, please refer to the [Datadog documentation website](https://docs.datadoghq.com/agent/basic_agent_usage/kubernetes/).

2 changes: 2 additions & 0 deletions charts/datadog/ci/gke-autopilot-cri-less-values.yaml
@@ -2,6 +2,8 @@
datadog:
  apiKey: "00000000000000000000000000000000"
  appKey: "0000000000000000000000000000000000000000"
  envDict:
    DD_CI: true

  logs:
    enabled: true
2 changes: 2 additions & 0 deletions charts/datadog/ci/gke-autopilot-values.yaml
@@ -5,6 +5,8 @@ providers:
datadog:
  apiKey: "00000000000000000000000000000000"
  appKey: "0000000000000000000000000000000000000000"
  envDict:
    DD_CI: true

  logs:
    enabled: true
30 changes: 17 additions & 13 deletions charts/datadog/templates/NOTES.txt
@@ -337,7 +337,7 @@ You are using datadog.orchestratorExplorer.enabled but you disabled the cluster
To enable it please set clusterAgent.enabled to 'true'.
{{- end }}

{{- if .Values.providers.gke.autopilot}}
{{- if and (.Values.providers.gke.autopilot) (not .Values.datadog.envDict.DD_CI)}}

###########################################################################################
#### WARNING: Only one Datadog chart release allowed by namespace on GKE Autopilot ####
@@ -347,12 +347,12 @@ On GKE Autopilot, only one "datadog" Helm chart release is allowed by Kubernetes
* The serviceAccountName must be "datadog-agent".
* All ConfigMap names mounted must be hardcode.

{{- if eq (include "system-probe-feature" .) "true" }}
{{- if and (eq (include "system-probe-feature" .) "true") (eq (include "gke-autopilot-workloadallowlists-enabled" .) "false") }}

#####################################################################
#### WARNING: System Probe is not supported on GKE Autopilot ####
#####################################################################
{{- fail "On GKE Autopilot environments, System Probe is not supported. The option 'datadog.securityAgent.runtime.enabled', 'datadog.securityAgent.runtime.fimEnabled', 'datadog.networkMonitoring.enabled', 'datadog.systemProbe.enableTCPQueueLength', 'datadog.systemProbe.enableOOMKill', 'datadog.serviceMonitoring.enabled' and 'datadog.discovery.enabled' must be set 'false'" }}
##############################################################################################
#### WARNING: System Probe on GKE Autopilot requires GKE v1.32.1-gke.1729000 or later ####
##############################################################################################
{{- fail "System Probe on GKE Autopilot environments requires GKE v1.32.1-gke.1729000 or later. On earlier versions, the options 'datadog.securityAgent.runtime.enabled', 'datadog.securityAgent.runtime.fimEnabled', 'datadog.networkMonitoring.enabled', 'datadog.systemProbe.enableTCPQueueLength', 'datadog.systemProbe.enableOOMKill', 'datadog.serviceMonitoring.enabled' and 'datadog.discovery.enabled' must be set to 'false'" }}

{{- end }}

@@ -412,27 +412,31 @@ The option is overriden to avoid mounting volumes that are not allowed which wou

{{- end }}

{{- if .Values.datadog.networkMonitoring.enabled }}
{{- end }}

{{- if or .Values.providers.gke.autopilot .Values.providers.gke.gdc }}

{{- if or .Values.datadog.sbom.containerImage.enabled .Values.datadog.sbom.host.enabled }}

#######################################################################################
#### WARNING: Network Performance Monitoring is not supported on GKE Autopilot ####
#### WARNING: SBOM Monitoring is not supported on GKE Autopilot ####
#######################################################################################

{{- fail "On GKE Autopilot environments, Network Performance Monitoring is not supported. The option 'datadog.networkMonitoring.enabled' must be set to 'false'" }}
On GKE Autopilot environments, SBOM Monitoring is not supported. The options 'datadog.sbom.containerImage.enabled' and 'datadog.sbom.host.enabled' must be set to 'false'.

{{- end }}

{{- end }}

{{- if or .Values.providers.gke.autopilot .Values.providers.gke.gdc }}
{{- if .Values.providers.gke.gdc }}

{{- if or .Values.datadog.sbom.containerImage.enabled .Values.datadog.sbom.host.enabled }}
{{- if .Values.datadog.networkMonitoring.enabled }}

#######################################################################################
#### WARNING: SBOM Monitoring is not supported on GKE Autopilot ####
#### WARNING: Network Performance Monitoring is not supported on GKE GDC ####
#######################################################################################

On GKE Autopilot environments, SBOM Monitoring is not supported. The options 'datadog.sbom.containerImage.enabled' and 'datadog.sbom.host.enabled' must be set to 'false'.
{{- fail "On GKE GDC environments, Network Performance Monitoring is not supported. The option 'datadog.networkMonitoring.enabled' must be set to 'false'" }}

{{- end }}

Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
{{- define "linux-container-host-release-volumemounts" -}}
{{- if or .Values.datadog.osReleasePath .Values.datadog.systemProbe.osReleasePath }}
{{- if and (not .Values.providers.gke.gdc) (not .Values.providers.gke.autopilot) }}
{{- if eq (include "should-add-host-path-for-os-release-file" .) "true" }}
{{- if eq (include "should-enable-system-probe" .) "true" }}
- name: os-release-file
mountPath: /host{{ .Values.datadog.systemProbe.osReleasePath | default .Values.datadog.osReleasePath }}
@@ -12,4 +11,3 @@
{{- end }}
{{- end }}
{{- end }}
{{- end }}
4 changes: 3 additions & 1 deletion charts/datadog/templates/_container-system-probe.yaml
@@ -3,7 +3,7 @@
image: "{{ include "image-path" (dict "root" .Values "image" .Values.agents.image) }}"
imagePullPolicy: {{ .Values.agents.image.pullPolicy }}
{{ include "generate-security-context" (dict "securityContext" .Values.agents.containers.systemProbe.securityContext "targetSystem" .Values.targetSystem "seccomp" .Values.datadog.systemProbe.seccomp "kubeversion" .Capabilities.KubeVersion.Version) | indent 2 }}
command: ["/opt/datadog-agent/embedded/bin/system-probe", "--config=/etc/datadog-agent/system-probe.yaml"]
command: ["system-probe", "--config=/etc/datadog-agent/system-probe.yaml"]
{{- if .Values.agents.containers.systemProbe.ports }}
ports:
{{ toYaml .Values.agents.containers.systemProbe.ports | indent 2 }}
@@ -30,9 +30,11 @@
resources:
{{ toYaml .Values.agents.containers.systemProbe.resources | indent 4 }}
volumeMounts:
{{- if (not .Values.providers.gke.autopilot) }}
- name: auth-token
mountPath: {{ template "datadog.confPath" . }}/auth
readOnly: true
{{- end }}
- name: logdatadog
mountPath: {{ template "datadog.logDirectoryPath" . }}
readOnly: false # Need RW to write logs
4 changes: 0 additions & 4 deletions charts/datadog/templates/_containers-init-linux.yaml
@@ -1,8 +1,6 @@
{{- define "containers-init-linux" -}}
- name: init-volume
{{- if not .Values.providers.gke.autopilot }}
{{- include "generate-security-context" (dict "securityContext" .Values.agents.containers.initContainers.securityContext "targetSystem" .Values.targetSystem "seccomp" "" "kubeversion" .Capabilities.KubeVersion.Version) | indent 2 }}
{{- end }}
image: "{{ include "image-path" (dict "root" .Values "image" .Values.agents.image) }}"
imagePullPolicy: {{ .Values.agents.image.pullPolicy }}
command: ["bash", "-c"]
@@ -15,9 +13,7 @@
resources:
{{ toYaml .Values.agents.containers.initContainers.resources | indent 4 }}
- name: init-config
{{- if not .Values.providers.gke.autopilot }}
{{- include "generate-security-context" (dict "securityContext" .Values.agents.containers.initContainers.securityContext "targetSystem" .Values.targetSystem "seccomp" "" "kubeversion" .Capabilities.KubeVersion.Version) | indent 2 }}
{{- end }}
image: "{{ include "image-path" (dict "root" .Values "image" .Values.agents.image) }}"
imagePullPolicy: {{ .Values.agents.image.pullPolicy }}
command:
2 changes: 1 addition & 1 deletion charts/datadog/templates/_daemonset-volumes-linux.yaml
@@ -17,7 +17,7 @@
- hostPath:
path: /sys/fs/cgroup
name: cgroups
{{- if and (not .Values.providers.gke.autopilot) (or .Values.datadog.systemProbe.osReleasePath .Values.datadog.osReleasePath .Values.datadog.sbom.host.enabled) }}
{{- if eq (include "should-add-host-path-for-os-release-file" .) "true"}}
- hostPath:
path: {{ .Values.datadog.systemProbe.osReleasePath | default .Values.datadog.osReleasePath }}
name: os-release-file
37 changes: 32 additions & 5 deletions charts/datadog/templates/_helpers.tpl
@@ -53,6 +53,7 @@ false
Check if target cluster is running GKE Autopilot.
*/}}
{{- define "is-autopilot" -}}
{{- if .Values.providers.gke.autopilot -}}
{{- $nodes := (lookup "v1" "Node" "" "").items }}
{{- if and $nodes (gt (len $nodes) 0) -}}
{{- $node := index $nodes 0 -}}
@@ -64,6 +65,9 @@ false
{{- else -}}
false
{{- end -}}
{{- else -}}
false
{{- end -}}
{{- end -}}

{{/*
@@ -374,7 +378,7 @@ false
Return true if the system-probe container should be created.
*/}}
{{- define "should-enable-system-probe" -}}
{{- if and (not (or .Values.providers.gke.autopilot .Values.providers.gke.gdc )) (eq (include "system-probe-feature" .) "true") (eq .Values.targetSystem "linux") -}}
{{- if or (and (eq (include "system-probe-feature" .) "true") (eq .Values.targetSystem "linux") (not .Values.providers.gke.gdc)) (eq (include "gke-autopilot-workloadallowlists-enabled" . ) "true") -}}
true
{{- else -}}
false
@@ -419,7 +423,8 @@ false
Return true if the security-agent container should be created.
*/}}
{{- define "should-enable-security-agent" -}}
{{- if and (not (or .Values.providers.gke.autopilot .Values.providers.gke.gdc )) (eq .Values.targetSystem "linux") (eq (include "security-agent-feature" .) "true") -}}
{{- if and (not .Values.providers.gke.gdc) (eq .Values.targetSystem "linux") (eq (include "security-agent-feature" .) "true") -}}
true
{{- else -}}
false
@@ -441,7 +446,7 @@ false
Return true if the runtime security features should be enabled.
*/}}
{{- define "should-enable-runtime-security" -}}
{{- if and (not (or .Values.providers.gke.autopilot .Values.providers.gke.gdc)) (or .Values.datadog.securityAgent.runtime.enabled .Values.datadog.securityAgent.runtime.fimEnabled) -}}
{{- if and (not .Values.providers.gke.gdc) (or .Values.datadog.securityAgent.runtime.enabled .Values.datadog.securityAgent.runtime.fimEnabled) -}}
true
{{- else -}}
false
@@ -1028,7 +1033,6 @@ Create RBACs for custom resources
false
{{- end -}}
{{- end -}}

{{/*
Return true if any process-related check is enabled
*/}}
@@ -1058,7 +1062,7 @@ Create RBACs for custom resources
Returns true if process-related checks should run on the core agent.
*/}}
{{- define "should-run-process-checks-on-core-agent" -}}
{{- if or .Values.providers.gke.gdc .Values.providers.gke.autopilot -}}
{{- if or (.Values.providers.gke.gdc) (and (.Values.providers.gke.autopilot) (not (eq (include "gke-autopilot-workloadallowlists-enabled" .) "true"))) -}}
false
{{- else if ne .Values.targetSystem "linux" -}}
false
@@ -1099,13 +1103,36 @@ Create RBACs for custom resources
{{- end -}}
{{- end -}}

{{/*
Returns true if Host path for os-release-file needs to be added to the volumes.
*/}}
{{- define "should-add-host-path-for-os-release-file" -}}
{{- if .Values.providers.gke.gdc -}}
false
{{- else if or .Values.datadog.systemProbe.osReleasePath .Values.datadog.osReleasePath .Values.datadog.sbom.host.enabled -}}
{{- if .Values.providers.gke.autopilot -}}
{{- if eq (include "gke-autopilot-workloadallowlists-enabled" .) "true" -}}
true
{{- else -}}
false
{{- end -}}
{{- else -}}
true
{{- end -}}
{{- else -}}
false
{{- end -}}
{{- end -}}

{{/*
Returns true if Host paths for default OS Release Paths need to be added to the volumes.
*/}}
{{- define "should-add-host-path-for-os-release-paths" -}}
{{- if ne .Values.targetSystem "linux" -}}
false
{{- else if .Values.providers.gke.autopilot -}}
false
{{- else if .Values.providers.talos.enabled -}}
false
{{- else if (and .Values.datadog.systemProbe.enableDefaultOsReleasePaths (not .Values.datadog.disableDefaultOsReleasePaths)) -}}
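
The nested conditions in `should-enable-system-probe` and `should-add-host-path-for-os-release-file` are easier to follow as plain boolean functions. The Go sketch below restates the new conditions from this diff under a few assumptions: the `chartInputs` struct and its field names are hypothetical, each field stands in for a chart value or for the string result of an `include`d helper, and GDC is treated as always yielding `false` for the os-release host path.

```go
package main

import "fmt"

// chartInputs mirrors the values and helper results the templates consult.
// The struct and its field names are illustrative, not part of the chart.
type chartInputs struct {
	GDC                 bool // .Values.providers.gke.gdc
	Autopilot           bool // .Values.providers.gke.autopilot
	TargetLinux         bool // .Values.targetSystem == "linux"
	SystemProbeFeature  bool // "system-probe-feature" helper returned "true"
	AllowlistsEnabled   bool // "gke-autopilot-workloadallowlists-enabled" helper returned "true"
	OSReleaseConfigured bool // any of systemProbe.osReleasePath, osReleasePath, sbom.host.enabled
}

// shouldEnableSystemProbe restates the new "should-enable-system-probe" condition.
func shouldEnableSystemProbe(in chartInputs) bool {
	return (in.SystemProbeFeature && in.TargetLinux && !in.GDC) || in.AllowlistsEnabled
}

// shouldAddHostPathForOSReleaseFile restates "should-add-host-path-for-os-release-file".
func shouldAddHostPathForOSReleaseFile(in chartInputs) bool {
	if in.GDC || !in.OSReleaseConfigured {
		return false
	}
	if in.Autopilot {
		// On Autopilot the host path is only mounted when allowlists are available.
		return in.AllowlistsEnabled
	}
	return true
}

func main() {
	cases := map[string]chartInputs{
		"standard cluster":         {TargetLinux: true, SystemProbeFeature: true, OSReleaseConfigured: true},
		"autopilot, no allowlists": {Autopilot: true, TargetLinux: true, SystemProbeFeature: true, OSReleaseConfigured: true},
		"autopilot, allowlists":    {Autopilot: true, TargetLinux: true, SystemProbeFeature: true, AllowlistsEnabled: true, OSReleaseConfigured: true},
		"gdc":                      {GDC: true, TargetLinux: true, SystemProbeFeature: true, OSReleaseConfigured: true},
	}
	for name, in := range cases {
		fmt.Printf("%s: system-probe=%v os-release-hostpath=%v\n",
			name, shouldEnableSystemProbe(in), shouldAddHostPathForOSReleaseFile(in))
	}
}
```

Read this way, the `or` in `should-enable-system-probe` means the allowlist helper alone is enough to create the container as the condition is written, independent of the system-probe feature flags.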
2 changes: 0 additions & 2 deletions charts/datadog/templates/_system-probe-init.yaml
@@ -1,8 +1,6 @@
{{- define "system-probe-init" -}}
- name: seccomp-setup
{{- if not .Values.providers.gke.autopilot }}
{{ include "generate-security-context" (dict "securityContext" .Values.agents.containers.initContainers.securityContext "targetSystem" .Values.targetSystem "seccomp" "" "kubeversion" .Capabilities.KubeVersion.Version) | indent 2 }}
{{- end }}
image: "{{ include "image-path" (dict "root" .Values "image" .Values.agents.image) }}"
imagePullPolicy: {{ .Values.agents.image.pullPolicy }}
command:
2 changes: 1 addition & 1 deletion charts/datadog/templates/daemonset.yaml
@@ -55,7 +55,7 @@ spec:
checksum/agent-config: {{ tpl (toYaml .Values.agents.customAgentConfig) . | sha256sum }}
{{- end }}
{{- if eq (include "should-enable-system-probe" .) "true" }}
{{- if .Values.agents.podSecurity.apparmor.enabled }}
{{- if and (.Values.agents.podSecurity.apparmor.enabled) }}
container.apparmor.security.beta.kubernetes.io/system-probe: {{ .Values.datadog.systemProbe.apparmor }}
{{- end }}
{{- if semverCompare "<1.19.0" .Capabilities.KubeVersion.Version }}
Original file line number Diff line number Diff line change
@@ -3,6 +3,8 @@ apiVersion: auto.gke.io/v1
kind: AllowlistSynchronizer
metadata:
name: datadog-synchronizer
annotations:
helm.sh/hook: "pre-install,pre-upgrade"
spec:
allowlistPaths:
- Datadog/datadog/datadog-datadog-daemonset-exemption-v1.0.1.yaml
1 change: 1 addition & 0 deletions test/datadog/autopilot_test.go
@@ -34,6 +34,7 @@ func Test_autopilotConfigs(t *testing.T) {
ShowOnly: []string{"templates/daemonset.yaml"},
Values: []string{"../../charts/datadog/values.yaml"},
Overrides: map[string]string{
"DD_CI": "true",
"datadog.apiKeyExistingSecret": "datadog-secret",
"datadog.appKeyExistingSecret": "datadog-secret",
"providers.gke.autopilot": "true",