[datadog] Refactor liveness and readiness probes (#69)
* [datadog] Refactor liveness and readiness probes

* Address review feedback
xornivore authored Oct 26, 2020
1 parent d83db4d commit 46e0517
Showing 10 changed files with 84 additions and 26 deletions.
7 changes: 7 additions & 0 deletions charts/datadog/CHANGELOG.md
@@ -1,5 +1,12 @@
# Datadog changelog

## 2.4.30

* Refactor liveness and readiness probes with helpers to allow user overrides with other types of probes or disabling
probes entirely.
* Introduce `clusterChecksRunner.healthPort` default setting.
* Use health port defaults instead of hardcoded values.

## 2.4.29

* Add `common-env-vars` to `system-probe` container
2 changes: 1 addition & 1 deletion charts/datadog/Chart.yaml
@@ -1,6 +1,6 @@
apiVersion: v1
name: datadog
version: 2.4.29
version: 2.4.30
appVersion: "7"
description: Datadog Agent
keywords:
3 changes: 2 additions & 1 deletion charts/datadog/README.md
@@ -1,6 +1,6 @@
# Datadog

![Version: 2.4.29](https://img.shields.io/badge/Version-2.4.29-informational?style=flat-square) ![AppVersion: 7](https://img.shields.io/badge/AppVersion-7-informational?style=flat-square)
![Version: 2.4.30](https://img.shields.io/badge/Version-2.4.30-informational?style=flat-square) ![AppVersion: 7](https://img.shields.io/badge/AppVersion-7-informational?style=flat-square)

[Datadog](https://www.datadoghq.com/) is a hosted infrastructure monitoring platform. This chart adds the Datadog Agent to all nodes in your cluster via a DaemonSet. It also optionally depends on the [kube-state-metrics chart](https://github.com/kubernetes/charts/tree/master/stable/kube-state-metrics). For more information about monitoring Kubernetes with Datadog, please refer to the [Datadog documentation website](https://docs.datadoghq.com/agent/basic_agent_usage/kubernetes/).

@@ -402,6 +402,7 @@ helm install --name <RELEASE_NAME> \
| clusterChecksRunner.dnsConfig | object | `{}` | specify dns configuration options for datadog cluster agent containers e.g ndots |
| clusterChecksRunner.enabled | bool | `false` | If true, deploys agent dedicated for running the Cluster Checks instead of running in the Daemonset's agents. |
| clusterChecksRunner.env | list | `[]` | Environment variables specific to Cluster Checks Runner |
| clusterChecksRunner.healthPort | int | `5555` | Port number to use in the Cluster Checks Runner for the healthz endpoint |
| clusterChecksRunner.image.pullPolicy | string | `"IfNotPresent"` | Datadog Agent image pull policy |
| clusterChecksRunner.image.pullSecrets | list | `[]` | Datadog Agent repository pullSecret (ex: specify docker registry credentials) |
| clusterChecksRunner.image.repository | string | `"datadog/agent"` | Datadog Agent image repository to use |
23 changes: 23 additions & 0 deletions charts/datadog/templates/NOTES.txt
@@ -93,6 +93,29 @@ Cluster Agent liveness probe port ({{ $liveness.port }}) is different from the c
Cluster Agent readiness probe port ({{ $readiness.port }}) is different from the configured health port ({{ $healthPort }}).
{{- end }}
{{- end }}
{{- if and .Values.datadog.clusterChecks.enabled .Values.clusterChecksRunner.enabled }}
{{- $healthPort := .Values.clusterChecksRunner.healthPort }}
{{- with $liveness := .Values.clusterChecksRunner.livenessProbe.httpGet }}
{{- if and $liveness.port (ne $healthPort $liveness.port) }}

#####################################################################################
#### ERROR: Cluster Checks Runner liveness probe misconfiguration ####
#####################################################################################

Cluster Checks Runner liveness probe port ({{ $liveness.port }}) is different from the configured health port ({{ $healthPort }}).
{{- end }}
{{- end }}
{{- with $readiness := .Values.clusterChecksRunner.readinessProbe.httpGet }}
{{- if and $readiness.port (ne $healthPort $readiness.port) }}

#####################################################################################
#### ERROR: Cluster Checks Runner readiness probe misconfiguration ####
#####################################################################################

Cluster Checks Runner readiness probe port ({{ $readiness.port }}) is different from the configured health port ({{ $healthPort }}).
{{- end }}
{{- end }}
{{- end }}
{{- end }}
{{- if .Values.datadog.apm.enabled }}
{{- $apmPort := .Values.datadog.apm.port }}
27 changes: 27 additions & 0 deletions charts/datadog/templates/_helpers.tpl
@@ -150,3 +150,30 @@ true
false
{{- end -}}
{{- end -}}

{{/*
Returns probe definition based on user settings and default HTTP port.
Accepts a map with `port` (default port), `path` (probe handler URI) and `settings` (probe settings).
*/}}
{{- define "probe.http" -}}
{{- if or .settings.httpGet .settings.tcpSocket .settings.exec -}}
{{ toYaml .settings }}
{{- else -}}
{{- $handler := dict "httpGet" (dict "port" .port "path" .path "scheme" "HTTP") -}}
{{ toYaml (merge $handler .settings) }}
{{- end -}}
{{- end -}}

{{/*
Returns probe definition based on user settings and default TCP socket port.
Accepts a map with `port` (default port) and `settings` (probe settings).
*/}}
{{- define "probe.tcp" -}}
{{- if or .settings.httpGet .settings.tcpSocket .settings.exec -}}
{{ toYaml .settings }}
{{- else -}}
{{- $handler := dict "tcpSocket" (dict "port" .port) -}}
{{- toYaml (merge $handler .settings) -}}
{{- end -}}
{{- end -}}
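
As a rough illustration of how the two helpers above behave, the values below sketch both branches for the Cluster Checks Runner. The rendered output shown in the comments is an approximation inferred from the template logic, not captured from `helm template`, so key order and indentation may differ.

```yaml
# Case 1: the probe settings contain no handler (httpGet, tcpSocket or exec),
# so "probe.http" merges in a default httpGet handler built from the health port.
clusterChecksRunner:
  healthPort: 5555
  livenessProbe:
    initialDelaySeconds: 15
    periodSeconds: 15
    timeoutSeconds: 5
# Expected to render roughly as:
#   livenessProbe:
#     httpGet:
#       path: /live
#       port: 5555
#       scheme: HTTP
#     initialDelaySeconds: 15
#     periodSeconds: 15
#     timeoutSeconds: 5
---
# Case 2: the user supplies a handler, so the helper emits the settings as-is
# and adds no httpGet block. This mirrors the "/bin/true" suggestion in the
# values.yaml comments for effectively disabling a probe.
clusterChecksRunner:
  livenessProbe:
    exec:
      command: ["/bin/true"]
```

Because Helm merges user values over the chart defaults, timing fields such as `periodSeconds` from the chart's `values.yaml` would presumably still appear alongside the `exec` handler in the second case.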

9 changes: 6 additions & 3 deletions charts/datadog/templates/agent-clusterchecks-deployment.yaml
@@ -101,7 +101,8 @@ spec:
- name: DD_EXTRA_CONFIG_PROVIDERS
value: "clusterchecks"
- name: DD_HEALTH_PORT
value: "5555"
{{- $healthPort := .Values.clusterChecksRunner.healthPort }}
value: {{ $healthPort | quote }}
# Cluster checks
- name: DD_CLUSTER_AGENT_KUBERNETES_SERVICE_NAME
value: {{ template "datadog.fullname" . }}-cluster-agent
@@ -155,9 +156,11 @@ spec:
{{ toYaml .Values.clusterChecksRunner.volumeMounts | indent 10 }}
{{- end }}
livenessProbe:
{{ toYaml .Values.clusterChecksRunner.livenessProbe | indent 10 }}
{{- $live := .Values.clusterChecksRunner.livenessProbe }}
{{ include "probe.http" (dict "settings" $live "path" "/live" "port" $healthPort) | indent 10 }}
readinessProbe:
{{ toYaml .Values.clusterChecksRunner.readinessProbe | indent 10 }}
{{- $ready := .Values.clusterChecksRunner.readinessProbe }}
{{ include "probe.http" (dict "settings" $ready "path" "/ready" "port" $healthPort) | indent 10 }}
volumes:
- name: installinfo
configMap:
13 changes: 7 additions & 6 deletions charts/datadog/templates/cluster-agent-deployment.yaml
@@ -103,7 +103,8 @@ spec:
{{- end }}
env:
- name: DD_HEALTH_PORT
value: {{ .Values.clusterAgent.healthPort | quote }}
{{- $healthPort := .Values.clusterAgent.healthPort }}
value: {{ $healthPort | quote }}
- name: DD_API_KEY
valueFrom:
secretKeyRef:
@@ -200,12 +201,12 @@ spec:
{{ toYaml .Values.clusterAgent.env | indent 10 }}
{{- end }}
livenessProbe:
{{ $defaultLive := dict "httpGet" (dict "port" .Values.clusterAgent.healthPort "path" "/live" "scheme" "HTTP") }}
{{ toYaml (mergeOverwrite $defaultLive .Values.clusterAgent.livenessProbe) | indent 10 }}
{{- $live := .Values.clusterAgent.livenessProbe }}
{{ include "probe.http" (dict "path" "/live" "port" $healthPort "settings" $live) | indent 10 }}
readinessProbe:
{{ $defaultReady := dict "httpGet" (dict "port" .Values.clusterAgent.healthPort "path" "/ready" "scheme" "HTTP") }}
{{ toYaml (mergeOverwrite $defaultReady .Values.clusterAgent.readinessProbe) | indent 10 }}
volumeMounts:
{{- $ready := .Values.clusterAgent.readinessProbe }}
{{ include "probe.http" (dict "path" "/ready" "port" $healthPort "settings" $ready) | indent 10 }}
volumeMounts:
- name: installinfo
subPath: install_info
{{- if eq .Values.targetSystem "windows" }}
13 changes: 6 additions & 7 deletions charts/datadog/templates/container-agent.yaml
@@ -68,10 +68,9 @@
value: {{ (default false (or .Values.datadog.logs.containerCollectAll .Values.datadog.logsConfigContainerCollectAll)) | quote}}
- name: DD_LOGS_CONFIG_K8S_CONTAINER_USE_FILE
value: {{ .Values.datadog.logs.containerCollectUsingFiles | quote }}
{{- if not .Values.datadog.livenessProbe }}
- name: DD_HEALTH_PORT
value: "5555"
{{- end }}
{{- $healthPort := .Values.agents.containers.agent.healthPort }}
value: {{ $healthPort | quote }}
{{- if .Values.datadog.dogstatsd.useSocketVolume }}
- name: DD_DOGSTATSD_SOCKET
value: {{ .Values.datadog.dogstatsd.socketPath | quote }}
@@ -165,9 +164,9 @@
{{ toYaml .Values.agents.volumeMounts | indent 4 }}
{{- end }}
livenessProbe:
{{ $defaultLive := dict "httpGet" (dict "port" .Values.agents.containers.agent.healthPort "path" "/live" "scheme" "HTTP") }}
{{ toYaml (mergeOverwrite $defaultLive .Values.agents.containers.agent.livenessProbe) | indent 4 }}
{{- $live := .Values.agents.containers.agent.livenessProbe }}
{{ include "probe.http" (dict "path" "/live" "port" $healthPort "settings" $live) | indent 4 }}
readinessProbe:
{{ $defaultReady := dict "httpGet" (dict "port" .Values.agents.containers.agent.healthPort "path" "/ready" "scheme" "HTTP") }}
{{ toYaml (mergeOverwrite $defaultReady .Values.agents.containers.agent.readinessProbe) | indent 4 }}
{{- $ready := .Values.agents.containers.agent.readinessProbe }}
{{ include "probe.http" (dict "path" "/ready" "port" $healthPort "settings" $ready) | indent 4 }}
{{- end -}}
4 changes: 2 additions & 2 deletions charts/datadog/templates/container-trace-agent.yaml
@@ -59,6 +59,6 @@
mountPath: {{ (dir .Values.datadog.apm.socketPath) }}
{{- end }}
livenessProbe:
{{ $defaultLive := dict "tcpSocket" (dict "port" .Values.datadog.apm.port) }}
{{ toYaml (mergeOverwrite $defaultLive .Values.agents.containers.traceAgent.livenessProbe) | indent 4 }}
{{- $live := .Values.agents.containers.traceAgent.livenessProbe }}
{{ include "probe.tcp" (dict "port" .Values.datadog.apm.port "settings" $live ) | indent 4 }}
{{- end -}}
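
The trace agent change above routes its liveness probe through `probe.tcp`, with `datadog.apm.port` as the default socket port. A minimal sketch of values that would exercise the default branch follows. The timing numbers are illustrative assumptions, not the chart's defaults, and the rendered form in the comments is inferred from the helper rather than captured from a real render.

```yaml
datadog:
  apm:
    enabled: true
    port: 8126            # port the trace agent listens on
agents:
  containers:
    traceAgent:
      livenessProbe:      # no handler given, so a tcpSocket handler is added
        initialDelaySeconds: 15   # illustrative value
        periodSeconds: 15         # illustrative value
        timeoutSeconds: 5         # illustrative value
# Expected to render roughly as:
#   livenessProbe:
#     initialDelaySeconds: 15
#     periodSeconds: 15
#     tcpSocket:
#       port: 8126
#     timeoutSeconds: 5
```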
9 changes: 3 additions & 6 deletions charts/datadog/values.yaml
@@ -901,6 +901,9 @@ clusterChecksRunner:
#
tolerations: []

# clusterChecksRunner.healthPort -- Port number to use in the Cluster Checks Runner for the healthz endpoint
healthPort: 5555

# clusterChecksRunner.livenessProbe -- Override default agent liveness probe settings
# @default -- Every 15s / 6 KO / 1 OK
## In case of issues with the probe, you can disable it with the
@@ -911,9 +914,6 @@ clusterChecksRunner:
# command: ["/bin/true"]
#
livenessProbe:
httpGet:
path: /live
port: 5555
initialDelaySeconds: 15
periodSeconds: 15
timeoutSeconds: 5
@@ -930,9 +930,6 @@ clusterChecksRunner:
# command: ["/bin/true"]
#
readinessProbe:
httpGet:
path: /ready
port: 5555
initialDelaySeconds: 15
periodSeconds: 15
timeoutSeconds: 5
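
With the hardcoded `httpGet` blocks removed from the defaults, changing the health port should only require setting `clusterChecksRunner.healthPort`. The sketch below assumes a hypothetical port of `6666`; the commented-out block shows the kind of override that the new NOTES.txt check is written to flag.

```yaml
datadog:
  clusterChecks:
    enabled: true
clusterChecksRunner:
  enabled: true
  healthPort: 6666        # hypothetical non-default port
  # No httpGet block is needed: DD_HEALTH_PORT and the default /live and
  # /ready probes are all derived from healthPort.
  #
  # Pinning a different port here would leave the probe pointing away from
  # the health endpoint, which is the mismatch NOTES.txt reports:
  # readinessProbe:
  #   httpGet:
  #     path: /ready
  #     port: 5555
```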