[datadog] Show error notices about port discrepancies (#64)
* [datadog] Show error notices about port discrepancies

- Show notices for APM and Cluster Agent.
- Remove explicitly set default port values from the liveness and readiness probe values to avoid possible misconfiguration.
- Use mergeOverwrite to merge default health ports from `datadog.apm.port` and `clusterAgent.healthPort`

* Flag node agent port discrepancy as well
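
For users, a minimal sketch of what this change allows, assuming the templates in this commit and no other overrides: a values file changes each health port in one place and leaves the probe ports unset, so the probes inherit them. The file name and the port numbers 6666, 5556 and 8127 are arbitrary illustrations, not chart defaults.

```yaml
# Hypothetical user values file (e.g. my-values.yaml); port numbers are examples only.
clusterAgent:
  enabled: true
  healthPort: 6666
  livenessProbe:
    # No httpGet.port here: the template fills it in from clusterAgent.healthPort.
    initialDelaySeconds: 30

agents:
  containers:
    agent:
      # Node agent probes are left untouched, so they default to this port.
      healthPort: 5556

datadog:
  apm:
    enabled: true
    # The trace-agent liveness probe defaults to a tcpSocket check on this port.
    port: 8127
```

Because no probe port is set explicitly, none of the new notices in `NOTES.txt` should be printed for these values.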
xornivore authored Oct 21, 2020
1 parent 87cd763 commit 3015738
Showing 8 changed files with 80 additions and 31 deletions.
4 changes: 4 additions & 0 deletions charts/datadog/CHANGELOG.md
@@ -1,5 +1,9 @@
# Datadog changelog

## 2.4.27

* Remove port defaults from liveness/readiness probes and show error notices on misconfiguration if user overrides are supplying custom node settings.

## 2.4.26

* Revert to Helm2 hash in `requirements.yaml` to retain compatibility with Helm 2
2 changes: 1 addition & 1 deletion charts/datadog/Chart.yaml
@@ -1,6 +1,6 @@
apiVersion: v1
name: datadog
version: 2.4.26
version: 2.4.27
appVersion: "7"
description: Datadog Agent
keywords:
9 changes: 5 additions & 4 deletions charts/datadog/README.md
@@ -1,6 +1,6 @@
# Datadog

![Version: 2.4.26](https://img.shields.io/badge/Version-2.4.26-informational?style=flat-square) ![AppVersion: 7](https://img.shields.io/badge/AppVersion-7-informational?style=flat-square)
![Version: 2.4.27](https://img.shields.io/badge/Version-2.4.27-informational?style=flat-square) ![AppVersion: 7](https://img.shields.io/badge/AppVersion-7-informational?style=flat-square)

[Datadog](https://www.datadoghq.com/) is a hosted infrastructure monitoring platform. This chart adds the Datadog Agent to all nodes in your cluster via a DaemonSet. It also optionally depends on the [kube-state-metrics chart](https://github.com/kubernetes/charts/tree/master/stable/kube-state-metrics). For more information about monitoring Kubernetes with Datadog, please refer to the [Datadog documentation website](https://docs.datadoghq.com/agent/basic_agent_usage/kubernetes/).

@@ -310,6 +310,7 @@ helm install --name <RELEASE_NAME> \
|-----|------|---------|-------------|
| agents.affinity | object | `{}` | Allow the DaemonSet to schedule using affinity rules |
| agents.containers.agent.env | list | `[]` | Additional environment variables for the agent container |
| agents.containers.agent.healthPort | int | `5555` | Port number to use in the node agent for the healthz endpoint |
| agents.containers.agent.livenessProbe | object | Every 15s / 6 KO / 1 OK | Override default agent liveness probe settings |
| agents.containers.agent.logLevel | string | `nil` | Set logging verbosity, valid log levels are: trace, debug, info, warn, error, critical, and off |
| agents.containers.agent.readinessProbe | object | Every 15s / 6 KO / 1 OK | Override default agent readiness probe settings |
@@ -368,12 +369,12 @@ helm install --name <RELEASE_NAME> \
| clusterAgent.dnsConfig | object | `{}` | Specify dns configuration options for datadog cluster agent containers e.g ndots |
| clusterAgent.enabled | bool | `false` | Set this to true to enable Datadog Cluster Agent |
| clusterAgent.env | list | `[]` | Set environment variables specific to Cluster Agent |
| clusterAgent.healthPort | int | `5555` | Port number use the cluster-agent to server healthz endpoint |
| clusterAgent.healthPort | int | `5555` | Port number to use in the Cluster Agent for the healthz endpoint |
| clusterAgent.image.pullPolicy | string | `"IfNotPresent"` | Cluster Agent image pullPolicy |
| clusterAgent.image.pullSecrets | list | `[]` | Cluster Agent repository pullSecret (ex: specify docker registry credentials) |
| clusterAgent.image.repository | string | `"datadog/cluster-agent"` | Cluster Agent image repository to use |
| clusterAgent.image.tag | string | `"1.9.0"` | Cluster Agent image tag to use |
| clusterAgent.livenessProbe | object | Every 15s / 6 KO / 1 OK | Override default agent liveness probe settings |
| clusterAgent.livenessProbe | object | Every 15s / 6 KO / 1 OK | Override default Cluster Agent liveness probe settings |
| clusterAgent.metricsProvider.aggregator | string | `"avg"` | Define the aggregator the cluster agent will use to process the metrics. The options are (avg, min, max, sum) |
| clusterAgent.metricsProvider.createReaderRbac | bool | `true` | Create `external-metrics-reader` RBAC automatically (to allow HPA to read data from Cluster Agent) |
| clusterAgent.metricsProvider.enabled | bool | `false` | Set this to true to enable Metrics Provider |
@@ -387,7 +388,7 @@ helm install --name <RELEASE_NAME> \
| clusterAgent.priorityClassName | string | `nil` | Name of the priorityClass to apply to the Cluster Agent |
| clusterAgent.rbac.create | bool | `true` | If true, create & use RBAC resources |
| clusterAgent.rbac.serviceAccountName | string | `"default"` | Specify service account name to use (usually pre-existing, created if create is true) |
| clusterAgent.readinessProbe | object | Every 15s / 6 KO / 1 OK | Override default cluster-agent readiness probe settings |
| clusterAgent.readinessProbe | object | Every 15s / 6 KO / 1 OK | Override default Cluster Agent readiness probe settings |
| clusterAgent.replicas | int | `1` | Specify the of cluster agent replicas, if > 1 it allow the cluster agent to work in HA mode. |
| clusterAgent.resources | object | `{}` | Datadog cluster-agent resource requests and limits. |
| clusterAgent.strategy | object | `{"rollingUpdate":{"maxSurge":1,"maxUnavailable":0},"type":"RollingUpdate"}` | Allow the Cluster Agent deployment to perform a rolling update on helm update |
56 changes: 54 additions & 2 deletions charts/datadog/templates/NOTES.txt
@@ -34,6 +34,27 @@ Then run:
--set datadog.apiKey=YOUR-KEY-HERE stable/datadog
{{- end }}

{{- $healthPort := .Values.agents.containers.agent.healthPort }}
{{- with $liveness := .Values.agents.containers.agent.livenessProbe.httpGet }}
{{- if and $liveness.port (ne $healthPort $liveness.port) }}

##############################################################################
#### ERROR: Node Agent liveness probe misconfiguration ####
##############################################################################

Node Agent liveness probe port ({{ $liveness.port }}) is different from the configured health port ({{ $healthPort }}).
{{- end }}
{{- end }}
{{- with $readiness := .Values.agents.containers.agent.readinessProbe.httpGet }}
{{- if and $readiness.port (ne $healthPort $readiness.port) }}

##############################################################################
#### ERROR: Node Agent readiness probe misconfiguration ####
##############################################################################

Node Agent readiness probe port ({{ $readiness.port }}) is different from the configured health port ({{ $healthPort }}).
{{- end }}
{{- end }}
{{- if .Values.clusterAgent.enabled }}

{{- if .Values.clusterAgent.metricsProvider.enabled }}
@@ -51,11 +72,42 @@ This deployment will be incomplete until you get your APP key from Datadog.
Create an application key at https://app.datadoghq.com/account/settings#api
{{- end }}
{{- end }}
{{- $healthPort := .Values.clusterAgent.healthPort }}
{{- with $liveness := .Values.clusterAgent.livenessProbe.httpGet }}
{{- if and $liveness.port (ne $healthPort $liveness.port) }}

{{- end }}
##############################################################################
#### ERROR: Cluster Agent liveness probe misconfiguration ####
##############################################################################

Cluster Agent liveness probe port ({{ $liveness.port }}) is different from the configured health port ({{ $healthPort }}).
{{- end }}
{{- end }}
{{- with $readiness := .Values.clusterAgent.readinessProbe.httpGet }}
{{- if and $readiness.port (ne $healthPort $readiness.port) }}

##############################################################################
#### ERROR: Cluster Agent readiness probe misconfiguration ####
##############################################################################

Cluster Agent readiness probe port ({{ $readiness.port }}) is different from the configured health port ({{ $healthPort }}).
{{- end }}
{{- end }}
{{- end }}
{{- if .Values.datadog.apm.enabled }}
The Datadog Agent is listening on port {{ .Values.datadog.apm.port }} for APM service.
{{- $apmPort := .Values.datadog.apm.port }}
{{- with $liveness := .Values.agents.containers.traceAgent.livenessProbe.tcpSocket }}
{{- if and $liveness.port (ne $apmPort $liveness.port) }}

##############################################################################
#### ERROR: Trace Agent liveness probe misconfiguration ####
##############################################################################

Trace Agent liveness probe port ({{ $liveness.port }}) is different from the configured APM port ({{ $apmPort }}).
{{- end }}
{{- end }}

The Datadog Agent is listening on port {{ $apmPort }} for APM service.
{{- end }}

{{- if .Values.datadog.autoconf }}
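
To illustrate the checks added to `NOTES.txt` above, here is a hedged sketch of an override that would be expected to trigger the Cluster Agent liveness notice. The value 5556 is an arbitrary example; the condition in the template is that the probe port is set and differs from `clusterAgent.healthPort`.

```yaml
# Hypothetical override: the explicit probe port disagrees with the health port.
clusterAgent:
  enabled: true
  healthPort: 5555      # chart default
  livenessProbe:
    httpGet:
      port: 5556        # differs from healthPort, so the notice should be shown
```

With these values the notes would be expected to include the "Cluster Agent liveness probe misconfiguration" banner, reporting ports 5556 and 5555. Removing `httpGet.port`, or setting it equal to `healthPort`, should make the notice disappear. The notice is text in the release notes only; the diff shown here does not add anything that blocks the install.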
6 changes: 4 additions & 2 deletions charts/datadog/templates/cluster-agent-deployment.yaml
@@ -199,9 +199,11 @@ spec:
{{ toYaml .Values.clusterAgent.env | indent 10 }}
{{- end }}
livenessProbe:
{{ toYaml .Values.clusterAgent.livenessProbe | indent 10 }}
{{ $defaultLive := dict "httpGet" (dict "port" .Values.clusterAgent.healthPort "path" "/live" "scheme" "HTTP") }}
{{ toYaml (mergeOverwrite $defaultLive .Values.clusterAgent.livenessProbe) | indent 10 }}
readinessProbe:
{{ toYaml .Values.clusterAgent.readinessProbe | indent 10 }}
{{ $defaultReady := dict "httpGet" (dict "port" .Values.clusterAgent.healthPort "path" "/ready" "scheme" "HTTP") }}
{{ toYaml (mergeOverwrite $defaultReady .Values.clusterAgent.readinessProbe) | indent 10 }}
volumeMounts:
- name: installinfo
subPath: install_info
6 changes: 4 additions & 2 deletions charts/datadog/templates/container-agent.yaml
@@ -165,7 +165,9 @@
{{ toYaml .Values.agents.volumeMounts | indent 4 }}
{{- end }}
livenessProbe:
{{ toYaml .Values.agents.containers.agent.livenessProbe | indent 4 }}
{{ $defaultLive := dict "httpGet" (dict "port" .Values.agents.containers.agent.healthPort "path" "/live" "scheme" "HTTP") }}
{{ toYaml (mergeOverwrite $defaultLive .Values.agents.containers.agent.livenessProbe) | indent 4 }}
readinessProbe:
{{ toYaml .Values.agents.containers.agent.readinessProbe | indent 4 }}
{{ $defaultReady := dict "httpGet" (dict "port" .Values.agents.containers.agent.healthPort "path" "/ready" "scheme" "HTTP") }}
{{ toYaml (mergeOverwrite $defaultReady .Values.agents.containers.agent.readinessProbe) | indent 4 }}
{{- end -}}
3 changes: 2 additions & 1 deletion charts/datadog/templates/container-trace-agent.yaml
@@ -59,5 +59,6 @@
mountPath: {{ (dir .Values.datadog.apm.socketPath) }}
{{- end }}
livenessProbe:
{{ toYaml .Values.agents.containers.traceAgent.livenessProbe | indent 4 }}
{{ $defaultLive := dict "tcpSocket" (dict "port" .Values.datadog.apm.port) }}
{{ toYaml (mergeOverwrite $defaultLive .Values.agents.containers.traceAgent.livenessProbe) | indent 4 }}
{{- end -}}
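
The three template changes above share one pattern: build a default probe handler from the configured port, then `mergeOverwrite` the user-supplied probe settings on top of it. As a sketch of the expected result for the Cluster Agent liveness probe with the chart defaults from this commit (`healthPort: 5555` and the timing values in `values.yaml`), the rendered probe should look roughly like this. Indentation is reduced here, and key order in the real output may differ.

```yaml
# Approximate rendered output, assuming default values and no user overrides.
livenessProbe:
  failureThreshold: 6
  httpGet:
    path: /live       # from the template default
    port: 5555        # from clusterAgent.healthPort
    scheme: HTTP      # from the template default
  initialDelaySeconds: 15
  periodSeconds: 15
  successThreshold: 1
  timeoutSeconds: 5
```

Since `mergeOverwrite` gives precedence to the user-supplied map, an explicit `httpGet.port` in the values still wins over the default, which is the situation the new notices are meant to flag.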
25 changes: 6 additions & 19 deletions charts/datadog/values.yaml
@@ -427,29 +427,21 @@ clusterAgent:
## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
affinity: {}

# clusterAgent.healthPort -- Port number use the cluster-agent to server healthz endpoint
# clusterAgent.healthPort -- Port number to use in the Cluster Agent for the healthz endpoint
healthPort: 5555

# clusterAgent.livenessProbe -- Override default agent liveness probe settings
# clusterAgent.livenessProbe -- Override default Cluster Agent liveness probe settings
# @default -- Every 15s / 6 KO / 1 OK
livenessProbe:
httpGet:
port: 5555
path: /live
scheme: HTTP
initialDelaySeconds: 15
periodSeconds: 15
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 6

# clusterAgent.readinessProbe -- Override default cluster-agent readiness probe settings
# clusterAgent.readinessProbe -- Override default Cluster Agent readiness probe settings
# @default -- Every 15s / 6 KO / 1 OK
readinessProbe:
httpGet:
port: 5555
path: /ready
scheme: HTTP
initialDelaySeconds: 15
periodSeconds: 15
timeoutSeconds: 5
@@ -626,12 +618,12 @@ agents:
# cpu: 200m
# memory: 256Mi

# agents.containers.agent.healthPort -- Port number to use in the node agent for the healthz endpoint
healthPort: 5555

# agents.containers.agent.livenessProbe -- Override default agent liveness probe settings
# @default -- Every 15s / 6 KO / 1 OK
livenessProbe:
httpGet:
path: /live
port: 5555
initialDelaySeconds: 15
periodSeconds: 15
timeoutSeconds: 5
@@ -641,9 +633,6 @@ agents:
# agents.containers.agent.readinessProbe -- Override default agent readiness probe settings
# @default -- Every 15s / 6 KO / 1 OK
readinessProbe:
httpGet:
path: /ready
port: 5555
initialDelaySeconds: 15
periodSeconds: 15
timeoutSeconds: 5
@@ -685,8 +674,6 @@ agents:
# agents.containers.traceAgent.livenessProbe -- Override default agent liveness probe settings
# @default -- Every 15s
livenessProbe:
tcpSocket:
port: 8126
initialDelaySeconds: 15
periodSeconds: 15
timeoutSeconds: 5