This module deploys Grafana Alloy to collect metrics/traces/logs from various sources in a Kubernetes cluster.
The module is designed for flexible deployment of Grafana Alloy with different configurations:
- Cluster Module - Collects metrics from Kubernetes cluster (pods, services, kubelet, cAdvisor)
- Node Module - Collects node-level metrics using node_exporter
- Kafka Module - Collects JMX metrics from Kafka brokers
- AWS Module - Collects metrics from AWS services via CloudWatch
- Single Module - Collects traces and metrics using the OpenTelemetry protocol, plus Prometheus alert rules, for workloads that need a single point of processing
- OpenTelemetry Collector Module - Collects telemetry data (traces and metrics) using the OpenTelemetry protocol and forwards them to Grafana Tempo and Mimir backends
- Loki Logs Module - Collects logs from Kubernetes pods and forwards them to Loki with support for annotation-based filtering and multi-tenancy
The module supports:
- Scaling to multiple replicas for high availability
- Clustering for load distribution
- Flexible configuration using River format
- Delivery of metrics to Prometheus-compatible endpoints
- Delivery of logs to Loki
- Collection of traces and metrics via OpenTelemetry protocol
- Support for OpenTelemetry Collector deployment and configuration
- Configurable resource limits for agents
The module contains the following submodules:
- cluster - For collecting Kubernetes metrics
- node - For collecting system metrics from nodes
- kafka - For collecting Kafka JMX metrics
- aws - For collecting AWS CloudWatch metrics
- single - For collecting OpenTelemetry traces and metrics, plus Prometheus alert rules, where a single point of processing is needed
- otel-collector - For collecting OpenTelemetry traces and metrics using the OpenTelemetry protocol
- loki-logs - For collecting and forwarding Kubernetes pod logs to Loki
Each module can be used independently or in combination based on requirements.
module "grafana_alloy_k8s" {
  source = "./modules/cluster"

  kubernetes_cluster_name = "somecluster"
  kubernetes_namespace    = "cluster-apps"
  agent_name              = "clustered"
  clustering_enabled      = true
  replicas                = 3

  config = [<<-EOF
    k8s_pods "my" {
      metrics_output = prometheus.remote_write.default.receiver
    }
    k8s_services "my" {
      metrics_output = prometheus.remote_write.default.receiver
    }
    k8s_cadvisor "my" {
      metrics_output = prometheus.remote_write.default.receiver
    }
    k8s_kubelet "my" {
      metrics_output = prometheus.remote_write.default.receiver
    }
  EOF
  ]

  metrics = {
    endpoint = "https://mimir.example.com:443/api/v1/push"
  }
}
module "grafana_alloy_otel" {
  source = "./modules/otel-collector"

  kubernetes_cluster_name = "somecluster"
  kubernetes_namespace    = "cluster-apps"
  agent_name              = "otel"

  config = [<<-EOF
    otel_process "my" {
      metrics_output = prometheus.remote_write.default.receiver
      traces_output  = otelcol.exporter.otelhttp.default.receiver
    }
  EOF
  ]

  metrics = {
    endpoint = "https://mimir.example.com:443/api/v1/push"
  }

  otel = {
    enabled  = true
    endpoint = "https://tempo.example.com:443"
  }
}
NOTE: OTel components are not cluster-capable, and some (for example, traces) require a single point of processing.
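A minimal sketch of what the note above implies for the OTel deployment: keep it as a single, non-clustered replica. It assumes the otel-collector submodule accepts the `kubernetes_kind`, `replicas`, and `clustering_enabled` inputs listed in the inputs table; the module label `grafana_alloy_otel_single` is made up for illustration.

```hcl
module "grafana_alloy_otel_single" {
  source = "./modules/otel-collector"

  kubernetes_cluster_name = "somecluster"
  kubernetes_namespace    = "cluster-apps"
  agent_name              = "otel"

  # Assumption: these inputs are accepted by this submodule.
  # A single, non-clustered replica keeps trace processing in one place.
  kubernetes_kind    = "deployment"
  replicas           = 1
  clustering_enabled = false

  config = [<<-EOF
    otel_process "my" {
      metrics_output = prometheus.remote_write.default.receiver
      traces_output  = otelcol.exporter.otelhttp.default.receiver
    }
  EOF
  ]

  metrics = {
    endpoint = "https://mimir.example.com:443/api/v1/push"
  }

  otel = {
    enabled  = true
    endpoint = "https://tempo.example.com:443"
  }
}
```

According to the inputs table, these three values match the documented defaults, so setting them explicitly mainly guards against someone raising `replicas` later.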
module "grafana_alloy_loki_logs" {
  source = "./modules/loki-logs"

  loki = {
    url       = "http://loki-gateway.monitoring.svc.cluster.local:80/loki/api/v1/push"
    tenant_id = "default"
  }

  kubernetes_namespace    = "monitoring"
  kubernetes_cluster_name = "utils"
}
For working examples, see the submodules.
agent_resources = {
  requests = {
    cpu    = "100m"
    memory = "100Mi"
  }
  limits = {
    cpu    = "1"
    memory = "1Gi"
  }
}
Note: when limits are undefined, the requests values are used as the limits too.
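As an illustration of that fallback, the sketch below sets only `requests`. It assumes `limits` may be omitted from the `agent_resources` object, which the note implies but the truncated type definition does not confirm; the module label is hypothetical.

```hcl
module "grafana_alloy_requests_only" {
  source = "./modules/cluster"

  kubernetes_cluster_name = "somecluster"
  kubernetes_namespace    = "cluster-apps"
  agent_name              = "clustered"

  agent_resources = {
    requests = {
      cpu    = "100m"
      memory = "100Mi"
    }
    # No limits block here. Per the note above, the limits should then
    # fall back to the requests values (cpu = "100m", memory = "100Mi").
  }
}
```

If that holds, the resulting pods would have equal requests and limits, which Kubernetes treats as the Guaranteed QoS class when it applies to every container in the pod.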
Name | Version |
---|---|
terraform | >= 1.3.0, < 2.0.0 |
helm | >= 2.0.0 |
kubernetes | >= 2.0.0 |
Name | Version |
---|---|
helm | >= 2.0.0 |
kubernetes | >= 2.0.0 |
No modules.
Name | Type |
---|---|
helm_release.grafana_alloy | resource |
kubernetes_config_map_v1.grafana_alloy | resource |
kubernetes_secret_v1.grafana_alloy | resource |
Name | Description | Type | Default | Required |
---|---|---|---|---|
agent_name | Name of the Grafana Alloy. | string | n/a | yes |
agent_resources | Resources for the Grafana Alloy | object({ | {} | no |
chart_version | Helm chart version of Grafana Alloy | string | "1.0.2" | no |
clustering_enabled | Enable Grafana Alloy clustering. NOTE: This is only supported for certain kinds of resources; see the documentation. | bool | false | no |
config | Grafana Alloy River configuration. Some configuration should be provided. You're encouraged to use the provided templates. You can also provide an entirely custom config with default_config_enabled = false. | list(string) | [] | no |
controller_resources | Resources for the Grafana Alloy controller | object({ | {} | no |
default_config_enabled | Enable default Grafana Alloy config templates. NOTE: Set this to false only if you want to use your own config without the enclosed templates. | bool | true | no |
envs | Additional environment variables for the Grafana Alloy. You can use this attribute to provide additional secrets without exposing them in the config map output. | map(string) | {} | no |
global_tolerations | Global tolerations for the Grafana Alloy | list(object({ | [] | no |
host_volumes | Extra volumes to mount to the Grafana Alloy. This is needed for some integrations like node_exporter. | list(object({ | [] | no |
iam_role_arn | IAM role to be assumed by the CloudWatch exporter | string | "" | no |
image | Image registry for Grafana Alloy. This is meant to be used with custom pull-through proxies/registries. | object({ | {} | no |
integrations | Grafana Alloy integrations configuration | object({ | {} | no |
k8s_pods | Grafana Alloy scrape settings for K8S pods | object({ | {} | no |
kafka_jmx_metrics | Grafana Alloy scrape settings for Kafka JMX metrics | object({ | {} | no |
kubernetes_cluster_name | Kubernetes cluster name. NOTE: This gets injected into labels/attributes of all collected data. | string | n/a | yes |
kubernetes_kind | Grafana Alloy Kubernetes resource kind. Valid values are "deployment" or "daemonset". If you want to use clustering, you should use "deployment" with multiple replicas. | string | "deployment" | no |
kubernetes_namespace | Kubernetes namespace to deploy the Grafana Alloy into. NOTE: The namespace must exist and be available for deployment! | string | n/a | yes |
kubernetes_security_context | Kubernetes security context configuration for the Grafana Alloy. This is needed with node_exporter to run privileged and as root (UID 0). | object({ | {} | no |
live_debug | Enable live debug for the Grafana Alloy | bool | false | no |
loki | Grafana Alloy scrape settings for Loki logs | object({ | {} | no |
metrics | Grafana Alloy metrics endpoint of Prometheus-compatible receiver. NOTE: You must provide the base URL of the API. | object({ | {} | no |
otel | Grafana Alloy OTel configuration. NOTE: There can be only one OTel receiver at the moment. | object({ | {} | no |
replicas | Number of Grafana Alloy replicas. NOTE: Only valid for kubernetes_kind = "deployment". | number | 1 | no |
stability_level | n/a | string | "generally-available" | no |
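The `config` and `default_config_enabled` inputs can be combined to supply a fully custom configuration. The following is a rough sketch only: the module label, the component labels `custom` and `self`, and the scrape target are illustrative, and it is an assumption that the cluster submodule exposes `default_config_enabled`. What the module still renders on its own when the templates are disabled is not described here.

```hcl
module "grafana_alloy_custom" {
  source = "./modules/cluster"

  kubernetes_cluster_name = "somecluster"
  kubernetes_namespace    = "cluster-apps"
  agent_name              = "custom"

  # Assumption: disabling the templates means only the config below is used,
  # so the custom config must define its own remote_write component.
  default_config_enabled = false

  config = [<<-EOF
    // Hypothetical custom pipeline: Alloy scrapes its own metrics endpoint
    // (127.0.0.1:12345 is Alloy's default HTTP listen address)
    // and pushes the samples to a Prometheus-compatible receiver.
    prometheus.scrape "self" {
      targets    = [{"__address__" = "127.0.0.1:12345"}]
      forward_to = [prometheus.remote_write.custom.receiver]
    }

    prometheus.remote_write "custom" {
      endpoint {
        url = "https://mimir.example.com:443/api/v1/push"
      }
    }
  EOF
  ]
}
```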
Name | Description |
---|---|
otel_endpoints | Exposed OTel endpoints |
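A short sketch of consuming this output from a calling configuration. It assumes the `grafana_alloy_otel` module block from the usage example and that the otel-collector submodule exports `otel_endpoints`; the output name `alloy_otel_endpoints` is made up. The shape of the value is not documented here, so it is passed through unchanged.

```hcl
# Re-export the OTel endpoints so that other stacks or applications
# can look up where to send their telemetry.
output "alloy_otel_endpoints" {
  description = "OTel endpoints exposed by the Grafana Alloy OTel deployment"
  value       = module.grafana_alloy_otel.otel_endpoints
}
```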