Monitoring with Prometheus

Monitor cloudflare-operator with Prometheus

Prerequisites

The easiest way to deploy the required components (Prometheus Operator, Prometheus, and Grafana) is to use kube-prometheus-stack.
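
If the cluster does not have a Prometheus stack yet, the following is a minimal sketch of installing kube-prometheus-stack with Helm. The function name, release name, and the `monitoring` namespace are assumptions for illustration, not requirements of cloudflare-operator; adjust them to your environment.

```shell
# Sketch: install kube-prometheus-stack from the prometheus-community chart repository.
# STACK_RELEASE and STACK_NAMESPACE are illustrative defaults (assumptions).
install_kube_prometheus_stack() {
  stack_release="${STACK_RELEASE:-kube-prometheus-stack}"
  stack_namespace="${STACK_NAMESPACE:-monitoring}"

  # Register the chart repository, refresh the index, then install or upgrade.
  helm repo add prometheus-community https://prometheus-community.github.io/helm-charts &&
  helm repo update &&
  helm upgrade --install "$stack_release" prometheus-community/kube-prometheus-stack \
    --namespace "$stack_namespace" --create-namespace
}
```

Calling `install_kube_prometheus_stack` runs the three Helm commands in order and stops at the first failure.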

Enable metrics

To enable metrics and automatically deploy the required resources (a PodMonitor and a PrometheusRule), reconfigure the Helm chart.

Create a values.yaml file with the following content:

---
metrics:
  podMonitor:
    enabled: true
  prometheusRule:
    enabled: true

Then install or upgrade the Helm chart with this values file by following the installation guide.
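
As a rough sketch, applying the values file with Helm could look like the following. The function name, release name, and namespace are assumptions, and the chart reference is deliberately left as a required variable; take the actual value from the installation guide.

```shell
# Sketch: apply values.yaml to the cloudflare-operator Helm release.
# CHART_REF must be set to the chart reference given in the installation guide.
# OPERATOR_RELEASE and OPERATOR_NAMESPACE defaults are illustrative assumptions.
upgrade_cloudflare_operator() {
  chart_ref="${CHART_REF:?set CHART_REF to the chart reference from the installation guide}"
  operator_release="${OPERATOR_RELEASE:-cloudflare-operator}"
  operator_namespace="${OPERATOR_NAMESPACE:-cloudflare-operator}"

  # --install makes the same command work for a first install and for upgrades.
  helm upgrade --install "$operator_release" "$chart_ref" \
    --namespace "$operator_namespace" --values values.yaml
}
```

Using `helm upgrade --install` keeps the step idempotent, so rerunning it after editing values.yaml only applies the difference.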

Install cloudflare-operator Grafana dashboard

# Download Grafana dashboard
wget https://raw.githubusercontent.com/containeroo/cloudflare-operator/master/config/manifests/grafana/dashboards/overview.json -O /tmp/grafana-dashboard-cloudflare-operator.json

# Create the configmap
kubectl create configmap grafana-dashboard-cloudflare-operator --from-file=/tmp/grafana-dashboard-cloudflare-operator.json

# Add label so Grafana can fetch dashboard
kubectl label configmap grafana-dashboard-cloudflare-operator grafana_dashboard="1"
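
To confirm that the ConfigMap carries the label, a label selector query can help. This sketch assumes the Grafana dashboard sidecar from kube-prometheus-stack is what picks up ConfigMaps labeled `grafana_dashboard`; the function name and the `default` namespace are illustrative. If the sidecar only watches certain namespaces, the ConfigMap has to be created in one of them.

```shell
# Sketch: list ConfigMaps that carry the dashboard label in one namespace.
# DASHBOARD_NAMESPACE is an assumption; it falls back to "default".
list_dashboard_configmaps() {
  kubectl get configmap \
    --namespace "${DASHBOARD_NAMESPACE:-default}" \
    --selector grafana_dashboard=1
}
```

If `grafana-dashboard-cloudflare-operator` is missing from the result, the label was likely applied in a different namespace.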

Available metrics

For each cloudflare-operator.io kind, the controller exposes a gauge metric to track the status condition.

Ready status metrics:

cloudflare_operator_account_status
cloudflare_operator_dns_record_status
cloudflare_operator_ip_status
cloudflare_operator_zone_status
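
To inspect these gauges by hand, one option is the Prometheus HTTP API. The sketch below assumes Prometheus is reachable at `http://localhost:9090` (for example through `kubectl port-forward`); the URL and the function name are illustrative.

```shell
# Sketch: run an instant query against the Prometheus HTTP API.
# PROMETHEUS_URL is an assumption; point it at your Prometheus instance.
query_prometheus() {
  # --get sends the URL-encoded query as a query string parameter.
  curl --silent --get "${PROMETHEUS_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}
```

For example, `query_prometheus 'cloudflare_operator_dns_record_status > 0'` returns a JSON document listing the series whose value is currently above zero.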

Alerting

The following alerting rule can be used to monitor DNS record failures:

groups:
  - name: cloudflare-operator # group name is illustrative
    rules:
      - alert: DNSRecordFailures
        annotations:
          summary: >-
            DNSRecord {{ $labels.name }} ({{ $labels.record_name }}) in namespace
            {{ $labels.exported_namespace }} failed
        expr: cloudflare_operator_dns_record_status > 0
        for: 1m
        labels:
          severity: critical
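
Before loading a rule file into Prometheus, it can be checked with `promtool`, which ships with Prometheus. In this sketch the function name and the default file name are hypothetical; pass the path of the file that holds the rule group.

```shell
# Sketch: validate a Prometheus rule file before loading it.
# The default file name is hypothetical.
check_alert_rules() {
  promtool check rules "${1:-cloudflare-operator-rules.yaml}"
}
```

`promtool check rules` exits with a non-zero status when the file is not a valid rule file, which makes the function usable as a CI gate.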

Last modified August 11, 2023: feat: update cloudflare-operator docs (a02117f)