How to automate your Kubernetes Ingress monitoring with Prometheus
Recently, I had the challenge of setting up monitoring for all the Ingress hosts in our cluster - which are A LOT. Since new hosts are added or removed all the time, I tried to find a way to automate the whole thing, so that once it is initially set up I don’t have to worry about it too much and am reliably notified when a failure occurs.
After some research I came to the conclusion that Prometheus was exactly what I was looking for. The Prometheus Operator makes it very easy to integrate the tool into a Kubernetes cluster, where it can access all Ingress objects via the API and create monitoring rules from them.
Installing Prometheus #
The first thing we need to do is install the kube-prometheus-stack Helm chart.
For the chart to work the way we want, we need to define some values in a myvalues.yaml file. In this example we have configured Slack as a receiver for our alerts. Other available options include email, Telegram or PagerDuty. You can also use custom webhooks for receivers that aren’t natively supported, e.g. MS Teams. For a full list of available options please consult the Prometheus documentation.
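For example, an additional email receiver could be defined roughly like this inside the receivers: section of the Alertmanager configuration shown below (all SMTP details are placeholders):
receivers:
  - name: 'email'
    email_configs:
      - send_resolved: true
        to: 'alerts@example.com'
        from: 'alertmanager@example.com'
        smarthost: 'smtp.example.com:587'
        auth_username: 'alertmanager@example.com'
        auth_password: 'changeme'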
If you want to configure additional things such as Grafana to visualize your monitoring, consult the chart’s default values.yaml file to add your own custom values, but here is a basic myvalues.yaml that does the job:
prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
    ruleSelectorNilUsesHelmValues: false
    probeSelectorNilUsesHelmValues: false
alertmanager:
  config:
    # ref.: https://github.com/prometheus/alertmanager/blob/main/doc/examples/simple.yml
    route:
      group_by: ['alertname', 'cluster', 'service']
      # default receiver
      receiver: 'null'
      routes:
        - receiver: 'null'
          matchers:
            - alertname =~ "InfoInhibitor|Watchdog"
        - receiver: 'slack'
          matchers:
            - severity =~ "warning|critical"
    receivers:
      - name: 'null'
      - name: 'slack'
        slack_configs:
          - send_resolved: true
            api_url: '' # <- Your Slack webhook URL goes here
            channel: '#slack-channel'
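As mentioned above, other components such as Grafana can be configured through the same file; a minimal sketch using keys from the chart’s default values.yaml (the password is just a placeholder):
grafana:
  enabled: true
  adminPassword: 'change-me'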
If you are using Slack as a receiver, I would also highly recommend checking out this awesome alerting template: click me.
We can inject a custom template file into our Helm values via the alertmanager.templateFiles key. I am not adding the full template in this example to keep it simple.
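For reference, injecting a template file would look roughly like this (the template body below is only a trivial placeholder, not the linked template, and your slack_configs would still need to reference the defined template, e.g. via its title field):
alertmanager:
  templateFiles:
    custom-slack.tmpl: |-
      {{ define "slack.custom.title" }}[{{ .Status | toUpper }}] {{ .CommonLabels.alertname }}{{ end }}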
Once we’ve finished configuring our values, we are ready to install the chart:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install -f myvalues.yaml kube-prometheus-stack prometheus-community/kube-prometheus-stack
⚠️ In this example we will install kube-prometheus-stack in the default namespace, but keep in mind that it is best practice to create a separate namespace for this.
Installing the Blackbox Exporter #
Now that Prometheus is up and running, we need to install the Blackbox Exporter, which will execute the actual probes and respond with metrics that Prometheus is able to understand and scrape:
helm install prometheus-blackbox-exporter prometheus-community/prometheus-blackbox-exporter
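If you need to tweak how the probes behave (timeouts, redirects, accepted status codes and so on), the exporter’s probing modules can be overridden through its own Helm values; here is a minimal sketch of a blackbox-values.yaml, assuming the chart’s default config key layout, which you would then pass with -f when installing the exporter:
config:
  modules:
    http_2xx:
      prober: http
      timeout: 10s
      http:
        valid_status_codes: []   # defaults to 2xx
        follow_redirects: true
        preferred_ip_protocol: "ip4"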
Configuring the Ingress probe #
After that comes the interesting part: We need to tell Prometheus what it should monitor. For our Ingress hosts we need to create a Probe manifest:
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: blackbox-kubernetes-ingress
spec:
  jobName: blackbox-kubernetes-ingress
  interval: 30s
  module: http_2xx
  prober:
    url: prometheus-blackbox-exporter:9115
    scheme: http
    path: /probe
  targets:
    ingress:
      namespaceSelector:
        any: true
      relabelingConfigs:
        - sourceLabels: [__param_target, __meta_kubernetes_ingress_annotation_monitoring_mycompany_com_health_probe_path] # custom Ingress Annotation: monitoring.mycompany.com/health-probe-path
          regex: (.*\w)(\/|);(.+) # matches target host as capture group $1, trailing slash (if applicable) as capture group $2 and health-probe-path annotation as capture group $3
          replacement: $1$3 # combine target host and health-probe-path annotation without trailing slash of target (leading slash in annotation is needed, e.g. "/healthz")
          targetLabel: __param_target
Here we have defined that the target for our probe should be an Ingress, and then we use the namespaceSelector to select every Ingress in any namespace.
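If you only want to probe Ingresses in specific namespaces instead, the same selector also accepts an explicit list of names (the namespaces below are just examples):
  targets:
    ingress:
      namespaceSelector:
        matchNames:
          - production
          - staging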
There is also a relabelingConfig included as an example. relabelingConfigs are a powerful tool to customize your probes - in this example we’ve added the possibility to use an Ingress annotation to alter our health probe path. For example, if your application implements a health check at /healthz, you can simply use the Ingress annotation monitoring.mycompany.com/health-probe-path: '/healthz' to tell Prometheus that this path should be used instead of the Ingress root path (see the sketch below).
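To illustrate, an Ingress carrying that annotation could look like this (host, service and resource names are hypothetical):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  annotations:
    monitoring.mycompany.com/health-probe-path: '/healthz'
spec:
  rules:
    - host: my-app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80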
More information about relabelingConfigs can be found here.
Creating the Alert Rule #
Last but not least we need to tell Prometheus that it should create an alert that will be sent to our receiver if the probe fails:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: ingress-probe-failure
spec:
  groups:
    - name: ingress-probe-failure
      rules:
        - alert: IngressProbeFailure
          expr: probe_success{job="blackbox-kubernetes-ingress"} == 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Ingress Host is down!"
            description: "The Ingress Host {{ $labels.instance }} ({{ $labels.ingress }} / {{ $labels.namespace }}) failed to respond with a valid status code for at least 5 minutes."
Now, once an Ingress probe is unhealthy for at least 5 minutes, it will fire an alert with severity warning, which is routed to our Slack receiver as previously configured.
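The same probe metrics can drive further rules. As one example beyond this basic setup, you could also warn on TLS certificates that are about to expire, using the Blackbox Exporter’s probe_ssl_earliest_cert_expiry metric (a sketch, assuming your probed Ingress hosts serve HTTPS):
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: ingress-certificate-expiry
spec:
  groups:
    - name: ingress-certificate-expiry
      rules:
        - alert: IngressCertificateExpiresSoon
          # probe_ssl_earliest_cert_expiry is a Unix timestamp; alert if fewer than 14 days remain
          expr: probe_ssl_earliest_cert_expiry{job="blackbox-kubernetes-ingress"} - time() < 14 * 24 * 3600
          for: 1h
          labels:
            severity: warning
          annotations:
            summary: "TLS certificate expires soon"
            description: "The certificate for {{ $labels.instance }} expires in less than 14 days."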
Please keep in mind that there is a lot more you can configure and do in Prometheus - we are barely scratching the surface here - but that would be too much for this blog post.