How to automate your Kubernetes Ingress monitoring with Prometheus
Recently, I had the challenge of setting up monitoring for all the Ingress hosts in our cluster - which are A LOT. Since new hosts are added or removed all the time, I tried to find a way to automate the whole thing, so that once it is initially set up I don’t have to worry about it too much and am reliably notified when a failure occurs.
After some research I came to the conclusion that Prometheus was exactly what I was looking for. The Prometheus Operator makes it very easy to integrate the tool into a Kubernetes cluster, where it can access all Ingress objects via the API and create monitoring rules from them.
Installing Prometheus #
The first thing we need to do is install the kube-prometheus-stack Helm chart.
For the chart to work the way we want, we need to define some values in a myvalues.yaml file. In this example we have configured Slack as a receiver for our alerts. Other available options include email, Telegram or PagerDuty. You can also use custom webhooks for receivers that aren’t natively supported, e.g. MS Teams. For a full list of available options please consult the Prometheus documentation.
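For example, an additional email receiver could be defined roughly like this inside the receivers: section of the Alertmanager configuration shown below (all SMTP details are placeholders):
receivers:
  - name: 'email'
    email_configs:
      - send_resolved: true
        to: 'alerts@example.com'
        from: 'alertmanager@example.com'
        smarthost: 'smtp.example.com:587'
        auth_username: 'alertmanager@example.com'
        auth_password: 'changeme'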
If you want to configure additional things such as Grafana to visualize your monitoring, consult the chart’s default values.yaml file to add your own custom values, but here is a basic myvalues.yaml that does the job:
prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
    ruleSelectorNilUsesHelmValues: false
    probeSelectorNilUsesHelmValues: false
alertmanager:
  config:
    # ref.: https://github.com/prometheus/alertmanager/blob/main/doc/examples/simple.yml
    route:
      group_by: ['alertname', 'cluster', 'service']
      # default receiver
      receiver: 'null'
      routes:
        - receiver: 'null'
          matchers:
            - alertname =~ "InfoInhibitor|Watchdog"
        - receiver: 'slack'
          matchers:
            - severity =~ "warning|critical"
    receivers:
      - name: 'null'
      - name: 'slack'
        slack_configs:
          - send_resolved: true
            api_url: '' # <- Your Slack webhook URL goes here
            channel: '#slack-channel'
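As mentioned above, other components such as Grafana can be configured through the same file; a minimal sketch using keys from the chart’s default values.yaml (the password is just a placeholder):
grafana:
  enabled: true
  adminPassword: 'change-me'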
If you are using Slack as a receiver, I would also highly recommend checking out this awesome alerting template: click me.
We can inject a custom template file into our Helm values via the alertmanager.templateFiles key. I am not adding the full template in this example to keep it simple.
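For reference, injecting a template file would look roughly like this (the template body below is only a trivial placeholder, not the linked template, and your slack_configs would still need to reference the defined template, e.g. via its title field):
alertmanager:
  templateFiles:
    custom-slack.tmpl: |-
      {{ define "slack.custom.title" }}[{{ .Status | toUpper }}] {{ .CommonLabels.alertname }}{{ end }}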
Once we’ve finished configuring our values, we are ready to install the chart:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install -f myvalues.yaml kube-prometheus-stack prometheus-community/kube-prometheus-stack
⚠️ In this example we will install kube-prometheus-stack in the default namespace, but keep in mind that it is best practice to create a separate namespace for this.
Installing the Blackbox Exporter #
Now that Prometheus is up and running, we need to install the Blackbox Exporter, which will execute the actual probes and respond with metrics that Prometheus is able to understand and scrape:
helm install prometheus-blackbox-exporter prometheus-community/prometheus-blackbox-exporter
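If you need to tweak how the probes behave (timeouts, redirects, accepted status codes and so on), the exporter’s probing modules can be overridden through its own Helm values; here is a minimal sketch of a blackbox-values.yaml, assuming the chart’s default config key layout, which you would then pass with -f when installing the exporter:
config:
  modules:
    http_2xx:
      prober: http
      timeout: 10s
      http:
        valid_status_codes: []   # defaults to 2xx
        follow_redirects: true
        preferred_ip_protocol: "ip4"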
Configuring the Ingress probe #
After that comes the interesting part: We need to tell Prometheus what it should monitor. For our Ingress hosts we need to create a Probe manifest:
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: blackbox-kubernetes-ingress
spec:
  jobName: blackbox-kubernetes-ingress
  interval: 30s
  module: http_2xx
  prober:
    url: prometheus-blackbox-exporter:9115
    scheme: http
    path: /probe
  targets:
    ingress:
      namespaceSelector:
        any: true
      relabelingConfigs:
        - sourceLabels: [__param_target, __meta_kubernetes_ingress_annotation_monitoring_mycompany_com_health_probe_path] # custom Ingress Annotation: monitoring.mycompany.com/health-probe-path
          regex: (.*\w)(\/|);(.+) # matches target host as capture group $1, trailing slash (if applicable) as capture group $2 and health-probe-path annotation as capture group $3
          replacement: $1$3 # combine target host and health-probe-path annotation without trailing slash of target (leading slash in annotation is needed, e.g. "/healthz")
          targetLabel: __param_target
Here we have defined that the target for our probe should be an Ingress, and then we use the namespaceSelector to select every Ingress in any namespace.
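If you only want to probe Ingresses in specific namespaces instead, the same selector also accepts an explicit list of names (the namespaces below are just examples):
  targets:
    ingress:
      namespaceSelector:
        matchNames:
          - production
          - staging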
There is also a relabelingConfig included as an example. relabelingConfigs are a powerful tool to customize your probes - in this example we’ve added the possibility to use an Ingress annotation to alter our health probe path. For example, if your application implements a health check at /healthz, you can simply use the Ingress annotation monitoring.mycompany.com/health-probe-path: '/healthz' to tell Prometheus that this path should be used instead of the Ingress root path (see the sketch below).
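To illustrate, an Ingress carrying that annotation could look like this (host, service and resource names are hypothetical):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  annotations:
    monitoring.mycompany.com/health-probe-path: '/healthz'
spec:
  rules:
    - host: my-app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80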
More information about relabelingConfigs can be found here.
Creating the Alert Rule #
Last but not least we need to tell Prometheus that it should create an alert that will be sent to our receiver if the probe fails:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: ingress-probe-failure
spec:
  groups:
    - name: ingress-probe-failure
      rules:
        - alert: IngressProbeFailure
          expr: probe_success{job="blackbox-kubernetes-ingress"} == 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Ingress Host is down!"
            description: "The Ingress Host {{ $labels.instance }} ({{ $labels.ingress }} / {{ $labels.namespace }}) failed to respond with a valid status code for at least 5 minutes."
Now, once an Ingress probe is unhealthy for at least 5 minutes, it will fire an alert with severity warning, which is routed to our Slack receiver as previously configured.
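The same probe metrics can drive further rules. As one example beyond this basic setup, you could also warn on TLS certificates that are about to expire, using the Blackbox Exporter’s probe_ssl_earliest_cert_expiry metric (a sketch, assuming your probed Ingress hosts serve HTTPS):
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: ingress-certificate-expiry
spec:
  groups:
    - name: ingress-certificate-expiry
      rules:
        - alert: IngressCertificateExpiresSoon
          # probe_ssl_earliest_cert_expiry is a Unix timestamp; alert if fewer than 14 days remain
          expr: probe_ssl_earliest_cert_expiry{job="blackbox-kubernetes-ingress"} - time() < 14 * 24 * 3600
          for: 1h
          labels:
            severity: warning
          annotations:
            summary: "TLS certificate expires soon"
            description: "The certificate for {{ $labels.instance }} expires in less than 14 days."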
Please keep in mind that there is a lot more you can configure and do in Prometheus - we are barely scratching the surface here - but that would be too much for this blog post.