Prometheus collects metrics on the health, performance and behaviour of a K8s cluster by pulling them over HTTP.
Why it’s important
Do you really want to have an app in production K8s without knowing how it’s doing on your infrastructure?
We’ll get into how to boost Prometheus in a minute, but let’s first review the basics.
Prometheus pulls metrics
Kubernetes can become complicated fast. That’s why the concept of “Observability” is getting more attention. This consists of monitoring, tracing and logging what’s going on.
Prometheus is fast becoming a popular tool for the monitoring needs of K8s projects. It follows a schedule to pull detailed metrics from pods, nodes and more.
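To make the pull model concrete, here is a minimal sketch using only the Python standard library. It stands up a tiny HTTP endpoint at /metrics and then fetches it once, roughly the way a scrape would. The metric name demo_requests_total, its value and the helper scrape_once are made up for illustration; a real app would normally use a Prometheus client library, and Prometheus itself would do the fetching on its configured schedule.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical value; a real app would update this as it handles traffic.
REQUESTS_SERVED = 42


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves one counter in the Prometheus text exposition format."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = (
            "# TYPE demo_requests_total counter\n"
            f"demo_requests_total {REQUESTS_SERVED}\n"
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def scrape_once():
    """Start the endpoint on a free local port, pull /metrics once, shut down."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        # Bypass any proxy settings so the request stays on localhost.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(f"http://127.0.0.1:{port}/metrics", timeout=5) as resp:
            return resp.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(scrape_once())
```

The point to notice is the direction of traffic: the app only exposes numbers, and the collector decides when to come and get them.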
What kind of metrics?
Before we get into Prometheus tactics, let’s uncover what metrics it can collect:
COUNTER
Measures values that only increase (or reset to zero when the process restarts).
Examples: requests, tasks completed, errors.
GAUGE
Measures values that can go up or down.
Examples: memory usage, temperature.
HISTOGRAM
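Counts observations into configurable buckets, and also tracks their total count and sum.
Use case: request durations or response sizes, e.g. how many requests finished in under 0.5 seconds.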
SUMMARY
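Like a histogram, it samples observations and tracks their count and sum. The difference is that it calculates quantiles (such as the 95th percentile) in the app itself, over a sliding time window.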
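The first three types above can be sketched in a few lines of plain Python. This is an illustration of how the types behave and how they appear in the text exposition format, not how a real client library is implemented. The class names and the demo_ metric names are hypothetical.

```python
class Counter:
    """A value that can only go up."""

    def __init__(self, name):
        self.name = name
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount

    def render(self):
        return f"# TYPE {self.name} counter\n{self.name} {self.value}"


class Gauge:
    """A value that can go up or down."""

    def __init__(self, name):
        self.name = name
        self.value = 0.0

    def set(self, value):
        self.value = value

    def render(self):
        return f"# TYPE {self.name} gauge\n{self.name} {self.value}"


class Histogram:
    """Counts observations into buckets; rendered buckets are cumulative."""

    def __init__(self, name, buckets):
        self.name = name
        self.bounds = sorted(buckets)
        self.counts = [0] * len(self.bounds)  # per-bucket, not yet cumulative
        self.total = 0
        self.sum = 0.0

    def observe(self, value):
        self.total += 1
        self.sum += value
        for i, bound in enumerate(self.bounds):
            if value <= bound:
                self.counts[i] += 1
                break  # values above the last bound only count towards +Inf

    def render(self):
        lines = [f"# TYPE {self.name} histogram"]
        cumulative = 0
        for bound, count in zip(self.bounds, self.counts):
            cumulative += count
            lines.append(f'{self.name}_bucket{{le="{bound}"}} {cumulative}')
        lines.append(f'{self.name}_bucket{{le="+Inf"}} {self.total}')
        lines.append(f"{self.name}_sum {self.sum}")
        lines.append(f"{self.name}_count {self.total}")
        return "\n".join(lines)


if __name__ == "__main__":
    latency = Histogram("demo_request_duration_seconds", [0.1, 0.5, 1.0])
    for seconds in (0.05, 0.3, 0.3, 2.0):
        latency.observe(seconds)
    print(latency.render())
```

Each bucket line answers "how many observations were at or below this bound?", which is why the numbers never decrease as you read down the buckets.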
Prometheus is great, but it needs the right queries to surface useful metrics. Knowing the ins and outs of PromQL, its query language, can help with that.
Should your K8s effort use this tactic? Yes, if you have SLOs or are concerned about unknown performance issues.
What’s the big deal? Know how to query the right metrics to meet specific goals.
PromQL for humans
This handy compilation of PromQL queries covers real-world issues. For example, one query will help you uncover instances with more than x% HTTP errors.
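As a rough idea of what such a query can look like, the sketch below builds a PromQL expression for "instances with more than x% HTTP errors" and wraps it in a URL for Prometheus's instant-query HTTP endpoint (/api/v1/query). The metric name http_requests_total and the status label are assumptions; your apps may expose different names. The helpers error_ratio_query and query_url are hypothetical too.

```python
from urllib.parse import urlencode


def error_ratio_query(threshold_percent, window="5m"):
    """Build a PromQL expression for instances whose 5xx share exceeds the threshold.

    Assumes a counter named http_requests_total with a `status` label.
    """
    ratio = threshold_percent / 100
    return (
        f'sum by (instance) (rate(http_requests_total{{status=~"5.."}}[{window}]))'
        f" / sum by (instance) (rate(http_requests_total[{window}]))"
        f" > {ratio}"
    )


def query_url(base_url, promql):
    """Build the URL for an instant query against a Prometheus server."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    promql = error_ratio_query(5)
    print(promql)
    # The server address is a placeholder for wherever your Prometheus runs.
    print(query_url("http://localhost:9090", promql))
```

The expression divides the per-instance rate of 5xx responses by the per-instance rate of all responses, then keeps only the instances above the threshold.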
Prometheus can help uncover important alert-level metrics. But Prometheus doesn’t send notifications on its own. Its alerting rules fire alerts to Alertmanager, a companion component, which routes them to an alerting tool like PagerDuty. PagerDuty then sends the SMS or email alert.
Should your K8s effort use this tactic? Yes, if your Kubernetes cluster is running mission-critical applications or you have an SLO.
What’s the big deal? App went down and you didn’t know about it? This will fix that problem.
Connect Prometheus to PagerDuty
Follow this comprehensive guide for connecting your Prometheus metrics to PagerDuty’s alerting system.
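For orientation only, here is a sketch of the two pieces of configuration such a setup typically involves: a Prometheus alerting rule and an Alertmanager receiver for PagerDuty. They are written as Python dictionaries that mirror the structure of the YAML files; the guide's exact steps may differ. The group name, alert name, receiver name and the placeholder key are all made up, and build_alerting_config is a hypothetical helper.

```python
import json


def build_alerting_config(routing_key):
    """Return (rule_file, alertmanager_config) as dicts mirroring the YAML layout."""
    # Prometheus side: fire an alert when a scrape target has been down for 5 minutes.
    rule_file = {
        "groups": [
            {
                "name": "demo-availability",  # hypothetical group name
                "rules": [
                    {
                        "alert": "InstanceDown",  # hypothetical alert name
                        "expr": "up == 0",
                        "for": "5m",
                        "labels": {"severity": "critical"},
                        "annotations": {
                            "summary": "Instance {{ $labels.instance }} is down"
                        },
                    }
                ],
            }
        ]
    }
    # Alertmanager side: send every alert to a single PagerDuty receiver.
    alertmanager_config = {
        "route": {"receiver": "pagerduty"},
        "receivers": [
            {
                "name": "pagerduty",
                "pagerduty_configs": [{"routing_key": routing_key}],
            }
        ],
    }
    return rule_file, alertmanager_config


if __name__ == "__main__":
    # Placeholder only; the real key comes from your PagerDuty service integration.
    rules, alertmanager = build_alerting_config("<your-integration-key>")
    print(json.dumps(rules, indent=2))
    print(json.dumps(alertmanager, indent=2))
```

The split matters: the rule decides when something is wrong, and the receiver decides who gets told.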
Cortex is a companion project to Prometheus. It receives metrics from many independent Prometheus servers, which push to it via remote write, and stores them in one place. It’s designed for advanced users running complex workloads.
Should your K8s effort use this tactic? Only if you run multiple clusters and need long-term metrics storage.
What’s the big deal? Faster PromQL queries and longer-term storage for large-scale K8s.
Explore Cortex
You can review the Cortex documentation to see if it will suit your more complex Kubernetes deployment.