How to boost Prometheus for better Kubernetes monitoring

Where this fits in K8s strategy

Prometheus collects metrics on the health, performance and behaviour of a K8s cluster by pulling them over HTTP.

Why it’s important

Do you really want to have an app in production K8s without knowing how it’s doing on your infrastructure?

We’ll get into how to boost Prometheus in a minute, but let’s first review the basics.

Prometheus pulls metrics

Kubernetes can become complicated fast. That’s why the concept of “Observability” is getting more attention. This consists of monitoring, tracing and logging what’s going on.

Prometheus is fast becoming a popular tool for the monitoring needs of K8s projects. It follows a schedule to pull detailed metrics from pods, nodes and more.
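To make the pull model concrete, here is a minimal sketch of the kind of endpoint Prometheus scrapes. It uses only the Python standard library rather than an official client library, and the metric name demo_requests_total, its value and port 8000 are made up for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory value; a real app would update it as it serves traffic.
REQUESTS_SERVED = 42


def render_metrics() -> str:
    """Render one counter in the Prometheus text exposition format."""
    return (
        "# HELP demo_requests_total Requests served by the demo app.\n"
        "# TYPE demo_requests_total counter\n"
        f"demo_requests_total {REQUESTS_SERVED}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers GET /metrics, the path Prometheus scrapes by default."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Serve /metrics until interrupted; port 8000 is an arbitrary choice."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling serve() and opening /metrics on that port shows the plain text that Prometheus would fetch on each scheduled pull.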

What kind of metrics?

Before we get into Prometheus tactics, let’s uncover what metrics it can collect:

COUNTER

Tracks a value that only ever increases (it starts again from zero when the process restarts).

Examples: requests, tasks completed, errors.

GAUGE

Measures values that can go up or down.

Examples: memory usage, temperature.
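The difference between a counter and a gauge is easiest to see in code. The sketch below uses two plain-Python stand-in classes, not an official Prometheus client library, and the variable names are hypothetical.

```python
class Counter:
    """A value that only goes up."""

    def __init__(self) -> None:
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


class Gauge:
    """A value that can be set directly, or moved up and down."""

    def __init__(self) -> None:
        self.value = 0.0

    def set(self, value: float) -> None:
        self.value = value

    def inc(self, amount: float = 1.0) -> None:
        self.value += amount

    def dec(self, amount: float = 1.0) -> None:
        self.value -= amount


# A counter suits "how many so far": three requests handled.
requests_total = Counter()
for _ in range(3):
    requests_total.inc()

# A gauge suits "how much right now": memory use (in MB) rises and falls.
memory_mb = Gauge()
memory_mb.set(512.0)
memory_mb.dec(128.0)
```

Rejecting negative increments is the whole point of a counter: because it never goes down, a query can safely work out a rate of change from it.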

HISTOGRAM

Counts observations into configurable buckets, and also tracks the total number of observations and their sum.

Use case: how many requests completed within each response-time range
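In Prometheus, a histogram sorts each observation into cumulative buckets, where every bucket counts the observations less than or equal to its upper bound. The sketch below imitates that behaviour in plain Python. The class, the bucket bounds and the sample latencies are all illustrative assumptions.

```python
import math


class Histogram:
    """Counts observations into cumulative 'less than or equal' buckets."""

    def __init__(self, upper_bounds):
        # A final +Inf bucket catches every observation.
        self.upper_bounds = sorted(upper_bounds) + [math.inf]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.sum = 0.0
        self.count = 0

    def observe(self, value: float) -> None:
        self.sum += value
        self.count += 1
        # Cumulative: bump every bucket whose upper bound covers the value.
        for i, bound in enumerate(self.upper_bounds):
            if value <= bound:
                self.bucket_counts[i] += 1


# Hypothetical request latencies, in seconds.
latency = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 0.7, 2.0):
    latency.observe(seconds)

print(latency.bucket_counts)  # [1, 2, 3, 4]
```

Reading the result: one request took 0.1s or less, two took 0.5s or less, three took 1s or less, and all four land in the +Inf bucket.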

SUMMARY

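Like a histogram, it samples observations, but it calculates quantiles on the client side over a sliding time window. The details are too tricky to explain here.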

Read this if you still want to know.

How to set up Prometheus

Install guide: Prometheus

Follow this 5-step guide to get Prometheus up and running on your cluster.

Now, let’s boost your Prometheus!

You will increase your ROI from Prometheus with these tactics:

TACTIC #1 Visualise metrics

Prometheus can pull all sorts of metrics but doesn’t display them very well. Grafana can. It visualises grouped metrics on a dashboard.

Should your K8s effort use this tactic? Yes, every project will benefit, some a little and some a lot.

What’s the big deal? Go from endless lines of data to clean visual guidance

Without Grafana – Prometheus displays raw metrics that can run to thousands of lines
With Grafana – metrics are now visual so you can pinpoint issues fast

Official Grafana setup guide

Follow this guide to get Grafana up and running alongside your Prometheus installation.

TACTIC #2 Query better metrics

Prometheus is great, but it needs the right queries to return useful metrics. Knowing the ins and outs of PromQL – its query language – can help with that.

Should your K8s effort use this tactic? Yes, if you have SLOs or are concerned about unknown performance issues.

What’s the big deal? Know how to query the right metrics to meet specific goals

PromQL for humans

This handy compilation of PromQL queries covers real-world issues. For example, one query will help you uncover instances with more than x% HTTP errors.
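As a rough idea of what such a query can look like, the sketch below builds a PromQL expression for instances with more than x% HTTP errors and the URL for the Prometheus HTTP API's instant-query endpoint. The metric name http_requests_total, its status label, the 5-minute window and the server address are assumptions, so swap in whatever your apps actually expose.

```python
import json
import urllib.parse
import urllib.request


def build_error_ratio_query(threshold_percent: float) -> str:
    """PromQL for instances whose share of 5xx responses exceeds the threshold.

    Assumes a counter named http_requests_total with a "status" label.
    """
    errors = 'sum by (instance) (rate(http_requests_total{status=~"5.."}[5m]))'
    total = "sum by (instance) (rate(http_requests_total[5m]))"
    return f"{errors} / {total} > {threshold_percent / 100}"


def build_query_url(base_url: str, promql: str) -> str:
    """URL for an instant query against /api/v1/query."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def run_query(base_url: str, promql: str) -> list:
    """Send the query and return the list of matching series."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=10) as resp:
        payload = json.load(resp)
    return payload["data"]["result"]


query = build_error_ratio_query(5)
# "http://localhost:9090" is a placeholder address for your Prometheus server.
url = build_query_url("http://localhost:9090", query)
```

Calling run_query("http://localhost:9090", query) against a live server should return one entry per instance over the 5% threshold, or an empty list if everything is healthy.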

TACTIC #3 Get issue alerts faster

Prometheus can help uncover important alert-level metrics. But Prometheus doesn’t notify you by itself: its alerting rules hand alerts to Alertmanager, which routes them onward. So you need to connect it to an alerting tool like PagerDuty, which will send an SMS or email alert.

Should your K8s effort use this tactic? Yes, if your Kubernetes cluster is running mission-critical applications or you have an SLO.

What’s the big deal? App went down and you didn’t know about it? This will fix that problem.

Connect Prometheus to PagerDuty

Follow this comprehensive guide for connecting your Prometheus metrics to PagerDuty’s alerting system.
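For a feel of what ends up at PagerDuty, here is a hedged sketch of a "trigger" event for the PagerDuty Events API v2. In a typical setup Alertmanager sends this for you, so treat the code as an illustration of the event rather than the integration itself. The routing key, summary and source values are placeholders.

```python
import json
import urllib.request

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key: str, summary: str, source: str,
                        severity: str = "critical") -> dict:
    """Build the JSON body for a 'trigger' event."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity}")
    return {
        "routing_key": routing_key,  # integration key of the PagerDuty service
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,
        },
    }


def send_event(event: dict) -> int:
    """POST the event to PagerDuty and return the HTTP status code."""
    request = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return resp.status


# Placeholder values; a real routing key comes from your PagerDuty service.
event = build_trigger_event(
    routing_key="YOUR_INTEGRATION_KEY",
    summary="High HTTP error ratio on checkout",
    source="prometheus",
)
```

Passing the event to send_event() with a real routing key would open an incident, and PagerDuty then handles the SMS or email.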

TACTIC #4 Handle heavy metrics loads

Cortex is a separate open-source project built to work with Prometheus. It receives the metrics that many independent Prometheus servers send to it, then stores and queries them in one place. It’s designed for advanced users running complex workloads.

Should your K8s effort use this tactic? Only if you run multiple clusters and need long-term metrics storage.

What’s the big deal? Faster PromQL queries and longer-term storage for large-scale K8s

Explore Cortex

You can review the Cortex documentation to see if it will suit your more complex Kubernetes deployment.
