3 ways K8s can help you do better scaling

Kubernetes does away with manual scaling

I remember the old days of manually scaling instances on-prem. Those were different times.

Scaling out meant by-hand installs on more physical machines. Scaling up was no better – migrate to a machine with more juice, or stick more RAM in the one you had!

Kubernetes does away with all that by taking full advantage of virtual machines. Instead of you adding capacity by hand, it adjusts capacity for you through autoscaling.

Here are 3 ways to scale within your cluster:

Horizontal Pod Autoscaler

Vertical Pod Autoscaler

Cluster Autoscaler

Let’s cover each of these in detail:

Horizontal Pod Autoscaler (HPA)

Mechanism: Increases or decreases the number of pod replicas serving the application (see the example manifest after the issues list below)

Benefits:
– Responsive to usage patterns reported by observability metrics
– Reduces resource wastage by scaling back down once a traffic spike subsides

Issues:
– Does not scale DaemonSets (e.g. a Jaeger tracing agent), which run one pod per node
– The lag between detecting the need and new pods actually being ready may not suit sudden, latency-sensitive spikes
– Risk of unnecessary scale-up on bot traffic if you don’t have controls against it!
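
Here’s a minimal sketch of an HPA manifest. The Deployment name (web) and the numbers are illustrative assumptions, not anything from a specific setup:

```yaml
# Hypothetical HPA targeting a Deployment named "web"; names and numbers are examples only.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2            # never drop below 2 pods
  maxReplicas: 10           # cap scale-up, e.g. to limit bot-driven spikes
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add pods when average CPU crosses 70%
```

The maxReplicas cap is one simple guard against the bot-traffic issue above: even if traffic looks legitimate to the metric, the fleet can’t grow without bound.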

Vertical Pod Autoscaler (VPA)

Mechanism: Adjusts the CPU and memory requests (and limits) assigned to a pod’s containers (see the example manifest after the issues list below)

Benefits:
– Automatically recommends (and can apply) values for CPU and memory requests and limits
– Cuts the headache of having to work out those values yourself
– Reduces the risk of accidentally starving a pod of resources

Issues:
– Doesn’t work well with Java, because the JVM gives a limited view of actual memory usage
– Might not play nicely with HPA if both scale on the same CPU/memory metrics at the same time
– Can push pods past the namespace’s resource quota if you don’t set LimitRange objects
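
A minimal VPA sketch might look like the following. It assumes the same hypothetical "web" Deployment and requires the VPA components (from the kubernetes/autoscaler project) to be installed, since VPA isn’t built into vanilla Kubernetes:

```yaml
# Hypothetical VPA for the "web" Deployment; all names and bounds are illustrative.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  updatePolicy:
    updateMode: "Off"        # recommendation-only: suggest values, don't evict pods
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:           # caps help avoid blowing through resource quotas
          cpu: "1"
          memory: 1Gi
```

Starting in "Off" mode is a low-risk way to get the recommendations without letting VPA restart pods; you can switch to "Auto" once you trust the numbers.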

Cluster Autoscaler (CA)

Mechanism: Adds or removes nodes based on the number of pods stuck in Pending because they can’t be scheduled.

Benefits:
– Helps make better use of idle nodes within your cluster
– Consolidates running pods onto a smaller number of nodes so spare nodes can be removed
– Creates nodes for pods that can’t be scheduled because of resource constraints on existing nodes
– Ensures running pods are moved safely onto other nodes by respecting PodDisruptionBudgets (example below)

Issues:
– Scale-up can stall or fail if you ask it to reach a very high replica count
– Pods might still not fit on existing or newly created nodes
– Works best (and sometimes only) with homogeneous node groups, rather than groups that mix instance types the way AWS managed node groups can
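
Since CA drains nodes during scale-down, a PodDisruptionBudget is what keeps that drain safe. Here’s a sketch for the same hypothetical "web" pods (label and threshold are assumptions):

```yaml
# Hypothetical PodDisruptionBudget for pods labelled app: web.
# Cluster Autoscaler respects this when it drains a node during scale-down.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2            # never evict below 2 ready pods at any moment
  selector:
    matchLabels:
      app: web
```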

Can you do all 3 types of scaling at once?

You can, but if you decide to try it, keep these compatibility issues in mind:

  • As mentioned earlier, VPA might have issues if HPA sets its targets on resource metrics like CPU and memory usage.
  • If you want to run VPA and HPA at the same time, make sure HPA’s scaling targets are based on custom or external metrics, as in the sketch after this list.
  • CA doesn’t share this incompatibility with the other two autoscalers, because it works at the node level rather than the pod level that HPA and VPA operate at.
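
To make that second point concrete, here’s a hedged sketch of an HPA that scales on an external metric instead of CPU/memory, leaving resource requests to VPA. The metric name and the adapter serving it (e.g. Prometheus Adapter) are assumptions about your setup:

```yaml
# Hypothetical HPA scaling the "web" Deployment on requests per second,
# so it doesn't fight with VPA over CPU/memory.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa-external
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: http_requests_per_second   # must be exposed by your external metrics adapter
        target:
          type: AverageValue
          averageValue: "100"              # aim for roughly 100 req/s per pod
```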

Horizontal Pod Autoscaler Icon made by Flat Icons from www.flaticon.com

Cluster Autoscaler Icon made by Freepik from www.flaticon.com
