Kubernetes does away with manual scaling
I remember the old days of manually scaling instances on-prem. Those were different times.
Scaling instances meant by-hand installs on more physical machines. Same for increasing machine resources – transfer to a machine with more juice. Or stick more RAM in!
Kubernetes does away with all that by taking full advantage of virtual machines. It employs the concept of autoscaling.
Here are 3 ways to scale within your cluster:
Horizontal Pod Autoscaler
Vertical Pod Autoscaler
Cluster Autoscaler
Let’s cover each of these in detail:
Horizontal Pod Autoscaler (HPA)
Mechanism: Adds or removes pod replicas serving the application, based on observed load
Pros:
– Responds to usage patterns captured by observability metrics
– Reduces resource wastage by scaling down once traffic spikes stop
Cons:
– Cannot scale DaemonSets, like the Jaeger tracing agent, because they run one pod per node
– The lag between detecting the need and the new pods being ready might not suit some workloads
– Risk of unnecessary scale-up on bot traffic if you don’t have controls against that!
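To make the HPA mechanism concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric). The function name, the replica bounds, and the sample numbers are my own illustrative assumptions, and the 0.1 tolerance mirrors the documented default. The real controller also accounts for pod readiness, missing metrics, and stabilization windows, none of which is modelled here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA replica calculation (a sketch, not the real controller)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric
    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Multiply before dividing to keep the arithmetic exact for whole numbers.
    wanted = math.ceil(current_replicas * current_metric / target_metric)

    # Never go outside the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against a 60% target: scale up.
    print(desired_replicas(4, 90, 60))
    # 6 pods averaging 20% CPU against a 60% target: scale down.
    print(desired_replicas(6, 20, 60))
```

The clamp at the end is also where the bot-traffic risk gets bounded: a sensible `max_replicas` caps how far a bogus spike can push the replica count.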
Vertical Pod Autoscaler (VPA)
Mechanism: Adjusts the CPU and memory requests of a pod’s containers
Pros:
– Automatically suggests values for CPU and memory requests, and can scale limits to match
– Cuts the headache of having to work out those values yourself
– Reduces the risk of accidentally starving a pod of resources
Cons:
– Doesn’t work well with Java because the JVM gives a limited view into actual memory usage
– Might not play nice with HPA if both run at the same time
– Possible to exceed resource quota if you don’t set LimitRange objects
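For a feel of what a VPA object looks like, here is a rough Python sketch that builds a VerticalPodAutoscaler manifest as a plain dict and prints it as JSON. It assumes the VPA custom resource (`autoscaling.k8s.io/v1`) is installed in the cluster. The deployment name `web`, the helper name `vpa_manifest`, and the min/max values are hypothetical placeholders. `updateMode: "Off"` keeps VPA in suggestion-only mode, and the `minAllowed`/`maxAllowed` bounds are one way to keep recommendations from drifting past what your quota allows.

```python
import json


def vpa_manifest(deployment, min_cpu="100m", max_cpu="1",
                 min_mem="128Mi", max_mem="1Gi"):
    """Build a VPA manifest for a Deployment (illustrative values only)."""
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": f"{deployment}-vpa"},
        "spec": {
            # The workload whose pods VPA should watch.
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            # "Off" means: compute recommendations, but don't touch the pods.
            "updatePolicy": {"updateMode": "Off"},
            # Bounds on what VPA is allowed to recommend per container.
            "resourcePolicy": {
                "containerPolicies": [{
                    "containerName": "*",
                    "minAllowed": {"cpu": min_cpu, "memory": min_mem},
                    "maxAllowed": {"cpu": max_cpu, "memory": max_mem},
                }]
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(vpa_manifest("web"), indent=2))
```

Since kubectl accepts JSON as well as YAML, the printed output could be saved to a file and applied with `kubectl apply -f`.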
Cluster Autoscaler (CA)
Mechanism: Adds nodes when pods are stuck pending and removes nodes that are underused
Pros:
– Helps make better use of idle nodes within your cluster
– Consolidates running pods onto a smaller number of nodes
– Creates nodes for pods that can’t be scheduled because of resource constraints on existing nodes
– Respects PodDisruptionBudgets when moving running pods onto another node
Cons:
– Risk of failure/stalling if you set a high replica count to scale up to
– Pods might sometimes not fit in existing or newly created nodes
– Works best (and maybe only) with homogeneous node groups, not ones that mix instance types, as AWS managed node groups can
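To illustrate the scale-up side, here is a toy Python sketch that estimates how many new nodes a batch of pending pods would need. It is a deliberate simplification: the real Cluster Autoscaler simulates the scheduler against each node group, while this uses plain first-fit-decreasing bin packing on CPU and memory only. The `Pod` class, the function name, and all the sizes are hypothetical. It assumes every new node has the same capacity, which is the homogeneous case the autoscaler handles best.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    cpu_m: int    # CPU request in millicores
    mem_mi: int   # memory request in MiB


def nodes_needed(pending, node_cpu_m, node_mem_mi):
    """Return (new node count, names of pods too big for any node)."""
    free = []    # remaining [cpu, mem] on each node we decide to add
    unfit = []   # pods larger than a whole empty node

    # Place the biggest pods first (first-fit decreasing).
    for pod in sorted(pending, key=lambda p: (p.cpu_m, p.mem_mi), reverse=True):
        if pod.cpu_m > node_cpu_m or pod.mem_mi > node_mem_mi:
            unfit.append(pod.name)
            continue
        for slot in free:
            if slot[0] >= pod.cpu_m and slot[1] >= pod.mem_mi:
                slot[0] -= pod.cpu_m
                slot[1] -= pod.mem_mi
                break
        else:
            # No existing new node has room, so add another one.
            free.append([node_cpu_m - pod.cpu_m, node_mem_mi - pod.mem_mi])

    return len(free), unfit


if __name__ == "__main__":
    pending = [
        Pod("a", 1500, 2048),
        Pod("b", 1000, 1024),
        Pod("c", 500, 512),
        Pod("d", 5000, 1024),  # bigger than any node in this example
    ]
    print(nodes_needed(pending, node_cpu_m=2000, node_mem_mi=4096))
```

Pod `d` shows the "might not fit" case: no amount of extra nodes of this size will ever schedule it.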
Can you do all 3 types of scaling at once?
You can, but if you decide to try this out, keep these compatibility issues in mind:
- I mentioned earlier that VPA might not play nice with HPA. The conflict arises when HPA sets its targets on resource metrics like CPU and memory usage, the same signals VPA acts on.
- If you want to run both VPA and HPA at the same time, make sure HPA’s scaling targets are set based on custom or external metrics.
- CA doesn’t share this incompatibility issue with the other 2 autoscalers. That’s because it works at the node level rather than the pod level that HPA/VPA run on.
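As a rough illustration of the HPA/VPA rule, the Python sketch below checks whether an HPA scales on CPU or memory while a VPA targets the same workload. The manifests are plain dicts shaped like `autoscaling/v2` HorizontalPodAutoscaler and `autoscaling.k8s.io/v1` VerticalPodAutoscaler objects. The helper names, the `web` deployment, and the `requests_per_second` metric are hypothetical, and the custom metric assumes you already have a metrics adapter serving it.

```python
def hpa_manifest(name, target, metrics, min_replicas=2, max_replicas=10):
    """Build an autoscaling/v2 HPA manifest as a dict (illustrative values)."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": target},
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": metrics,
        },
    }


def conflicting_metrics(hpa, vpa):
    """Return the CPU/memory metrics an HPA uses on a workload a VPA also targets."""
    hpa_ref = hpa["spec"]["scaleTargetRef"]
    vpa_ref = vpa["spec"]["targetRef"]
    if (hpa_ref["kind"], hpa_ref["name"]) != (vpa_ref["kind"], vpa_ref["name"]):
        return []  # different workloads, nothing to clash over
    return [
        m["resource"]["name"]
        for m in hpa["spec"].get("metrics", [])
        if m.get("type") == "Resource" and m["resource"]["name"] in ("cpu", "memory")
    ]


VPA = {"spec": {"targetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "web"}}}

# Scales on CPU utilization: clashes with a VPA on the same Deployment.
HPA_ON_CPU = hpa_manifest("web-cpu", "web", [
    {"type": "Resource",
     "resource": {"name": "cpu", "target": {"type": "Utilization", "averageUtilization": 70}}},
])

# Scales on a per-pod custom metric instead: no overlap with VPA.
HPA_ON_CUSTOM = hpa_manifest("web-rps", "web", [
    {"type": "Pods",
     "pods": {"metric": {"name": "requests_per_second"},
              "target": {"type": "AverageValue", "averageValue": "100"}}},
])

if __name__ == "__main__":
    print(conflicting_metrics(HPA_ON_CPU, VPA))
    print(conflicting_metrics(HPA_ON_CUSTOM, VPA))
```

A check like this could run in CI against your manifests before anything reaches the cluster.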