Where this fits in K8s strategy
Stop unresponsive OPA webhooks from sending managed K8s services into a false loop of replacing nodes
Why it’s important
The constant loop of trying to replace nodes can cause clusters to crash. We don’t want that, do we?
Open Policy Agent (OPA) is a useful admission controller for Kubernetes. Without it standing at the door as a bouncer, a lot of nasty things can enter your cluster.
But it needs to be configured properly to do the job. It can trigger all kinds of issues otherwise. After all, webhooks are a single point of failure.
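To make "configured properly" a bit more concrete, here is a minimal sketch of a check you could run over one entry of the `webhooks` list in a ValidatingWebhookConfiguration, loaded as a plain dictionary. The helper name `risky_webhook_settings` and the 5-second threshold are assumptions made for illustration; they don't come from OPA or from the story. The defaults it falls back on (`failurePolicy: Fail`, `timeoutSeconds: 10`) are the ones used by `admissionregistration.k8s.io/v1`.

```python
# Hypothetical helper (not part of OPA or Kubernetes): flags webhook settings
# that can let an unresponsive webhook block API requests.

def risky_webhook_settings(webhook, max_timeout_seconds=5):
    """Return warnings for one `webhooks` entry given as a plain dict.

    `max_timeout_seconds` is an illustrative threshold, not an official limit.
    """
    warnings = []

    # In admissionregistration.k8s.io/v1, failurePolicy defaults to Fail.
    if webhook.get("failurePolicy", "Fail") == "Fail":
        warnings.append(
            "failurePolicy is Fail: matching requests are rejected "
            "while the webhook is unreachable"
        )

    # In admissionregistration.k8s.io/v1, timeoutSeconds defaults to 10.
    timeout = webhook.get("timeoutSeconds", 10)
    if timeout > max_timeout_seconds:
        warnings.append(
            f"timeoutSeconds is {timeout}: every matching request can "
            "wait that long on a slow webhook"
        )

    # Without a namespaceSelector, the webhook applies to every namespace
    # its rules match, including system namespaces.
    if not webhook.get("namespaceSelector"):
        warnings.append(
            "no namespaceSelector: system namespaces are not excluded"
        )

    return warnings


# Example entry with made-up values, for illustration only.
example = {
    "name": "validating-webhook.example.org",
    "failurePolicy": "Fail",
    "timeoutSeconds": 30,
}

for warning in risky_webhook_settings(example):
    print(warning)
```

A warning here is a prompt to look closer, not a verdict. Switching `failurePolicy` to `Ignore`, for example, keeps the API server responsive when OPA is down, but it also lets requests through unchecked, so the right choice depends on what the webhook is guarding.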
This story highlights a problem that hit a multi-tenant GKE cluster using OPA.
Call me Captain Obvious, but you need to validate that you’ve configured OPA correctly. Here are some methods from trusted sources that you can follow: