Seven years.
Dozens of clusters.
One playbook.
I'm Koti Vellanki, a Sr. DevOps Engineer based in India. I've been on call for seven years, running Kubernetes across AWS, Azure, and GCP. This site is the field manual I wish I'd had when I started.
I've managed more than a hundred Kubernetes clusters in production. I've been the person at 2AM with a pager, staring at kubectl describe output, trying to figure out why checkout is down and the release is blocked.
After years of running twenty commands to understand one problem — mentally mapping service dependencies, guessing at blast radius, praying the scheduler would change its mind — I started building tools to make the invisible visible. The posts on this site are a byproduct of that work.
Every post comes from a real incident. No theory. No "according to the docs." Just what actually happened, what I tried, what failed, and what finally worked.
"Pending" doesn't mean broken. It means the scheduler is looking at every node and saying nope, nope, nope.
What I build
Kubilitics — a Kubernetes operating system I've been working on for the last two years. Cluster health scores, topology visualization, risk ranking, blast-radius simulation. It's the dashboard I wanted at 2AM.
troubleshoot-kubernetes-like-a-pro — an open-source repo with 35 real K8s failure scenarios. Deploy a broken config, debug it, apply the fix. The lab behind every post.
- 2018: Started in infrastructure. VMs, bare metal, scripts that shouldn't have worked.
- 2020: First production Kubernetes cluster. First 3AM page. First "it's always DNS."
- 2022: Multi-cluster, multi-region, multi-cloud. AWS EKS, Azure AKS, GCP GKE.
- 2024: Dozens of clusters across enterprises. Started writing the playbook in my head.
- 2025: Began building Kubilitics, the dashboard I wished I had at 2AM.
- 2026: Started "Mastering Kubernetes the Right Way": 35 runbook entries from the trenches.