2:14 AM. The pager said "payments-api RESTARTS=47." I rolled over, opened my laptop, and watched a pod get born and killed in perfect rhythm. Every 30 seconds: Created, Running, Terminated, Created. kubectl apply from the deploy pipeline had come back green an hour ago. The pod was, technically, created. It just kept getting killed a second later. The RESTARTS counter was at 51 by the time I finished typing kubectl describe. I had seen this shape before. A limit set by somebody who never ran the workload under real traffic, now meeting real traffic at 2 in the morning.
The scenario
The pod was never created. The API server rejected it.
A namespace has a ResourceQuota capping limits.cpu at 4. A new pod requests limits.cpu: 8 — twice the namespace ceiling. The API server's quota admission plugin catches this before the object reaches etcd. The pod never exists. The error appears in kubectl output, not in pod events.
The pod never reaches the scheduler — look at kubectl output, not pod events
When ResourceQuota blocks a create, the object is never written to etcd. There is no pod to describe, no events to read. The error lives in the exit code and stderr of kubectl apply. Check pipeline logs or run the apply again; the message is immediate and explicit.
ResourceQuota is enforced at the API server, before scheduling
The quota admission plugin runs as part of the API server admission chain, after authentication and authorization but before the object persists. Check the current quota state with kubectl describe resourcequota -n default to see how much of each resource is used and what the ceiling is. The limits.cpu field counts the sum of all container limits in the namespace.
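The comparison the quota check makes comes down to simple arithmetic. Here is a minimal sketch of it, assuming CPU values are expressed in millicores and nothing else in the namespace is consuming quota yet. The function name quota_allows is hypothetical, and this is only the same comparison, not the API server's implementation:

```shell
#!/usr/bin/env bash
# quota_allows USED HARD REQUEST   (all values in millicores; hypothetical helper)
# Succeeds when the new pod's limits.cpu still fits under the quota ceiling.
quota_allows() {
  local used=$1 hard=$2 request=$3
  [ $((used + request)) -le "$hard" ]
}

# The scenario above: ceiling of 4 CPUs, new pod asks for 8.
if quota_allows 0 4000 8000; then
  echo "admitted"
else
  echo "rejected: exceeds limits.cpu quota"
fi
```

Plug in the used and hard values reported by kubectl describe resourcequota to see how much room a new pod actually has.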
Either raise the quota or lower the pod's request
Two paths: patch the ResourceQuota object to raise the ceiling (kubectl edit resourcequota -n default), or reduce the pod's limits.cpu to fit within what remains. The right answer depends on whether the namespace quota is protecting other tenants from resource exhaustion — do not blindly raise it without checking who else shares the namespace.
This one lives in the troubleshooting repo. Clone it, apply the broken manifest, and you can reproduce the exact loop I was staring at.
git clone https://github.com/vellankikoti/troubleshoot-kubernetes-like-a-pro.git
cd troubleshoot-kubernetes-like-a-pro/scenarios/failed-resource-limits
ls
You will see issue.yaml, fix.yaml, and a short description.md. The issue manifest runs polinux/stress asking for 64M of memory under a 32Mi limit. Twice the headroom it is allowed. Perfect CrashLoop material.
Reproduce the issue
kubectl apply -f issue.yaml
# pod/failed-resource-limits-pod created
kubectl get pods -w
Wait sixty seconds and the RESTARTS column starts climbing like a stopwatch.
NAME                         READY   STATUS             RESTARTS      AGE
failed-resource-limits-pod   0/1     CrashLoopBackOff   5 (12s ago)   2m
Five restarts in two minutes. The pod is not flaky. It is a dead machine being revived over and over.
Debug the hard way
kubectl describe pod failed-resource-limits-pod
Buried in the output:
Last State:  Terminated
  Reason:    OOMKilled
  Exit Code: 137
Limits:
  memory: 32Mi
Command: stress --vm 1 --vm-bytes 64M
Two fields, one answer. The limit is 32Mi. The workload wants 64M.
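The two numbers use different suffixes, so it helps to put them in the same unit before comparing. This is a rough sketch that converts Kubernetes-style quantities to bytes, where Mi means 2^20 and M means 10^6. The helper name to_bytes is hypothetical, it handles only a handful of suffixes, and how stress itself interprets its M suffix is not covered here; even under the smaller decimal reading, the request is nearly double the limit.

```shell
#!/usr/bin/env bash
# to_bytes QUANTITY   (hypothetical helper; covers only Ki/Mi/Gi and k/M/G)
# Prints the quantity in bytes using Kubernetes suffix conventions.
to_bytes() {
  local q=$1
  case $q in
    *Ki) echo $(( ${q%Ki} * 1024 )) ;;
    *Mi) echo $(( ${q%Mi} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${q%Gi} * 1024 * 1024 * 1024 )) ;;
    *k)  echo $(( ${q%k} * 1000 )) ;;
    *M)  echo $(( ${q%M} * 1000 * 1000 )) ;;
    *G)  echo $(( ${q%G} * 1000 * 1000 * 1000 )) ;;
    *)   echo "$q" ;;   # plain number: already bytes
  esac
}

limit=$(to_bytes 32Mi)
want=$(to_bytes 64M)
if [ "$want" -gt "$limit" ]; then
  echo "workload wants $want bytes, limit is $limit bytes"
fi
```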
kubectl logs failed-resource-limits-pod --previous
stress: info: [1] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
stress: FAIL: [1] (415) <-- worker 7 got signal 9
stress: FAIL: [1] (451) failed run completed in 0s
Signal 9 is the kernel saying "I killed this on purpose." No application bug. No race condition. Just cgroups doing their job.
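Exit code 137 and signal 9 are the same fact reported two ways: by shell convention, a process killed by a signal reports 128 plus the signal number. A small sketch of that arithmetic, using a hypothetical helper named exit_code_to_signal:

```shell
#!/usr/bin/env bash
# exit_code_to_signal CODE   (hypothetical helper)
# Prints the signal number for exit codes above 128, or nothing otherwise.
exit_code_to_signal() {
  local code=$1
  if [ "$code" -gt 128 ]; then
    echo $((code - 128))
  fi
}

sig=$(exit_code_to_signal 137)
# kill -l translates a signal number into its name.
echo "exit 137 -> signal $sig ($(kill -l "$sig"))"
```

The same trick decodes 143 as signal 15 (SIGTERM), which usually points to a graceful shutdown rather than an OOM kill.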
kubectl get pod failed-resource-limits-pod -o jsonpath='{.status.containerStatuses[0].restartCount}{"\n"}'
# 11
Why this happens
Memory limits in Kubernetes are not soft targets. They are hard ceilings enforced by the Linux kernel through cgroups. When your container tries to allocate past its limit, the kernel does not send a warning or a graceful shutdown. It fires the OOM killer, the process dies with exit code 137, and the kubelet dutifully restarts the container because restartPolicy defaults to Always. That loop runs forever. The backoff between restarts caps at five minutes, so you get one more kill roughly every five minutes until a human notices.
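To make the backoff concrete, here is a sketch of the delay schedule, assuming the commonly documented kubelet behavior of a 10 second initial delay that doubles after each crash and is capped at 300 seconds. The function name backoff_delays is hypothetical, and the real kubelet timing will not match these numbers to the second:

```shell
#!/usr/bin/env bash
# backoff_delays COUNT   (hypothetical helper)
# Prints the delay in seconds before each of the first COUNT restarts,
# assuming a 10s initial delay that doubles and is capped at 300s.
backoff_delays() {
  local count=$1 delay=10 cap=300 i=0 out=""
  while [ "$i" -lt "$count" ]; do
    out="$out${out:+ }$delay"
    delay=$((delay * 2))
    if [ "$delay" -gt "$cap" ]; then
      delay=$cap
    fi
    i=$((i + 1))
  done
  echo "$out"
}

backoff_delays 8
```

Under those assumptions the cap is reached by the sixth restart, which is why the RESTARTS counter climbs fast at first and then settles into a slow, steady tick.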
The mental model I wish somebody had drawn for me in year one: a pod with a limit below its actual memory need is not a bug. It is a permanent kill switch. CrashLoopBackOff is not a transient state here, it is the steady state. No amount of patience or retries will fix it because nothing about the workload is going to change between attempts.
The lesson from the field is that limits are a contract with the kernel, not a guideline for the scheduler. Write the contract wrong and the kernel enforces it exactly.
The fix
kubectl delete -f issue.yaml
kubectl apply -f fix.yaml
The diff is two lines:
command: ["stress", "--vm", "1", "--vm-bytes", "50M"]
resources:
  limits:
    memory: "256Mi"
256Mi for a 50M workload. That looks wasteful until you remember that memory pages are free, outages are not.
kubectl get pod failed-resource-limits-fixed-pod
# failed-resource-limits-fixed-pod   1/1   Running   0   1m
Zero restarts. Steady state.
The lesson
- CrashLoopBackOff plus OOMKilled equals a memory limit below real usage. It will not self-heal. Stop waiting.
- The RESTARTS counter is the most honest metric in Kubernetes. A climbing number means something is fundamentally wrong, not transiently wrong.
- Set memory limits to peak observed usage times 1.5, minimum. Headroom is the cheapest insurance you can buy.
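The 1.5x rule from the last bullet is easy to turn into a number. A minimal sketch, assuming you already have a peak usage figure in MiB from your own monitoring; the helper name recommended_limit_mib is hypothetical and it rounds up to a whole MiB:

```shell
#!/usr/bin/env bash
# recommended_limit_mib PEAK_MIB   (hypothetical helper)
# Prints PEAK_MIB * 1.5, rounded up to a whole MiB, using integer math.
recommended_limit_mib() {
  local peak=$1
  echo $(( (peak * 3 + 1) / 2 ))
}

# A workload that peaks around 50 MiB needs a limit of at least this many MiB.
recommended_limit_mib 50
```

Treat the result as a floor, not a target. The fix manifest's 256Mi sits well above it, which is in keeping with the "minimum" in the rule.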
Day 15 of 35. Tomorrow we go one layer deeper, into the cgroup itself, where the kernel makes the decisions Kubernetes only reports.
