koti.dev
Mastering Kubernetes the Right Way · DAY 08 / 35

Why Is Your Kubernetes Pod Stuck in Pending? The Real Fix

45 minutes staring at a Pending pod, a Slack channel on fire, and one line of kubectl output that finally made sense.

Koti Vellanki · 27 Mar 2026 · 7 min read
kubernetes · debugging · scheduling

2:14 AM. The release train is halted, my manager is typing in Slack, and one pod in the api-server Deployment has been sitting in Pending for 45 minutes. Not crashing, not erroring, just Pending. I run kubectl get pods for the tenth time like the status might change out of sympathy. It does not.

The Deployment rolled out clean. The image pulled. The replica count moved from 3 to 4. And then the fourth pod just sat there, quietly refusing to exist. No logs to read, because the container never started. No events on the Deployment, because the Deployment did its job. The pod is healthy in every way except the one that matters: it is not running anywhere.

I take a breath and go talk to the scheduler. That is where Pending lives.

The scenario

DAY 8 · SCHEDULING · NODE AFFINITY

The pod exists. No node will take it.

The kube-scheduler ran its filter loop on all three nodes and rejected every one. The pod's nodeAffinity requires tier=premium — a label none of the nodes carry. The pod will stay Pending until a matching node appears or the affinity is removed.

FIGURE 8 / 35
Pod stuck Pending: nodeAffinity tier=premium matches no node in the cluster. A pod with nodeAffinity requiring tier=premium cannot be scheduled because all three nodes in the cluster carry tier=standard or tier=basic. The kube-scheduler emits a FailedScheduling event: 0/3 nodes are available, none match nodeAffinity.
1. The pod carries an unsatisfied affinity

Its spec includes nodeAffinity: tier=premium. The scheduler must find a node where that label is present before it can bind the pod. Until then the pod sits in Pending — it exists in etcd, but no kubelet knows about it.

2. No node carries the required label

All three nodes were labelled at cluster creation: two with tier=standard, one with tier=basic. The filter loop rejects every candidate. Check current labels with kubectl get nodes --show-labels.

3. The scheduler logs the predicate failure

Run kubectl describe pod api-client and scroll to Events. You will see 0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector. That single line tells you the exact predicate, so no log diving is needed.

An api-client pod with nodeAffinity tier=premium finds no matching node in a 3-node cluster.
A Pending pod labeled api-client sits outside the Kubernetes cluster. It carries a nodeAffinity selector requiring tier=premium. Inside the cluster three nodes are shown side by side: node-1 tier=standard, node-2 tier=standard, node-3 tier=basic. None match. Below the cluster the kube-scheduler reports 0/3 nodes available.
pod.spec.affinity.nodeAffinity (kubectl explain pod.spec.affinity.nodeAffinity) · Pod status phase "Pending" (pod has been created but not yet scheduled) · kind v0.22.0, Kubernetes 1.30.0 (kubectl describe pod shows "0/N nodes are available: N node(s) didn't match Pod's node affinity/selector")

This is the exact shape of the problem, pulled from Day 0's cluster and the scenarios repo. You should already have a running cluster from Day 0; if not, spin one up with kind or minikube first.

bash
git clone https://github.com/vellankikoti/troubleshoot-kubernetes-like-a-pro.git
cd troubleshoot-kubernetes-like-a-pro/scenarios/insufficient-resources
ls

You will see description.md, issue.yaml, fix.yaml, and a helper script. The interesting files are issue.yaml, which asks for more CPU and memory than a typical dev cluster has, and fix.yaml, which asks for a sane amount.

Reproduce the issue

Apply the broken manifest and watch the pod get stuck.

bash
kubectl apply -f issue.yaml
kubectl get pod insufficient-resources-pod
plaintext
NAME                         READY   STATUS    RESTARTS   AGE
insufficient-resources-pod   0/1     Pending   0          38s

Thirty-eight seconds, one minute, five minutes. The status never moves. That is the signal. A pod that is genuinely starting will flip through ContainerCreating in under a minute on a healthy cluster. A pod that stays in Pending past a minute is almost always a scheduling problem, not a runtime problem.
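If you would rather ask the API than watch the clock, the pod's PodScheduled condition carries the same verdict. This is a minimal sketch, assuming the pod name from this scenario; scheduling_reason and is_unschedulable are names I made up, not kubectl subcommands.

```bash
# Print the reason attached to the pod's PodScheduled condition.
# An unschedulable pod reports "Unschedulable"; a scheduled pod reports nothing.
scheduling_reason() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="PodScheduled")].reason}'
}

# Succeed only when the scheduler has marked the pod unschedulable.
is_unschedulable() {
  [ "$(scheduling_reason "$1")" = "Unschedulable" ]
}

# Usage (needs a cluster and the pod from issue.yaml):
# is_unschedulable insufficient-resources-pod && echo "scheduling problem, not a runtime problem"
```

A pod that is only slow to pull its image has already been bound to a node, so the reason comes back empty and the check fails.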

Debug the hard way

The first command I reach for is describe, because Pending problems live in the Events section.

bash
kubectl describe pod insufficient-resources-pod

Scroll past the spec, the volumes, the conditions, and land on Events:

plaintext
Events:
  Type     Reason            From               Message
  ----     ------            ----               -------
  Warning  FailedScheduling  default-scheduler  0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.

Read that line carefully. 0/1 nodes are available. The scheduler checked every node in the cluster and rejected every one of them. The reason is broken down per filter: one node failed the CPU filter, the same node failed the memory filter. Kubernetes is telling you exactly which predicate fired.
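On a bigger cluster that message gets long, and it helps to see one filter reason per line. Here is a small sketch in plain shell. It assumes the message keeps the shape shown above: a "0/N nodes are available: " prefix, reasons separated by commas, and an optional preemption summary at the end. split_reasons is a hypothetical helper name.

```bash
# Split a FailedScheduling message into one line per filter reason.
split_reasons() {
  msg=$1
  msg=${msg#*nodes are available: }   # drop the "0/N nodes are available: " prefix
  msg=${msg%% preemption:*}           # drop the preemption summary, if present
  msg=${msg%.}                        # drop the trailing period
  # Assumption: no reason contains a comma of its own.
  printf '%s\n' "$msg" | tr ',' '\n' | sed 's/^ //'
}

split_reasons "0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod."
# 1 Insufficient cpu
# 1 Insufficient memory
```

Each output line is a count of nodes followed by the filter they failed, which is usually all you need to pick the next command.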

Now I want to see what the pod is asking for and what the node actually has.

bash
kubectl get pod insufficient-resources-pod -o jsonpath='{.spec.containers[0].resources}'
plaintext
{"requests":{"cpu":"2","memory":"4Gi"}}

Two whole CPUs and 4 gigs of memory, for a sleep 3600. Now the node side:

bash
kubectl describe node | grep -A 5 "Allocated resources"
plaintext
Allocated resources:
  Resource  Requests     Limits
  cpu       850m (85%)   1 (100%)
  memory    512Mi (32%)  1Gi (64%)

The node has about 150 millicores of CPU headroom. The pod is asking for 2000. It does not fit. It will never fit. The scheduler is not broken, it is being honest.
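The same arithmetic as a sketch you can paste into a shell. It assumes the node's allocatable CPU is 1 core, which I am inferring from "850m (85%)" in the output above, and it only handles whole cores and millicore values such as "2" or "850m". to_millicores and cpu_fits are hypothetical helper names.

```bash
# Convert a Kubernetes CPU quantity ("2" or "850m") to millicores.
# Fractional cores such as "0.5" are deliberately not handled in this sketch.
to_millicores() {
  case $1 in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

# cpu_fits ALLOCATABLE ALREADY_REQUESTED INCOMING_REQUEST
# Succeeds when the incoming request fits in the unreserved CPU.
cpu_fits() {
  allocatable=$(to_millicores "$1")
  requested=$(to_millicores "$2")
  incoming=$(to_millicores "$3")
  [ "$incoming" -le $(( allocatable - requested )) ]
}

cpu_fits 1 850m 2    || echo "cpu: 2 does not fit in 150m of headroom"
cpu_fits 1 850m 100m && echo "cpu: 100m fits"
```

Both messages print: the first request is the one from issue.yaml, the second is the one from fix.yaml.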

Why this happens

The Kubernetes scheduler is a filter-and-score loop. For every unscheduled pod, it walks the node list, runs a set of predicates (CPU, memory, taints, affinity, volumes, ports), and throws out any node that fails any predicate. Whatever survives gets scored, and the highest score wins. If nothing survives, the pod stays Pending and the event log gets one FailedScheduling line.
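To make the loop concrete, here is a toy version in shell. It is not the kube-scheduler's code: it filters on CPU and memory only, and the scoring rule (prefer the node with the most CPU left over) is an assumption I picked for illustration. Nodes are passed as hypothetical "name:free_cpu_millicores:free_memory_Mi" strings.

```bash
# schedule NEED_CPU_M NEED_MEM_MI NODE...
# Prints the winning node name, or a FailedScheduling line and returns 1.
schedule() {
  need_cpu=$1; need_mem=$2; shift 2
  best=""; best_score=-1
  for node in "$@"; do
    name=${node%%:*}; rest=${node#*:}
    cpu=${rest%%:*};  mem=${rest#*:}
    # Filter phase: any failed check throws the node out.
    [ "$cpu" -ge "$need_cpu" ] && [ "$mem" -ge "$need_mem" ] || continue
    # Score phase (illustrative rule): more CPU left over scores higher.
    score=$(( cpu - need_cpu ))
    if [ "$score" -gt "$best_score" ]; then best=$name; best_score=$score; fi
  done
  if [ -n "$best" ]; then
    echo "$best"
  else
    echo "FailedScheduling: 0/$# nodes are available"
    return 1
  fi
}

# A 2-CPU, 4Gi pod against one small node: nothing survives the filter.
schedule 2000 4096 "node-1:150:1088" || echo "pod stays Pending"
# A small pod against two nodes: both pass, the roomier one wins.
schedule 100 64 "node-a:150:1088" "node-b:500:2048"
```

The second call prints node-b. The real scheduler runs many more filters and blends several scores, but the shape is the same: filter first, score the survivors, and emit FailedScheduling when the survivor list is empty.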

The subtle part is that the scheduler compares against requests, not against actual usage. A node with 4 CPUs that is sitting at 5% utilisation will still refuse a pod asking for 3 CPUs if other pods have already requested 2. Requests are a reservation system. The scheduler is enforcing the reservation, not the live load.
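Here is that example worked through with numbers. The variable names are mine, and the figures are the ones from the paragraph above: a 4-CPU node at 5% utilisation with 2 CPUs already requested, and a new pod asking for 3.

```bash
allocatable_m=4000   # 4 CPUs, in millicores
requested_m=2000     # sum of requests from pods already on the node
used_m=200           # live usage: 5% of 4000m
incoming_m=3000      # the new pod's CPU request

free_by_requests=$(( allocatable_m - requested_m ))   # what the scheduler counts
free_by_usage=$(( allocatable_m - used_m ))           # what a dashboard suggests

if [ "$incoming_m" -le "$free_by_requests" ]; then verdict=fits; else verdict=rejected; fi
echo "free by requests: ${free_by_requests}m, free by usage: ${free_by_usage}m, verdict: ${verdict}"
# free by requests: 2000m, free by usage: 3800m, verdict: rejected
```

The dashboard number says there is room to spare. The scheduler only looks at the first number.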

This is also why "just restart it" never works for Pending pods. Given the same pod spec and the same cluster state, the scheduler reaches the same verdict every time. Until something changes (the pod's requests shrink or the cluster's capacity grows), the answer will be the same.

The fix

DAY 8 · SCHEDULING · NODE AFFINITY · FIXED

One label. Pod scheduled.

Adding tier=premium to node-2 gives the kube-scheduler exactly one candidate that passes the nodeAffinity predicate. The pod binds immediately and transitions from Pending to Running in seconds.

FIGURE 8 / 35
Pod scheduled: node-2 relabelled tier=premium, api-client pod now Running. After running kubectl label node node-2 tier=premium --overwrite the kube-scheduler finds a single matching node. The api-client pod binds to node-2 and transitions from Pending to Running.
1. One label command unblocks the scheduler

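Running kubectl label node node-2 tier=premium --overwrite is the entire fix. The --overwrite flag is required because node-2 already carries tier=standard, and kubectl refuses to change an existing label without it. The label is written to the node object in etcd. The scheduler's next filter pass, which happens within seconds, finds node-2 as the sole qualifying candidate.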

2. Pod transitions Pending → Running

Once the scheduler binds the pod to node-2, the kubelet on that node pulls the image and starts the container. The pod's status.phase flips to Running — usually within 10 seconds on a warm image cache.

3. Nodes 1 and 3 are untouched

The other nodes keep their original labels. If the affinity required tier in (standard, premium), node-1 would qualify as well as node-2; node-3 carries tier=basic and would still be rejected. Narrowing to tier=premium is a deliberate choice: pin to it only when the workload genuinely needs that hardware tier.
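For reference, a relaxed rule of that kind is written with the In operator. The sketch below prints a pod manifest that accepts any node whose tier label is standard or premium. The pod name comes from the figure; the container name, image, and command are placeholders I chose, and render_api_client_pod is a hypothetical helper.

```bash
# Print a pod manifest whose nodeAffinity accepts tier=standard or tier=premium.
# Container name, image, and command are placeholders, not taken from the scenarios repo.
render_api_client_pod() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: api-client
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: tier
                operator: In
                values: ["standard", "premium"]
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
EOF
}

# Validate against the API server without creating anything:
# render_api_client_pod | kubectl apply --dry-run=server -f -
```

Listing more values widens the set of nodes that pass the filter; it does not express a preference between them.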

Relabelling node-2 to tier=premium lets the api-client pod schedule and run.
A Kubernetes cluster containing three nodes side by side. Node-1 shows tier=standard. Node-2 now shows tier=premium highlighted in green, with the api-client pod scheduled inside it and a green Scheduled badge. Node-3 shows tier=basic. An allowed green arrow enters the cluster pointing at node-2.
kubectl label node node-2 tier=premium --overwrite (kubectl-label(1)) · pod.status.phase transitions Pending → Running once a matching node is found · kind v0.22.0, Kubernetes 1.30.0

Two paths. Shrink the pod or grow the cluster. For this scenario the pod is a toy, so shrinking is the right call.

bash
kubectl apply -f fix.yaml
kubectl get pod insufficient-resources-fixed-pod
plaintext
NAME                               READY   STATUS    RESTARTS   AGE
insufficient-resources-fixed-pod   1/1     Running   0          6s

The diff that matters:

yaml
resources:
  requests:
    cpu: "100m"      # was "2"
    memory: "64Mi"   # was "4Gi"

Six seconds from apply to Running. The scheduler was never the bottleneck, the request numbers were.

The lesson

  1. Pending is a scheduling verdict, not an error. Read the FailedScheduling event before anything else.
  2. The scheduler compares against requests, not live usage. A quiet node can still reject a greedy pod.
  3. Every Pending pod ends in one of three fixes: shrink the request, free up capacity, or relax a constraint. Figure out which before you touch any yaml.
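The third lesson can be sketched as a lookup from the event message to a fix. This is a rough triage aid, not a complete classifier: it only knows the two message fragments quoted in this post, and suggest_fix is a hypothetical helper name.

```bash
# Map a FailedScheduling message to one of the three fixes from the lesson.
# Only the message fragments quoted in this post are recognised.
suggest_fix() {
  case $1 in
    *Insufficient*)    echo "shrink the request or free up capacity" ;;
    *"node affinity"*) echo "relax a constraint" ;;
    *)                 echo "unrecognised, read the full event" ;;
  esac
}

suggest_fix "0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory."
# shrink the request or free up capacity
suggest_fix "0/3 nodes are available: 3 node(s) didn't match Pod's node affinity."
# relax a constraint
```

Anything that falls through to the last branch is a cue to read the event yourself before changing any yaml.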

Day 8 of 35 — tomorrow, a pod anti-affinity rule that the scheduler can never satisfy, no matter how many nodes you add.
