koti.dev
Mastering Kubernetes the Right Way · DAY 05 / 35

Kubernetes Pod Running But Not Ready: Readiness Probe Failures Explained

STATUS says Running. READY says 0/1. Users see 503s. Here is where the pod is hiding.

Koti Vellanki · 24 Mar 2026 · 5 min read
Tags: kubernetes, debugging, probes

2:48 AM. Brand new microservice rollout, staging was green, production deploy finished, every pod shows Running with zero restarts. And the frontend is returning 503 no healthy upstream on every request. I stare at kubectl get pods, see Running, and waste twenty minutes looking at the wrong column. When I finally notice READY says 0/1, not 1/1, the whole story clicks. The pod is alive. The Service has silently excluded it from the endpoints list. Somebody renamed /health to /healthz in a tidy-up PR and the readiness probe is still pointed at the old path. The container is fine. The Service is hiding it.

The scenario

DAY 6 · APP · READINESS PROBE

The pod is Running. It is not Ready.

Every pod shows Running with zero restarts. The Service is empty. A failing readiness probe moves the pod to notReadyAddresses — it is alive, it is running, and the Service has quietly removed it from rotation. Traffic never arrives.

FIGURE 06 / 35: Readiness probe failure, pod-3 excluded from Service Endpoints. Service api-svc (type ClusterIP, port 80 → 8080) routes through its Endpoints object: 10.244.1.5 and 10.244.1.6 are ready, 10.244.1.7 sits in notReadyAddresses and is excluded from routing. In the cluster (production, us-east-1, v1.30), pod-1 and pod-2 are 1/1 Running and pod-3 is 0/1.
1. pod-3 is Running but not Ready

The readiness probe is failing: the HTTP GET returns a status outside 200-399 or times out. kubelet sets Ready=False on the PodCondition. The pod process keeps running. No restart, no crash. Just invisible to the Service.

2. The Endpoints object silently excludes it

The Endpoints controller watches PodConditions. A pod with Ready=False moves to notReadyAddresses. Run kubectl get endpoints api-svc -o yaml to see exactly which IPs are excluded.

3. Traffic never reaches the failing pod

kube-proxy only programs rules for ready addresses (the addresses field of the Endpoints object). From the load balancer's perspective, the pod does not exist. Users see 503 no healthy upstream once no ready pods remain. Check kubectl describe pod pod-3 and look for the probe failure event.

A failing readiness probe moves pod-3 to notReadyAddresses. The Service stops routing traffic to it while the pod keeps running.
Figure notes: 10.244.1.5 through 10.244.1.7 are in-cluster pod IPs from kind's default pod CIDR (not RFC 5737 documentation addresses) · field docs: kubectl explain pod.spec.containers.readinessProbe and kubectl explain endpoints.subsets.notReadyAddresses · kind v0.22.0, Kubernetes 1.30.0: kubectl get endpoints -o yaml shows the notReadyAddresses split
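
If you only want the split the figure describes, not the whole object, a small filter over the Endpoints YAML does the job. Treat this as a rough sketch: endpoint_ips is a hypothetical helper name, and it assumes the layout kubectl normally prints, with addresses, notReadyAddresses, and ports keys under subsets.

```shell
# endpoint_ips SECTION: print the IPs listed under one key of an Endpoints
# object read from stdin. SECTION is "addresses" (ready pods) or
# "notReadyAddresses". Hypothetical helper, not part of kubectl.
endpoint_ips() {
  awk -v want="$1" '
    # Track which key of the current subset we are inside.
    /^[- ]*addresses:/         { section = "addresses"; next }
    /^[- ]*notReadyAddresses:/ { section = "notReadyAddresses"; next }
    /^[- ]*ports:/             { section = ""; next }
    # Address entries look like "  - ip: 10.244.1.7".
    section == want && ($1 == "ip:" || ($1 == "-" && $2 == "ip:")) { print $NF }
  '
}
```

Against a live cluster you would pipe the object in: kubectl get endpoints api-svc -o yaml | endpoint_ips notReadyAddresses. Any IP that comes back belongs to a pod that is running but out of rotation.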

From my troubleshoot-kubernetes-like-a-pro repo. You are going to reproduce the exact "pod looks fine but traffic is dead" situation and learn which column actually matters.

bash
git clone https://github.com/vellankikoti/troubleshoot-kubernetes-like-a-pro.git
cd troubleshoot-kubernetes-like-a-pro/scenarios/readiness-probe-failure
ls

The folder holds description.md, issue.yaml, and fix.yaml. This assumes you have a cluster running from Day 0.

Reproduce the issue

bash
kubectl apply -f issue.yaml
kubectl get pods

Wait about fifteen seconds so the probe has time to reach its failure threshold.

plaintext
NAME                        READY   STATUS    RESTARTS   AGE
readiness-probe-issue-pod   0/1     Running   0          45s

Running. Zero restarts. Everything in the STATUS column looks healthy. The READY column is the one telling you the truth: 0/1. The container is alive, the Service sees it as unhealthy, and no traffic is reaching it.

Terminal: cd into scenarios/readiness-probe-failure, ls the folder showing description.md, fix.yaml, issue.yaml, kubectl apply -f issue.yaml, kubectl get po showing readiness-probe-issue-pod with READY 0/1, STATUS Running, RESTARTS 0, AGE 6s and then 20s
Apply issue.yaml and run kubectl get po twice, 15 seconds apart. STATUS stays Running the whole time, RESTARTS stays 0, but READY is stuck at 0/1. The STATUS column is lying by omission — the pod is alive but invisible to its Service.
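
Reading the READY column by eye is how I lost twenty minutes, so here is a minimal sketch that reads it for you. running_not_ready is a hypothetical helper name. It assumes the default single-namespace table from kubectl get pods, where NAME, READY, and STATUS are the first three columns (with -A the columns shift and this would need adjusting).

```shell
# running_not_ready: print pods whose STATUS is Running but whose READY
# count is short (for example 0/1). Reads `kubectl get pods` table output
# on stdin and skips the header row. Hypothetical helper.
running_not_ready() {
  awk 'NR > 1 && $3 == "Running" {
         split($2, ready, "/")          # "0/1" -> ready[1]=0, ready[2]=1
         if (ready[1] != ready[2]) print $1, $2
       }'
}
```

Usage against a cluster would be kubectl get pods | running_not_ready. Empty output means every Running pod is also Ready.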

Debug the hard way

Logs.

bash
kubectl logs readiness-probe-issue-pod
plaintext
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
2026/03/24 02:48:01 [notice] 1#1: start worker processes
2026/03/24 02:48:01 [notice] 1#1: start worker process 29

Nginx is happy. No errors. So describe it.

bash
kubectl describe pod readiness-probe-issue-pod
plaintext
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Events:
  Type     Reason     Age                From     Message
  ----     ------     ----               ----     -------
  Warning  Unhealthy  3s (x14 over 40s)  kubelet  Readiness probe failed: HTTP probe failed with statuscode: 404

Ready: False. Fourteen probe failures. HTTP 404. Nginx is serving, but it returns 404 for /nonexistent, the path the probe is configured to hit, and the kubelet treats any status outside 2xx/3xx as a probe failure. After three consecutive failures the pod is marked not ready, and the endpoints controller quietly yanks it out of every Service that matched its labels.

Terminal: kubectl get po showing readiness-probe-issue-pod Running but 0/1, then kubectl describe pod showing the pod spec with Readiness: http-get http://:80/nonexistent delay=1s timeout=1s period=3s #success=1 #failure=3 and Ready: False
kubectl describe — the Readiness line shows the probe is hitting /nonexistent and the Ready condition is False. This is the one block you scroll to — everything above it is noise when the pod is Running but not Ready.
Terminal: bottom half of kubectl describe showing the Conditions block with Ready: False, ContainersReady: False, and the Events table with a Warning Unhealthy event repeated 14 times over 40 seconds with message 'Readiness probe failed: HTTP probe failed with statuscode: 404'
The Conditions block confirms Ready is False. The Events table tells you why — 14 probe failures in 40 seconds, all returning HTTP 404. Nginx is serving, just not on the path your probe is asking for.
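
You can ask the same question the probe is asking, from your own terminal. The sketch below mirrors the pass/fail rule described above (200 through 399 passes, everything else fails) and uses a 1 second timeout to match the timeout=1s in the describe output. probe_verdict and check_path are hypothetical helper names, and this is an approximation of the kubelet's behaviour, not a copy of it.

```shell
# probe_verdict CODE: classify an HTTP status the way the text describes
# the HTTP readiness check: 200-399 passes, anything else fails.
probe_verdict() {
  if [ "$1" -ge 200 ] && [ "$1" -lt 400 ]; then
    echo pass
  else
    echo fail
  fi
}

# check_path URL: fetch URL with a 1s timeout and print status and verdict.
# curl reports 000 when it cannot connect or times out, which counts as fail.
check_path() {
  code=$(curl -s -o /dev/null -m 1 -w '%{http_code}' "$1")
  echo "$1 -> $code ($(probe_verdict "$code"))"
}
```

One way to try it, assuming local port 8080 is free: run kubectl port-forward pod/readiness-probe-issue-pod 8080:80 in a second terminal, then compare check_path http://localhost:8080/nonexistent with check_path http://localhost:8080/.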

Why this happens

A readiness probe is a contract between the pod and the Service. When the probe passes, the endpoints controller keeps the pod's IP in the Service's endpoints list. When the probe fails, the pod's IP is silently removed. The container keeps running. The Service stops routing to it. There is no restart, no crash, no error in the pod phase, just a quiet exclusion.

This is why Running is not the same as Ready. Running means the container process is alive. Ready means the kubelet has decided the container can serve traffic. Two different questions, two different columns, and the STATUS column is the one that lies by omission.
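
Since Ready is a condition and not a phase, it helps to pull that one value out on its own. A minimal sketch, assuming the Conditions block layout shown in the describe output earlier. ready_condition is a hypothetical helper name.

```shell
# ready_condition: read `kubectl describe pod` output on stdin and print the
# status of the Ready condition (True or False) from the Conditions block.
# Hypothetical helper.
ready_condition() {
  awk '
    $1 == "Conditions:"            { in_conditions = 1; next }
    in_conditions && $1 == "Ready" { print $2; exit }
    # Any later unindented heading (for example "Events:") ends the block.
    in_conditions && /^[^ ]/       { in_conditions = 0 }
  '
}
```

Usage would be kubectl describe pod readiness-probe-issue-pod | ready_condition. A pod can print False here while STATUS still says Running, which is exactly the gap this post is about.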

The fix

bash
kubectl apply -f fix.yaml
kubectl get pods

The only change is the probe path. Broken:

yaml
readinessProbe:
  httpGet:
    path: /nonexistent
    port: 80

Fixed:

yaml
readinessProbe:
  httpGet:
    path: /
    port: 80

Nginx serves / with a 200, the probe passes, the pod joins the Service endpoints within seconds.

plaintext
NAME                        READY   STATUS    RESTARTS   AGE
readiness-probe-fixed-pod   1/1     Running   0          12s

1/1. That is the column you actually care about.

Terminal: cat issue.yaml showing the readinessProbe with httpGet path /nonexistent and port 80, then cat fix.yaml showing the same probe but with path / and port 80
The only change to the probe is the path. /nonexistent returns 404, / returns 200, and the probe flips from failing to passing. The container never changed, only the contract the probe was checking.
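
Before blaming the app, it is worth printing the path the manifest actually asks for. This is a rough sketch, not a YAML parser: readiness_path is a hypothetical helper name, and it assumes the readiness probe is an httpGet probe written in the block style shown above.

```shell
# readiness_path: print the httpGet path of the first readinessProbe found
# in a pod manifest read from stdin. Hypothetical helper; it matches keys
# line by line and does not parse YAML properly.
readiness_path() {
  awk '
    $1 == "readinessProbe:"   { in_probe = 1; next }
    in_probe && $1 == "path:" { print $2; exit }
  '
}
```

Run it as readiness_path < issue.yaml and readiness_path < fix.yaml and compare the two results with the routes your container serves.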

The easiest way — with Kubilitics

The Pods view separates readiness from STATUS and shows the probe state directly, so a pod stuck Running but NotReady is badged as NotReady instead of hiding under a green-sounding Running label.

Kubilitics Pods view showing 2 pods total, 1 running, 0 failed. readiness-probe-fixed-pod with a green Running badge and READY 1/1, readiness-probe-issue-pod with a NotReady badge
Kubilitics Pods view after the fix. NotReady is its own status, not hidden behind Running. The fixed pod is green and 1/1, the original is flagged NotReady — you see the truth in a single scan of the table.

The lesson

  1. Running is not Ready. Always read the READY column. 0/1 Running is a pod that is alive and invisible to its Service.
  2. kubectl get endpoints is the fastest confirmation. If the endpoint list is empty and your pods say Running, you almost certainly have a probe or label selector problem.
  3. The probe path is a contract your app owns. Write /healthz on purpose, serve a real 200, and keep the probe aligned with the route.
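
Lesson 2 is easy to script. A minimal sketch, assuming the default table from kubectl get endpoints, where an empty list prints as <none> in the ENDPOINTS column. services_without_endpoints is a hypothetical helper name.

```shell
# services_without_endpoints: list Endpoints objects with no ready
# addresses. Reads `kubectl get endpoints` table output on stdin and skips
# the header row. Hypothetical helper.
services_without_endpoints() {
  awk 'NR > 1 && $2 == "<none>" { print $1 }'
}
```

Usage would be kubectl get endpoints | services_without_endpoints. For each name it prints, compare the selector from kubectl get svc with kubectl get pods --show-labels, then check the READY column on whatever pods match.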

Day 5 of 35 — tomorrow the probe does not just hide the pod, it kills it.
