koti.dev
← The Runbook
Mastering Kubernetes the Right Way · DAY 26 / 35

It Is Always DNS: Debugging CoreDNS Failures in Kubernetes

The pod is healthy, the Service is up, nslookup hangs. The five-hop debug.

KV
Koti Vellanki14 Apr 20264 min read
kubernetesdebuggingnetworking
It Is Always DNS: Debugging CoreDNS Failures in Kubernetes

2AM, yaar, the application is throwing connection timeout to a Service that is definitely up. I kubectl exec into a healthy pod and wget the ClusterIP directly, works instantly. But wget http://payments.default.svc.cluster.local hangs for thirty seconds and dies. I already know what this is. Every SRE knows what this is. It is DNS. It has always been DNS. Yet every single time, my brain refuses to believe DNS is broken and spends twenty minutes checking kube-proxy and iptables first, because "DNS is infrastructure, DNS does not just break". DNS does just break. And when it does, the pod is healthy, the Service is healthy, and the error looks like application flake.

The scenario

DAY 20 · NETWORK · DNS

kube-dns is Ready. It is not resolving.

The pod's /etc/resolv.conf correctly points to kube-dns at 10.96.0.10. kube-dns reports Ready=True. But the CoreDNS Corefile's forward plugin points at a stale upstream resolver — so every out-of-cluster name returns SERVFAIL. Ready does not mean working.

FIGURE20 / 35
DNS SERVFAIL — CoreDNS Corefile forwards to a stale upstream, kube-dns Ready=True but not resolvingA pod sends a DNS query to kube-dns at 10.96.0.10. kube-dns is Running and Ready=True but its CoreDNS Corefile points the forward plugin at 192.0.2.53, a stale unreachable resolver. The upstream returns SERVFAIL which propagates back to the pod.KUBERNETES CLUSTERany cluster · v1.30POD · default nsapi-client/etc/resolv.confnameserver 10.96.0.10153/udpqueryKUBE-DNSCoreDNS · Ready=True. { errors forward . 192.0.2.53}↑ stale upstreamCorefile · forward plugin2forward→ SERVFAILEXTERNAL · UPSTREAM DNS(stale resolver)192.0.2.533
1

The pod asks kube-dns

kubelet writes /etc/resolv.conf with nameserver 10.96.0.10 — the canonical kube-dns ClusterIP. The pod's resolver config is correct; so is the kube-dns pod status (Ready=True). Nothing looks wrong yet.

2

The Corefile points at a stale upstream

The CoreDNS forward . 192.0.2.53 line was left over from a previous cluster config. kube-dns is healthy and answers in-cluster queries fine — but any out-of-cluster name it must forward returns SERVFAIL. Check with: kubectl -n kube-system get cm coredns -o yaml.

3

SERVFAIL propagates back to the pod

The upstream at 192.0.2.53 is unreachable (RFC 5737 TEST-NET-1). CoreDNS times out and returns SERVFAIL. The pod sees a DNS failure and the application throws a connection error — even though the target Service exists and is healthy.

Kubernetes
Misconfigured
SERVFAIL path
DNS query
◆ koti.dev / runbook
A pod queries kube-dns which forwards to a stale upstream (192.0.2.53) and returns SERVFAIL.
A pod inside a Kubernetes cluster sends a DNS query on port 53/udp to kube-dns at 10.96.0.10. kube-dns shows Ready=True but its CoreDNS Corefile forwards queries to a stale upstream at 192.0.2.53. The upstream is unreachable so kube-dns returns SERVFAIL to the pod.
53/udp (DNS) · 192.0.2.53 (RFC 5737 TEST-NET-1, documentation only — fake upstream) · kube-dns service IP — kubectl -n kube-system get svc kube-dns · CoreDNS Corefile — kubectl -n kube-system get cm coredns -o yaml · kind v0.22.0, Kubernetes 1.30.0, CoreDNS 1.11 — kubectl exec into a pod and dig against a misconfigured forward upstream returns SERVFAIL

Reproduce it in your own cluster so the hang you see matches the hang I describe.

bash
git clone https://github.com/vellankikoti/troubleshoot-kubernetes-like-a-pro.git cd troubleshoot-kubernetes-like-a-pro/scenarios/dns-resolution-failure ls
bash

You will see issue.yaml, fix.yaml, description.md, dns_failure.sh. The issue.yaml creates a pod with dnsPolicy: None and a bogus nameserver (192.0.2.1, an RFC 5737 TEST-NET address that nothing can reach). That is not how real DNS breaks in production, but the resulting symptom is identical, and that is what matters for practicing the debug loop.

Reproduce the issue

bash
kubectl apply -f issue.yaml
bash
plaintext
pod/dns-failure-pod created
bash
kubectl logs dns-failure-pod
bash
plaintext
Server: 192.0.2.1 Address 1: 192.0.2.1 nslookup: can't resolve 'kubernetes.default.svc.cluster.local'

The pod is Running but DNS is a wall. In production the signature is a little different, your pod looks fine on the outside, applications connect to some Services and not others, and retries make everything worse. But the underlying symptom is the same. DNS queries fail or time out, the pod keeps running, and nothing ties it all together in the events.

Debug the hard way

Five hops. Do them in order, do not skip.

Hop one, is CoreDNS even running:

bash
kubectl -n kube-system get pods -l k8s-app=kube-dns
bash
plaintext
NAME READY STATUS RESTARTS AGE coredns-5d78c9869d-4xk7v 1/1 Running 0 14d coredns-5d78c9869d-hq2mp 1/1 Running 0 14d

Hop two, does the Service exist and have endpoints:

bash
kubectl -n kube-system get svc kube-dns kubectl -n kube-system get endpoints kube-dns
bash

Both populated. Good. Hop three, what does the broken pod's /etc/resolv.conf look like:

bash
kubectl exec dns-failure-pod -- cat /etc/resolv.conf
bash
plaintext
nameserver 192.0.2.1

There it is. The nameserver is wrong. In production, the most common equivalent is nameserver 10.96.0.10 pointing at kube-dns correctly, but search lines or ndots:5 causing 5x slow lookups for every external hostname. Hop four, try a direct query bypassing search paths:

bash
kubectl run dns-test --rm -it --image=busybox --restart=Never -- \ nslookup -timeout=2 kubernetes.default.svc.cluster.local 10.96.0.10
bash

If that works but nslookup kubernetes from the app pod fails, the problem is the pod's /etc/resolv.conf, not CoreDNS itself. Hop five, look at CoreDNS logs:

bash
kubectl -n kube-system logs -l k8s-app=kube-dns --tail=50
bash

Errors like i/o timeout talking to upstream mean CoreDNS can reach the cluster but cannot reach its configured upstream. That's a cluster networking or firewall issue, not a DNS config issue.

Why this happens

CoreDNS is a recursive resolver with a Kubernetes plugin. When a pod sends a query for payments.default.svc.cluster.local, CoreDNS answers from its cache of Service records and returns the ClusterIP instantly. When a pod queries google.com, CoreDNS forwards the query upstream, usually to the node's /etc/resolv.conf nameservers, and proxies the answer back.

Two things break this, and they look identical from the app. The first is ndots:5. Most pods have ndots:5 in their resolv.conf, which means any hostname with fewer than five dots gets run through every entry in the search list before the literal name is tried. A lookup for api.stripe.com becomes api.stripe.com.default.svc.cluster.local, then api.stripe.com.svc.cluster.local, then api.stripe.com.cluster.local, then finally api.stripe.com.. Four wasted round trips, 4x DNS load, 4x the chance one of them times out under pressure. The fix is either to add a trailing dot to hostnames in the app or to set dnsConfig.options with a lower ndots.

The second is CoreDNS's upstream timing out. CoreDNS forwards unknown names to the node's upstream resolver. If that resolver is flaky, if the network to it is flaky, or if the upstream DNS server itself is slow, every external lookup from every pod in the cluster hangs. You'll see the log line forward: i/o timeout in CoreDNS, repeatedly, and the app looks like it is "randomly" slow.

The fix

bash
kubectl delete -f issue.yaml kubectl apply -f fix.yaml
bash

The scenario fix is trivial: switch from dnsPolicy: None back to dnsPolicy: ClusterFirst:

yaml
spec: dnsPolicy: ClusterFirst
yaml
bash
kubectl logs dns-fixed-pod
bash
plaintext
Server: 10.96.0.10 Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local Name: kubernetes.default.svc.cluster.local Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local

In a real cluster, the fix depends on which hop failed. Broken resolv.conf means fix the pod spec or the node's kubelet. Broken upstream means fix CoreDNS's Corefile forward directive or fix the network to the upstream resolver. Scaling issues mean scale CoreDNS replicas and tune autoscaling.

The lesson

  1. When applications randomly time out and direct ClusterIP calls work, stop debugging the app and check DNS. Always.
  2. ndots:5 is the single biggest cause of slow external DNS in Kubernetes. If you resolve a lot of external names, set ndots to 2 in the pod's dnsConfig.
  3. Do the five hops in order. CoreDNS pods, kube-dns service and endpoints, pod resolv.conf, direct query to CoreDNS, CoreDNS logs. The answer is always in one of those five.

Day 26 of 35, tomorrow someone merges a NetworkPolicy during the security review and accidentally cuts the kubelet off from the pod it needs to probe.

◆ Newsletter

Get the next post in your inbox.

Real Kubernetes lessons from seven years in production. One email when a new post drops. No spam. Unsubscribe in one click.