Mastering Kubernetes the Right Way · DAY 27 / 35

NetworkPolicy Default-Deny Broke My Whole Namespace. Here Is the Fix

One default-deny egress policy, one black-hole namespace, one very long pager night.

Koti Vellanki · 15 Apr 2026 · 4 min read
Tags: kubernetes, debugging, networking

Somebody wrote a "default deny" NetworkPolicy during the security review last week. Looked reasonable on paper, applied cleanly, everybody signed off. 2AM tonight, a fresh deployment rolls out into that namespace and every pod turns into a black hole. Running, Ready, but unable to reach the database, the metrics endpoint, even kube-dns. Liveness probes start failing because the kubelet itself tries to HTTP-GET the pod from the node and the kubelet's source IP is not in any allowlist. The pods start restarting. The restarts don't help because nothing changed on the pod side. The blast radius is the entire namespace and I'm the one holding the pager.

The scenario

DAY 27 · NETWORK · NAMESPACE POLICY

Pod-A issues the request. The backend NetworkPolicy drops it.

A default-deny ingress NetworkPolicy in the backend namespace makes pod-b isolated. Traffic from frontend is not on the allow list — the CNI drops the SYN silently. Pod-a sees a timeout; pod-b sees nothing.

Figure 27 / 35: pod-a cannot reach pod-b. pod-a (namespace frontend, label app=web) sends a request on 8080/tcp to pod-b (namespace backend, label app=api). The backend policy stack allows kube-system and denies everything else, so the CNI silently drops the packet and pod-a times out.
1. pod-a issues the request

Running curl http://pod-b.backend:8080 from inside pod-a. DNS resolves cleanly. The SYN leaves the pod. Up to here, everything is normal.

2. The backend NetworkPolicy is default-deny

Any NetworkPolicy that selects pod-b and lists Ingress in policyTypes makes pod-b isolated. Only traffic from kube-system is explicitly allowed. Traffic from frontend is not on the list, so the CNI drops the packet silently.

3. pod-a sees a timeout, pod-b sees nothing

No RST, no ICMP unreachable. pod-b never receives the SYN. Run kubectl get networkpolicy -n backend to surface the policy. Then check both the ingress.from rules and the source namespace labels.
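The policy stack in the figure could be written roughly as follows. This is a sketch, not the manifest behind the figure: the policy name is hypothetical, and it assumes kube-system carries the kubernetes.io/metadata.name label that recent Kubernetes versions set on every namespace automatically.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-default-deny-ingress   # hypothetical name
  namespace: backend
spec:
  # An empty podSelector selects every pod in the namespace, pod-b included.
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    # The only allowed source: any pod in kube-system.
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
```

Note that there is no "deny" rule anywhere in the manifest. The "deny from anywhere" layer in the figure is implicit: once a pod is selected for Ingress, anything not matched by an allow rule is dropped.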

Sources: NetworkPolicy v1 (networking.k8s.io); kubectl explain networkpolicy.spec.ingress; kubectl explain networkpolicy.spec.ingress.from.namespaceSelector. Tested on kind v0.22.0, Kubernetes 1.30.0, Calico 3.27: kubectl exec into pod-a and curl pod-b stalls and times out.

Reproduce it in your own cluster. You need a CNI that actually enforces NetworkPolicy for this to mean anything: Calico, Cilium, or Antrea. With plain flannel, the API server accepts the policy and nothing enforces it.

bash
git clone https://github.com/vellankikoti/troubleshoot-kubernetes-like-a-pro.git
cd troubleshoot-kubernetes-like-a-pro/scenarios/network-connectivity-issues
ls

You should see issue.yaml, fix.yaml, description.md, network_issue.sh. The issue file creates a pod plus a NetworkPolicy that denies all egress for pods with the label app: network-test.

Reproduce the issue

bash
kubectl apply -f issue.yaml
plaintext
pod/network-connectivity-issue-pod created
networkpolicy.networking.k8s.io/deny-egress-network-test created

The pod tries to wget http://google.com and fails:

bash
kubectl logs network-connectivity-issue-pod
plaintext
blocked

The pod is Running, the container is happy, the wget timed out, the log says blocked, and nothing in the pod events tells you a NetworkPolicy is the cause. NetworkPolicy drops are silent: the CNI's datapath on the node discards the packet, and the pod sees a connect timeout that looks like an upstream problem.

Debug the hard way

First the usual checks, because you will run them anyway:

bash
kubectl get pod network-connectivity-issue-pod -o wide
plaintext
NAME                             READY   STATUS    RESTARTS   AGE   IP            NODE
network-connectivity-issue-pod   1/1     Running   0          90s   10.244.1.17   worker-1

Pod is fine. Then DNS and direct reachability from inside the pod:

bash
kubectl exec network-connectivity-issue-pod -- wget -qO- --timeout=3 http://kubernetes.default || echo fail
plaintext
fail

Even the API server Service is unreachable. That is the fingerprint of a default-deny egress policy. Now the command that actually matters. List every NetworkPolicy in the cluster and look for one whose POD-SELECTOR matches this pod's labels:

bash
kubectl get networkpolicy -A
plaintext
NAMESPACE   NAME                       POD-SELECTOR       AGE
default     deny-egress-network-test   app=network-test   2m
bash
kubectl describe networkpolicy deny-egress-network-test
plaintext
Name:         deny-egress-network-test
Namespace:    default
PodSelector:  app=network-test
Policy Types: Egress
Egress:       <none>

Egress: <none> with Policy Types: Egress means "all egress denied for pods matching app=network-test". That is your answer, and no events, no logs, no pod conditions would have told you that. You had to go look for the policy yourself.
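For reference, a policy that renders like that describe output would look roughly like the manifest below. It is reconstructed from the fields shown above, not copied from the repository, so the real issue.yaml may differ in details.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-egress-network-test
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: network-test
  policyTypes:
    - Egress
  # No egress rules are listed, so nothing outbound is allowed
  # for pods labeled app=network-test.
```

The whole denial lives in what is missing: policyTypes names Egress, and there is no egress list to allow anything.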

Why this happens

NetworkPolicy is additive in an interesting way: if no policy selects a pod, all traffic is allowed. If any policy selects a pod, then for each direction listed in policyTypes only the traffic allowed by at least one of those policies is permitted. The result is the union of their allow rules, and there is no way to write an explicit deny. So the moment you apply an empty egress policy that selects a pod, all egress is denied unless you also add egress rules, either in that policy or in another one that selects the same pod. A lot of teams write this policy thinking "default deny" means "start from deny and then we layer allows on top", which is correct in intent but wrong in consequences, because they forget the allow-list layer.
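A minimal sketch of that layering, assuming two pods in the same namespace labeled app=web and app=api (the labels are borrowed from the figure; both policy names are hypothetical):

```yaml
# Policy 1: selects every pod in the namespace and allows nothing,
# in either direction.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Policy 2: selects only app=api pods and allows ingress on 8080
# from app=web pods in the same namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-api
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web
      ports:
        - protocol: TCP
          port: 8080
```

With both applied, the app=api pods accept 8080/tcp from app=web and nothing else. The web pods still cannot send that traffic, though, because Policy 1 isolates them for egress and no policy allows it yet. Both ends of a connection need an allow, which is exactly the layer that gets forgotten.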

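The second trap is the kubelet health probe. The kubelet sends HTTP probes to the pod from the node, so the source is the node's IP, not a pod IP. The Kubernetes documentation says traffic to and from the node a pod runs on is always allowed, and most CNIs honor that by default, so on many clusters probes keep working under default-deny. Where the CNI or its configuration does not exempt node traffic, an ingress policy that only allows podSelector sources in the same namespace silently blocks the probe, the pod is marked Unhealthy, and the kubelet restarts it in a loop. The fix there is an ingress rule allowing traffic from the node CIDR, or exec probes instead of HTTP probes.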
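Whether you need a rule for this depends on the CNI: the Kubernetes documentation states that traffic from the node a pod runs on is always allowed, so many clusters never hit it. Where it does apply, a node-CIDR ingress allow might look like the sketch below. The policy name, the CIDR, and the probe port are all placeholders, not values from the scenario.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-node-probes   # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - ipBlock:
            cidr: 192.0.2.0/24   # placeholder: replace with your node CIDR
      ports:
        - protocol: TCP
          port: 8080             # assumed probe port
```

Keeping the rule scoped to the probe port limits what the rest of the node network can reach on the pod.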

The third trap is DNS. A default-deny egress policy blocks traffic to kube-dns on port 53, which means every application call that uses a hostname fails before it even starts. Your allow-list needs an explicit rule allowing UDP and TCP to port 53 to the kube-system namespace, or nothing resolves.
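A namespace-wide DNS allow is one way to close that gap once for every pod. The sketch below assumes the cluster DNS pods run in kube-system with the label k8s-app: kube-dns, which is common but worth checking on your own cluster. The policy name is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress   # hypothetical name
spec:
  podSelector: {}          # every pod in this namespace
  policyTypes:
    - Egress
  egress:
    - to:
        # One list item: namespaceSelector AND podSelector must both match.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns   # assumed label on the DNS pods
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Putting namespaceSelector and podSelector in the same list item narrows the allow to the DNS pods only. Written as two separate items, the rule would instead allow port 53 to all of kube-system plus any pod in the local namespace carrying that label.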

The fix


One allow rule. Traffic from frontend now reaches backend.

Adding a namespaceSelector rule that matches the frontend namespace unblocks pod-a → pod-b. NetworkPolicy is allowlist-based: one explicit rule is all it takes. The deny-all baseline stays in place for every other source.

Figure 27 / 35: pod-a reaches pod-b. A new ingress rule in the backend NetworkPolicy adds a namespaceSelector matching the frontend namespace label, on top of the existing kube-system allow and the deny-from-anywhere baseline. pod-a can now reach pod-b on 8080/tcp and receives a 200 OK.
1. pod-a issues the same request

Nothing changed on the source side. pod-a still runs curl http://pod-b.backend:8080. The fix was entirely in the destination namespace policy.

2. One new allow rule: namespaceSelector

Adding from.namespaceSelector.matchLabels.name: frontend to the ingress rules in the backend NetworkPolicy is all that is required. First label the namespace: kubectl label namespace frontend name=frontend. The deny-all baseline stays in place for every other source.

3. Connection established: 200 OK

The CNI now matches the ingress rule and forwards the SYN to pod-b. Verify with kubectl exec -n frontend pod-a -- curl -s -o /dev/null -w "%{http_code}" http://pod-b.backend:8080.
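As a sketch, the new allow could be written as its own policy next to the existing one, since policies that select the same pod are additive. The policy name is hypothetical, and the port and labels are taken from the figure.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend   # hypothetical name
  namespace: backend
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # Needs: kubectl label namespace frontend name=frontend
              name: frontend
      ports:
        - protocol: TCP
          port: 8080
```

Matching on kubernetes.io/metadata.name: frontend instead would avoid the manual labeling step, because recent Kubernetes versions set that label on every namespace. If the frontend namespace has its own egress policies, pod-a also needs an egress allow toward backend.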

Sources: NetworkPolicy v1 (networking.k8s.io); kubectl explain networkpolicy.spec.ingress.from.namespaceSelector; namespace label set with kubectl label namespace frontend name=frontend. Tested on kind v0.22.0, Kubernetes 1.30.0, Calico 3.27.
bash
kubectl delete -f issue.yaml
kubectl apply -f fix.yaml

The scenario fix removes the NetworkPolicy entirely. In a real cluster you do not want to delete the security policy; you want to fix it. The correct pattern is a default-deny policy paired with explicit allows for DNS, for kubelet probes, and for the actual traffic the app needs. The policy below covers the DNS and app-traffic parts:

yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-and-app
spec:
  podSelector:
    matchLabels:
      app: network-test
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    - to:
        - podSelector:
            matchLabels:
              app: api
      ports:
        - protocol: TCP
          port: 8080

Verify with a wget from the pod and with kubectl describe on the policy to make sure the rules render as you expect.

Day 27 of 35. Tomorrow, the cluster talks to itself perfectly but cannot reach the payment processor, and nothing in the cluster looks broken.
