Somebody wrote a "default deny" NetworkPolicy during the security review last week. Looked reasonable on paper, applied cleanly, everybody signed off. 2AM tonight, a fresh deployment rolls out into that namespace and every pod turns into a black hole. Running, Ready, but unable to reach the database, the metrics endpoint, even kube-dns. Liveness probes start failing because the kubelet itself tries to HTTP-GET the pod from the node and the kubelet's source IP is not in any allowlist. The pods start restarting. The restarts don't help because nothing changed on the pod side. The blast radius is the entire namespace and I'm the one holding the pager.
The scenario
Reproduce it in your own cluster. You need a CNI that actually enforces NetworkPolicy for this to mean anything: Calico, Cilium, or Antrea. Plain flannel will accept the policy object and silently ignore it.
```
git clone https://github.com/vellankikoti/troubleshoot-kubernetes-like-a-pro.git
cd troubleshoot-kubernetes-like-a-pro/scenarios/network-connectivity-issues
ls
```

You should see issue.yaml, fix.yaml, description.md, network_issue.sh. The issue file creates a pod plus a NetworkPolicy that denies all egress for pods with the label app: network-test.
Reproduce the issue
```
kubectl apply -f issue.yaml
pod/network-connectivity-issue-pod created
networkpolicy.networking.k8s.io/deny-egress-network-test created
```

The pod tries to wget http://google.com and fails:

```
kubectl logs network-connectivity-issue-pod
blocked
```

The pod is Running, the container is happy, the wget timed out, the log says blocked, and nothing in the pod events tells you a NetworkPolicy is the cause. NetworkPolicy drops are silent: the CNI drops the packet in the kernel, and the pod just sees a connect timeout, as if it were an upstream problem.
Debug the hard way
First the usual checks, because you will run them anyway:
```
kubectl get pod network-connectivity-issue-pod -o wide
NAME                             READY   STATUS    RESTARTS   AGE   IP            NODE
network-connectivity-issue-pod   1/1     Running   0          90s   10.244.1.17   worker-1
```

Pod is fine. Then DNS and direct reachability from inside the pod:
```
kubectl exec network-connectivity-issue-pod -- wget -qO- --timeout=3 http://kubernetes.default || echo fail
fail
```

Even the API server Service is unreachable. That is the fingerprint of a default-deny egress policy. Now the command that actually matters: list every NetworkPolicy that selects this pod:
```
kubectl get networkpolicy -A
NAMESPACE   NAME                       POD-SELECTOR       AGE
default     deny-egress-network-test   app=network-test   2m

kubectl describe networkpolicy deny-egress-network-test
Name:          deny-egress-network-test
Namespace:     default
PodSelector:   app=network-test
Policy Types:  Egress
Egress:        <none>
```

Egress: <none> with Policy Types: Egress means "all egress denied for pods matching app=network-test". That is your answer, and no events, no logs, no pod conditions would have told you that. You had to go look for the policy yourself.
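There is no built-in "which policies select this pod" command, so the look-up is a manual cross-reference. A quick sketch, assuming kubectl is pointed at the affected cluster (pod name is the one from this scenario; adjust for your own incident):

```shell
# Print the pod's labels, then compare them by eye against every
# policy's PodSelector in the same namespace.
kubectl get pod network-connectivity-issue-pod -o jsonpath='{.metadata.labels}'; echo

# With no name argument, describe dumps every policy in the
# namespace: selectors, policyTypes, and rules in one pass.
kubectl describe networkpolicy
```

Any policy whose PodSelector matches the labels you printed is in play for that pod.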
Why this happens
NetworkPolicy is additive in an interesting way: if no policy selects a pod, all traffic is allowed. The moment any policy selects a pod, only the traffic explicitly allowed by the union of all selecting policies is permitted, for each direction listed in policyTypes. So an egress policy with an empty rule list that selects a pod denies all egress until you add allow rules. A lot of teams write this policy thinking "default deny" means "start from deny and then layer allows on top", which is correct in intent but wrong in consequence, because they forget to ship the allow-list layer.
The second trap is the kubelet health probe. The kubelet sends HTTP probes to the pod from the node's IP, which is not the pod network. An ingress policy that only allows traffic from podSelector in the same namespace will silently block the kubelet's probe, marking the pod Unhealthy and restarting it in a loop. The fix is an ingress rule allowing traffic from the node CIDR, or using exec probes instead of HTTP probes.
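A sketch of the exec-probe workaround, assuming the container can report its own health from inside (the command and file path here are illustrative, not from the scenario repo):

```yaml
# Exec probes run inside the container via the container runtime,
# so no packet ever crosses the pod network boundary and no
# NetworkPolicy can block them.
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy      # illustrative: app touches this file when healthy
  initialDelaySeconds: 5
  periodSeconds: 10
```

The trade-off is that exec probes cost a process fork per check and cannot validate that the HTTP listener itself is serving, so prefer the node-CIDR ingress allow when you can compute the CIDR reliably.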
The third trap is DNS. A default-deny egress policy blocks traffic to kube-dns on port 53, which means every application call that uses a hostname fails before it even starts. Your allow-list needs an explicit rule allowing UDP and TCP to port 53 to the kube-system namespace, or nothing resolves.
The fix
```
kubectl delete -f issue.yaml
kubectl apply -f fix.yaml
```

The scenario fix removes the NetworkPolicy entirely. In a real cluster you do not want to delete the security policy, you want to fix it. The correct pattern is a default-deny policy paired with explicit allows for DNS, for kubelet probes, and for the actual traffic the app needs:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-and-app
spec:
  podSelector:
    matchLabels:
      app: network-test
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  - to:
    - podSelector:
        matchLabels:
          app: api
    ports:
    - protocol: TCP
      port: 8080
```

Verify with a wget from the pod and with kubectl describe on the policy to make sure the rules render as you expect.
The lesson
- Default-deny egress without an explicit kube-dns allow is self-sabotage. Every default-deny policy needs DNS exceptions on day one.
- NetworkPolicy drops are silent. No events, no logs on the pod. The only diagnosis is listing the policies that select the pod.
- You must have a CNI that enforces NetworkPolicy. Calico, Cilium, Antrea. Plain flannel accepts the policy and ignores it, which is worse than not having one at all because it gives you false confidence.
Day 27 of 35, tomorrow the cluster talks to itself perfectly but cannot reach the payment processor, and nothing in the cluster looks broken.