Same image, same manifest, same namespace. Staging runs fine. Prod throws open: permission denied on a file that clearly exists, clearly has the right ownership, clearly passed ls -la with 0644. id inside the container shows the right UID. stat shows the file. And still the app crashes on startup. I spend two hours thinking it is a Kubernetes bug. It is not a Kubernetes bug. It is SELinux enforcing a policy on the prod nodes that staging did not have enabled, and the container's SELinux context does not match the file's SELinux context. Kubernetes has no idea any of this is happening. kubectl describe pod says Running. The kernel says nope.
The scenario
DAC bits say 777. SELinux still says no.
A pod writes to /var/lib/data mounted from a hostPath. The host is in SELinux enforcing mode. The container process runs as container_t, which the policy only permits to touch files labelled container_file_t; the host directory is labelled default_t. The kernel LSM hook fires after the DAC check passes and returns EACCES. The app logs 'permission denied' on a path that looks completely open.
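A minimal manifest that sets up this kind of mismatch might look like the following — the pod name, image, and command are illustrative, not from the repo:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-writer              # hypothetical name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "date > /var/lib/data/heartbeat && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /var/lib/data
  volumes:
  - name: data
    hostPath:
      path: /var/lib/data            # carries default_t on an enforcing host
      type: Directory
```

If that directory on the node was never relabelled, the mode bits let the write through and SELinux does not.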
The pod looks fine from Kubernetes' perspective
kubectl describe pod says Running. ls -la inside the container shows the mount with correct mode bits. id shows uid 1000. The DAC layer is satisfied. SELinux enforces a second, independent permission layer that Kubernetes has no visibility into.
Two checks run, in order — DAC then MAC
The kernel runs the Discretionary Access Control (DAC) check first. It passes — uid 1000 has execute on the directory and read on the file. The LSM hook fires next: selinux_inode_permission compares the process label to the inode label. The loaded policy has no allow rule letting container_t read default_t files, so the AVC denial fires and the kernel returns EACCES.
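As a toy illustration — plain shell, nothing kernel-specific — the two gates can be modelled like this. The "uid-ok" and "label-ok" tokens are made up, standing in for a passing DAC check and a matching SELinux label:

```shell
# Toy model of the two permission gates. Both must say yes;
# userspace only ever sees the final errno. Not kernel code.
dac_allows() { [ "$1" = "uid-ok" ]; }      # mode bits, ownership
mac_allows() { [ "$2" = "label-ok" ]; }    # SELinux AVC lookup

try_open() {
  if ! dac_allows "$1" "$2"; then echo "EACCES (DAC)"; return 13; fi
  if ! mac_allows "$1" "$2"; then echo "EACCES (MAC)"; return 13; fi
  echo "open succeeds"
}

r1=$(try_open uid-ok label-ok)                 # both layers pass
r2=$(try_open uid-ok label-mismatch) || true   # DAC passes, SELinux denies
echo "$r1"
echo "$r2"
```

The point of the model: from userspace both failures look identical — errno 13, plain "permission denied" — which is exactly why ls -la proves nothing here.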
Fix the label, not the mode bits
Run ausearch -m avc -ts recent on the node to find the exact denial. The fix is either chcon -Rt container_file_t /var/lib/data on the host, so the files carry a type containers are allowed to use, or setting securityContext.seLinuxOptions in the pod spec so the container's label lines up with the files. Never run audit2allow blind — read what you are permitting first.
The repo has a reproducible version with a mismatched SELinux level.
```
git clone https://github.com/vellankikoti/troubleshoot-kubernetes-like-a-pro.git
cd troubleshoot-kubernetes-like-a-pro/scenarios/selinux-apparmor-policy-violation
ls
```

The directory holds description.md, issue.yaml, and fix.yaml. The issue manifest sets seLinuxOptions.level: "s0:c0,c1" on the pod, simulating a container that gets labelled with a category set the node's policy does not allow it to use.
Reproduce the issue
```
kubectl apply -f issue.yaml
kubectl get pod selinux-apparmor-issue-pod

NAME                         READY   STATUS    RESTARTS   AGE
selinux-apparmor-issue-pod   1/1     Running   0          3m
```

And the logs:

```
kubectl logs selinux-apparmor-issue-pod

open /data/config.json: permission denied
```

Kubernetes reports the pod as healthy. The app reports that a file it owns is unreadable. Both are telling the truth from where they sit.
Debug the hard way
First, check the SELinux context Kubernetes actually applied to the pod.
```
kubectl get pod selinux-apparmor-issue-pod -o jsonpath='{.spec.securityContext}'

{"seLinuxOptions":{"level":"s0:c0,c1"}}
```

Now exec in and check the runtime view.
```
kubectl exec selinux-apparmor-issue-pod -- id -Z

system_u:system_r:container_t:s0:c0,c1
```

And check the file the app is trying to open.
```
kubectl exec selinux-apparmor-issue-pod -- ls -lZ /data/config.json

-rw-r--r--. 1 root root system_u:object_r:container_file_t:s0:c0,c0 42 Apr 19 22:14 /data/config.json
```

There it is. Process context is s0:c0,c1. File context is s0:c0,c0. The DAC permissions (rw-r--r--) would let the process read the file. The MAC layer (SELinux) says no, because the category sets do not match. To see the denial in the kernel log I would normally ssh to the node and tail journalctl.
```
ssh node-7 'journalctl -k | grep AVC | tail -3'

audit: type=1400 audit(...): avc: denied { read } for pid=12345 comm="busybox"
name="config.json" scontext=system_u:system_r:container_t:s0:c0,c1
tcontext=system_u:object_r:container_file_t:s0:c0,c0 tclass=file permissive=0
```

There is the AVC denial in black and white. scontext is the subject (the process). tcontext is the target (the file). The mismatch is the c0,c1 vs c0,c0 category set. The kernel denied the read and returned EACCES, which the container saw as plain old permission denied.
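The scontext/tcontext comparison can be scripted. Here is a small sketch that pulls the level field out of each label, using the denial above as canned input; it assumes the standard user:role:type:level label layout, so everything from the fourth colon-separated field onward is the level:

```shell
# Parse one AVC record (embedded as a sample string) and compare the
# MCS levels of subject and target.
avc='avc: denied { read } for pid=12345 comm="busybox" name="config.json" scontext=system_u:system_r:container_t:s0:c0,c1 tcontext=system_u:object_r:container_file_t:s0:c0,c0 tclass=file permissive=0'

# Labels are user:role:type:level -> keep field 4 onward.
slevel=$(echo "$avc" | grep -o 'scontext=[^ ]*' | cut -d: -f4-)
tlevel=$(echo "$avc" | grep -o 'tcontext=[^ ]*' | cut -d: -f4-)

echo "process level: $slevel"
echo "file level:    $tlevel"
if [ "$slevel" = "$tlevel" ]; then echo "levels match"; else echo "MCS mismatch"; fi
```

On a real node you would feed it from ausearch -m avc -ts recent instead of a canned string.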
Why this happens
SELinux and AppArmor are Linux Security Modules that sit below the regular Unix permission model. Even if DAC says "this UID can read this file," the LSM gets a second vote and can refuse. In Kubernetes land, this matters because the container runtime labels your container with an SELinux context, and the files on the node (hostPath, emptyDir, local PVs) have their own context. If they do not match, the kernel refuses the syscall and there is no error anywhere that Kubernetes-level tooling will show you.
The way this usually bites teams in prod is an environment skew. Staging nodes have SELinux in permissive mode. Prod nodes have it in enforcing. The staging cluster silently logs denials and keeps going. The prod cluster hard-fails. Your manifest is identical. Your logs disagree. The second common cause is a pod that sets seLinuxOptions explicitly, maybe copied from a blog post, and the level it asks for is not one the node's policy knows about.
The fix is usually to adjust the pod's context to match the files it needs, or to let the runtime assign the default context and not override it at all. The hard version is writing a custom SELinux policy with audit2allow, and that is a loop you should never run blindly, because audit2allow -a will gladly generate a rule that allows the exact denial you just saw plus a dozen things you did not see and did not want to allow. Always read the generated TE file before loading it.
The fix
```
kubectl delete pod selinux-apparmor-issue-pod
kubectl apply -f fix.yaml
```

The diff is a single category.
```diff
 securityContext:
   seLinuxOptions:
-    level: "s0:c0,c1"
+    level: "s0:c0,c0"
```

Verify the process and file contexts now match.

```
kubectl exec selinux-apparmor-fixed-pod -- id -Z

system_u:system_r:container_t:s0:c0,c0
```

In real life, unless you know exactly what the policy on the node looks like, do not set seLinuxOptions at all. Let the runtime pick. For volumes that need a specific context, set seLinuxOptions on the pod and let the runtime relabel the volume contents — the same mechanism as the :Z bind-mount option. For AppArmor, use the container.apparmor.security.beta.kubernetes.io/<container> annotation and point it at a profile that actually exists on every node.
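For the AppArmor side, the annotation form looks like this — the pod and profile names are illustrative, and the profile must be loaded on every node the pod can schedule to:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: apparmor-demo                  # hypothetical name
  annotations:
    # Key suffix is the container name; localhost/ means "a profile
    # loaded on the node". The profile name here is made up.
    container.apparmor.security.beta.kubernetes.io/app: localhost/my-app-profile
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
```

Unlike the SELinux case, an AppArmor mismatch at least fails loudly: if the profile is not loaded on the node, the kubelet refuses to start the container and says so in the pod events. Newer Kubernetes versions also expose this as a first-class securityContext.appArmorProfile field.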
The lesson
- "Permission denied" on a file that looks readable almost always means an LSM is saying no below DAC. Check `id -Z` and `ls -Z` before you check ownership.
- Do not set `seLinuxOptions` unless you know the node's policy. The runtime default is almost always right.
- `audit2allow -a` generates rules for every denial it sees, including ones you did not want to allow. Read the TE file every time.
Day 32 of 35 — tomorrow we go one layer deeper, into the CRI itself, where the pod sandbox fails before your image even gets a chance to pull.
