🟢 Round 1 — Scenario-Based EKS / Kubernetes (Foundational–Practical)
✅ Q1 — Pods stuck in Pending
Question: Pods are stuck in Pending state after deployment. What do you check first?
Answer:
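Run kubectl describe pod <name> and read the Events section. It usually names the reason: insufficient CPU or memory, a node selector or affinity mismatch, an untolerated taint, or an unbound PVC. Then run kubectl describe node <node> and compare the Allocated resources section with the pod's requests; scheduling is based on requests, not live usage. Most Pending issues are scheduling constraints.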
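The checks above can be sketched as a small model of what the scheduler compares. This is a simplified illustration, not the scheduler's real algorithm: the Node and Pod classes, their field names, and the reduction of taints and tolerations to plain keys are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_m: int      # allocatable CPU minus CPU already requested, in millicores
    free_mem_mi: int     # allocatable memory minus memory already requested, in MiB
    labels: dict = field(default_factory=dict)
    taints: set = field(default_factory=set)        # simplified: taint keys only


@dataclass
class Pod:
    cpu_m: int           # CPU request in millicores
    mem_mi: int          # memory request in MiB
    node_selector: dict = field(default_factory=dict)
    tolerations: set = field(default_factory=set)   # simplified: tolerated taint keys


def why_not_schedulable(pod, node):
    """Return the reasons the pod does not fit the node (empty list means it fits)."""
    reasons = []
    if pod.cpu_m > node.free_cpu_m:
        reasons.append("insufficient cpu")
    if pod.mem_mi > node.free_mem_mi:
        reasons.append("insufficient memory")
    # Every nodeSelector key/value pair must be present in the node labels.
    if any(node.labels.get(k) != v for k, v in pod.node_selector.items()):
        reasons.append("node selector mismatch")
    # Any taint the pod does not tolerate keeps it off the node.
    if node.taints - pod.tolerations:
        reasons.append("untolerated taint")
    return reasons


node = Node("node-a", free_cpu_m=500, free_mem_mi=1024,
            labels={"disk": "ssd"}, taints={"dedicated"})
pod = Pod(cpu_m=1000, mem_mi=512, node_selector={"disk": "ssd"})
print(why_not_schedulable(pod, node))  # ['insufficient cpu', 'untolerated taint']
```

A pod stays Pending when every node returns at least one reason; the Events section reports the same categories per node.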
✅ Q2 — CrashLoopBackOff
Question: Pod is in CrashLoopBackOff. What’s your step-by-step debug?
Answer:
Check the logs of the previous container run with kubectl logs <pod> --previous. Then run kubectl describe pod <pod> and look for probe failures or an OOMKilled last state. Verify env vars, ConfigMaps, Secrets, and the entrypoint command. Most cases come down to bad config or an app startup failure.
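The exit code shown in the describe output helps separate an app failure from a kill. A minimal sketch of that reading, assuming Linux signal numbering; the function name and the wording of the messages are made up for illustration:

```python
# Signal numbers commonly seen in container exit codes (Linux numbering).
_SIGNALS = {6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def explain_exit_code(code):
    """Give a rough interpretation of a container's last exit code."""
    if code == 0:
        return "exited cleanly; a long-running container should not exit at all"
    if code > 128:
        # By shell convention, 128 + N means the process was killed by signal N.
        sig = code - 128
        return "killed by " + _SIGNALS.get(sig, f"signal {sig}")
    return "application error; read the logs of the previous run"


print(explain_exit_code(137))  # killed by SIGKILL
print(explain_exit_code(1))    # application error; read the logs of the previous run
```

Exit code 137 together with the OOMKilled reason points at the memory limit; 137 without it usually means something else sent SIGKILL, such as a failed liveness probe after the grace period.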
✅ Q3 — Service not reachable internally
Question: Pod cannot reach another service using service name.
Answer:
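Check that the Service exists and that its selector matches the pod labels: run kubectl get svc and kubectl get endpoints <service-name>. If the endpoints list is empty, the selector matches no Ready pod, which usually means a label mismatch. Then test DNS from inside the pod with nslookup <service-name>; from another namespace use <service-name>.<namespace>.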
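The selector check can be illustrated with a short sketch of equality-based label matching. The pod names and labels below are hypothetical, and readiness is ignored to keep the example small.

```python
def selector_matches(selector, pod_labels):
    """True when every key/value pair of the selector is present in the pod labels."""
    return all(pod_labels.get(k) == v for k, v in selector.items())


def matching_pods(selector, pods):
    """Return the names of the pods a Service selector would pick up."""
    if not selector:
        # A Service without a selector gets no automatically managed endpoints.
        return []
    return [name for name, labels in pods.items() if selector_matches(selector, labels)]


pods = {
    "web-1": {"app": "web", "tier": "frontend"},
    "web-2": {"app": "web-v2", "tier": "frontend"},
}
print(matching_pods({"app": "web"}, pods))     # ['web-1']
print(matching_pods({"app": "webapp"}, pods))  # []  -> empty endpoints
```

Extra labels on the pod are harmless; a single differing value in the selector is enough to leave the endpoints list empty.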
✅ Q4 — New version rollout broke app
Question: After deployment update, app started failing. Fastest recovery?
Answer:
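Run kubectl rollout undo deployment <name>. This rolls the Deployment back to the previous ReplicaSet; confirm with kubectl rollout status deployment <name>. Then inspect what changed (manifest diff, image tag, config). Never debug a broken production version live: roll back first, then investigate.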
✅ Q5 — HPA not scaling
Question: HPA configured but pods not scaling under load.
Answer:
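Check that metrics-server is installed and serving metrics (kubectl top pods should return data). Verify that the target containers define resource requests: a utilization target is a percentage of the request, so without requests the HPA cannot compute it. Then run kubectl describe hpa <name> and read the conditions and current metric values; a current value of <unknown> points to a missing metric or a missing request.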
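Why requests matter becomes clear from the scaling arithmetic. The sketch below follows the documented HPA formula, desired = ceil(current replicas × current value / target value), with the default 10% tolerance; it is a single-metric simplification that ignores per-pod averaging, readiness handling, and min/max replica bounds, and the function names are made up for the example.

```python
import math


def cpu_utilization_percent(usage_m, request_m):
    """CPU utilization as a percentage of the request (both in millicores)."""
    if not request_m:
        raise ValueError("no CPU request set: utilization cannot be computed")
    return 100.0 * usage_m / request_m


def desired_replicas(current_replicas, current_util, target_util, tolerance=0.1):
    """Replica count the HPA formula asks for, before min/max bounds are applied."""
    ratio = current_util / target_util
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    return math.ceil(current_replicas * ratio)


util = cpu_utilization_percent(usage_m=450, request_m=250)
print(util)                           # 180.0
print(desired_replicas(3, util, 60))  # 9
```

With a 250m request, 450m of usage is 180% utilization, three times the 60% target, so three replicas become nine. Without a request there is no denominator, and the HPA reports the metric as unknown instead of scaling.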
✅ Q6 — Node shows NotReady
Question: One node is NotReady. What do you check?
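Answer:
Run kubectl describe node <node> and check the Conditions and Events sections. Then connect to the node (SSH or SSM) and check the kubelet with systemctl status kubelet and journalctl -u kubelet. The most common causes are a CNI failure, a full disk, or a crashed kubelet. Also check CloudWatch if node logs are shipped there.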
✅ Q7 — ImagePullBackOff
Question: Pod cannot pull image from ECR.
Answer:
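Check the image name and tag first. Then verify that the node IAM role has ECR pull permissions (for example the AmazonEC2ContainerRegistryReadOnly managed policy); the kubelet pulls images with the node's credentials, so IRSA on the pod does not affect image pulls. For a private registry outside ECR, ensure the imagePullSecret exists in the pod's namespace and is referenced in the pod spec. If needed, test the pull on the node itself (crictl pull on containerd-based nodes, docker pull on older Docker-based ones).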
✅ Q8 — ConfigMap updated but app not reflecting change
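Question: A ConfigMap was updated but the app still uses the old values. Why?
Answer:
ConfigMap changes do not restart pods automatically. Values injected as environment variables are read once at container start, so they only change after a restart. ConfigMaps mounted as volumes are refreshed by the kubelet after a short delay (subPath mounts are not), but the app still has to reread the file. To force a pickup, run kubectl rollout restart deployment <name>, or put a checksum of the config in a pod template annotation so that a config change triggers a new rollout.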
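The checksum annotation pattern works because any change to the pod template, including an annotation, produces a new ReplicaSet. A minimal sketch of computing such a checksum; the annotation key checksum/config follows a common Helm convention and is an assumption here, as are the function names and the sample data.

```python
import hashlib
import json


def config_checksum(data):
    """SHA-256 over a canonical JSON form of the ConfigMap data (key order does not matter)."""
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def pod_template_annotations(data):
    """Annotations to place under spec.template.metadata.annotations of the Deployment."""
    return {"checksum/config": config_checksum(data)}


old = {"LOG_LEVEL": "info", "TIMEOUT": "30"}
new = {"LOG_LEVEL": "debug", "TIMEOUT": "30"}
print(pod_template_annotations(old) != pod_template_annotations(new))  # True
```

Applying the Deployment with the new annotation value rolls the pods even though the container spec itself is unchanged.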
✅ Q9 — LoadBalancer service pending EXTERNAL-IP
Question: Service type LoadBalancer shows pending.
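Answer:
Run kubectl describe svc <name> and read the events first. Then check that the AWS Load Balancer Controller (or the in-tree cloud provider integration) is running and look at its logs. Verify the subnets are tagged for load balancer discovery: kubernetes.io/role/elb for public subnets, kubernetes.io/role/internal-elb for private ones. Also check the Service annotations and the controller's IAM permissions.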
✅ Q10 — Pod using too much memory
Question: How do you confirm and control it?
Answer:
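Use kubectl top pod <pod> --containers to confirm usage. Check the memory requests and limits in the pod spec. Without a limit, a container can grow until the node itself comes under memory pressure and pods get evicted. Set a memory request (used for scheduling) and a memory limit (enforced with an OOM kill when exceeded) to protect the node.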
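Comparing the usage figure with the limit means converting Kubernetes quantity strings such as 412Mi or 1Gi to bytes. A rough sketch under stated assumptions: only the common memory suffixes are handled, and the helper names are invented for the example rather than taken from any Kubernetes client library.

```python
# Binary (Ki, Mi, ...) and decimal (k, M, ...) suffixes used in memory quantities.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity):
    """Convert a memory quantity string such as '512Mi' to bytes."""
    # Try two-character suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * _SUFFIXES[suffix])
    return int(quantity)  # plain number of bytes


def fraction_of_limit(usage, limit):
    """Usage as a fraction of the limit, or None when no limit is set."""
    if limit is None:
        return None  # unbounded: only the node's capacity stops the container
    return parse_memory(usage) / parse_memory(limit)


print(parse_memory("512Mi"))              # 536870912
print(fraction_of_limit("512Mi", "1Gi"))  # 0.5
print(fraction_of_limit("512Mi", None))   # None
```

A fraction that stays close to 1.0 means the container is near an OOM kill; None is the case to fix first, since nothing caps the container.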