🟢 Round 1 — Scenario-Based EKS / Kubernetes (Foundational–Practical)

✅ Q1 — Pods stuck in Pending

Question: Pods are stuck in Pending state after deployment. What do you check first?

Answer: Run kubectl describe pod <name> and check the Events section. It usually shows the reason: insufficient CPU/memory, node selector or affinity mismatch, untolerated taints, or an unbound PVC. Then compare node allocatable against existing requests with kubectl describe node <node> (Allocated resources section). Most Pending issues are scheduling constraints.
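
A minimal sketch of the "insufficient CPU/memory" check, to make the point that the scheduler compares requests (not live usage) against what is left on the node. The function name, dictionary keys, and numbers are hypothetical, and the real scheduler considers many more constraints (taints, affinity, volumes).

```python
# Hypothetical helper: which requested resources do NOT fit on a node?
# Units are plain numbers here: CPU in millicores, memory in MiB.

def resources_that_do_not_fit(allocatable, already_requested, pod_request):
    """Return the resource names the node cannot satisfy (empty list = fits)."""
    short = []
    for resource, wanted in pod_request.items():
        free = allocatable.get(resource, 0) - already_requested.get(resource, 0)
        if wanted > free:
            short.append(resource)
    return short


# Illustrative numbers only, as read from `kubectl describe node`.
node_allocatable = {"cpu_m": 1930, "memory_mi": 7000}
node_requested = {"cpu_m": 1700, "memory_mi": 3000}
pod = {"cpu_m": 500, "memory_mi": 512}

print(resources_that_do_not_fit(node_allocatable, node_requested, pod))  # ['cpu_m']
```

Here only 230m CPU is unreserved, so a pod requesting 500m stays Pending even if the node looks idle in kubectl top nodes.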

✅ Q2 — CrashLoopBackOff

Question: Pod is in CrashLoopBackOff. What’s your step-by-step debug?

Answer: Check logs of the crashed container using kubectl logs <pod> --previous. Then describe the pod and look at Last State and Events for liveness probe failures or OOMKilled. Verify env vars, ConfigMaps, Secrets, and the entrypoint command. Most cases are bad config or an app startup failure.
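
The triage order above can be sketched as a small helper that reads the last termination state of each container. It assumes a pod dict shaped like the output of kubectl get pod <pod> -o json (status.containerStatuses[].lastState.terminated); the function name and the wording of the hints are made up for illustration.

```python
# Hypothetical triage helper for a pod in CrashLoopBackOff.

def classify_restarts(pod):
    """Map container name -> short hint based on its last termination."""
    findings = {}
    for cs in pod.get("status", {}).get("containerStatuses", []):
        term = cs.get("lastState", {}).get("terminated")
        if term is None:
            findings[cs["name"]] = "no previous termination recorded"
        elif term.get("reason") == "OOMKilled":
            findings[cs["name"]] = "OOMKilled: check memory limit vs. real usage"
        elif term.get("exitCode") == 0:
            findings[cs["name"]] = "exited cleanly: entrypoint may not be long-running"
        else:
            code = term.get("exitCode")
            findings[cs["name"]] = f"exit code {code}: read logs with --previous"
    return findings


# Illustrative sample, not real cluster output.
sample = {"status": {"containerStatuses": [
    {"name": "api", "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}},
    {"name": "sidecar", "lastState": {"terminated": {"reason": "Error", "exitCode": 1}}},
]}}

for name, hint in classify_restarts(sample).items():
    print(name, "->", hint)
```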

✅ Q3 — Service not reachable internally

Question: Pod cannot reach another service using service name.

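Answer: Check that the service exists and its selector matches the pod labels. Run kubectl get svc and kubectl get endpoints <service>. Test DNS using nslookup <service-name> inside the pod; from another namespace use <service-name>.<namespace>. If endpoints are empty → selector mismatch or no pods Ready.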
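
A short sketch of the selector rule behind the "endpoints empty" case: a Service with an equality selector picks a pod only when every selector key/value pair is present in the pod's labels. The function names and sample labels are hypothetical.

```python
# Hypothetical helpers illustrating Service selector matching.

def selector_matches(selector, pod_labels):
    """True only if every selector pair appears in the pod labels."""
    if not selector:
        # A Service without a selector gets no automatically managed endpoints.
        return False
    return all(pod_labels.get(key) == value for key, value in selector.items())


def explain_mismatch(selector, pod_labels):
    """Return {key: (wanted, found)} for each selector pair that does not match."""
    return {
        key: (value, pod_labels.get(key))
        for key, value in selector.items()
        if pod_labels.get(key) != value
    }


service_selector = {"app": "payments", "tier": "backend"}
pod_labels = {"app": "payment", "tier": "backend"}  # note the typo in "app"

print(selector_matches(service_selector, pod_labels))   # False
print(explain_mismatch(service_selector, pod_labels))   # {'app': ('payments', 'payment')}
```

Extra labels on the pod are fine; a single missing or misspelled label is enough to leave the endpoints list empty.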

✅ Q4 — New version rollout broke app

Question: After deployment update, app started failing. Fastest recovery?

Answer: Run kubectl rollout undo deployment <name>. This reverts to the previous ReplicaSet; confirm with kubectl rollout status deployment <name>. Then inspect the change diff and image tag. Never debug a broken prod version live: roll back first.

✅ Q5 — HPA not scaling

Question: HPA configured but pods not scaling under load.

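Answer: Check that metrics-server is installed and returning data (kubectl top pods). Verify resource requests are defined on the containers, because utilization targets are a percentage of the request. Run kubectl describe hpa <name> to see metric status and events. Without requests, the HPA cannot compute utilization (the target typically shows as <unknown>), so it never scales.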
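
Why requests matter can be seen from the HPA scaling formula in the Kubernetes docs: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), where CPU utilization is usage divided by request. Below is a simplified sketch of that arithmetic; the function names are made up, and it ignores the controller's tolerance band, stabilization window, and min/max replica bounds.

```python
import math

# Simplified illustration of HPA utilization math (not the controller's code).

def utilization_percent(usage_millicores, request_millicores):
    """CPU utilization as a percentage of the container's request."""
    if not request_millicores:
        # This is the "HPA calculations break" case: nothing to divide by.
        raise ValueError("no CPU request set: utilization is undefined")
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas, current_utilization, target_utilization):
    """ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_utilization / target_utilization)


# Illustrative numbers: 3 pods averaging 450m usage with a 250m request,
# and an HPA target of 60% utilization.
util = utilization_percent(450, 250)
print(util)                           # 180.0
print(desired_replicas(3, util, 60))  # 9
```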

✅ Q6 — Node shows NotReady

Question: One node is NotReady. What do you check?

Answer: Describe the node and check Conditions and Events. Connect to the node (SSH or SSM) and check kubelet with systemctl status kubelet and journalctl -u kubelet. Most common causes: CNI failure, disk full, kubelet crash. Also check CloudWatch and systemd logs.
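
Reading node conditions can be sketched as follows, assuming a node dict shaped like kubectl get node <node> -o json. Ready is healthy when "True"; the pressure and network conditions are healthy when "False". The function name and sample data are hypothetical.

```python
# Hypothetical helper: list the node conditions that are not in a healthy state.

def unhealthy_conditions(node):
    problems = []
    for cond in node.get("status", {}).get("conditions", []):
        # Ready should be "True"; MemoryPressure, DiskPressure, PIDPressure
        # and NetworkUnavailable should all be "False".
        expected = "True" if cond["type"] == "Ready" else "False"
        if cond["status"] != expected:
            reason = cond.get("reason", "no reason given")
            problems.append(f'{cond["type"]}={cond["status"]} ({reason})')
    return problems


# Illustrative sample, not real cluster output.
sample_node = {"status": {"conditions": [
    {"type": "MemoryPressure", "status": "False", "reason": "KubeletHasSufficientMemory"},
    {"type": "DiskPressure", "status": "True", "reason": "KubeletHasDiskPressure"},
    {"type": "Ready", "status": "False", "reason": "KubeletNotReady"},
]}}

for line in unhealthy_conditions(sample_node):
    print(line)
```

A Ready status of "Unknown" usually means the control plane stopped hearing from the kubelet, which points to a kubelet crash or a network problem rather than resource pressure.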

✅ Q7 — ImagePullBackOff

Question: Pod cannot pull image from ECR.

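Answer: Check image name, tag, account ID, and region first. Then verify the node IAM role has ECR pull permissions (for example the AmazonEC2ContainerRegistryReadOnly managed policy); the kubelet pulls images with node credentials, so IRSA on the pod does not help here. On Fargate, check the pod execution role instead. For cross-account ECR, check the repository policy; an imagePullSecret is only needed for non-ECR private registries. Test a manual pull on the node if needed (crictl on containerd nodes, docker pull on older Docker-based nodes).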
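
For the "check image name and tag first" step, a small sketch that splits an ECR image reference into account, region, repository, and tag, so a wrong account or region stands out. It assumes the standard <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag> form; the function name and sample values are hypothetical, and digest references (@sha256:...) are deliberately not handled.

```python
import re

# Assumed ECR reference shape; digest-style references will return None.
ECR_IMAGE = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com(?:\.cn)?"
    r"/(?P<repo>[a-z0-9._/-]+)"
    r"(?::(?P<tag>[A-Za-z0-9_.-]+))?$"
)


def parse_ecr_image(image):
    """Return a dict with account, region, repo, tag, or None if malformed."""
    match = ECR_IMAGE.match(image)
    if match is None:
        return None
    parts = match.groupdict()
    # Container runtimes default to the "latest" tag when none is given.
    parts["tag"] = parts["tag"] or "latest"
    return parts


print(parse_ecr_image("123456789012.dkr.ecr.us-east-1.amazonaws.com/team/api:v1.4.2"))
print(parse_ecr_image("12345678901.dkr.ecr.us-east-1.amazonaws.com/team/api"))  # None: 11-digit account
```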

✅ Q8 — ConfigMap updated but app not reflecting change

Question: ConfigMap was updated but the app still uses the old values. Why?

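Answer: ConfigMap changes don't restart pods automatically. Env vars are read once at container start, so they stay stale; mounted volumes are refreshed by the kubelet after a short delay (not for subPath mounts), but the app must re-read the files. Run kubectl rollout restart deployment <name>, or use a checksum annotation on the pod template so a config change triggers a rollout.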
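
The checksum annotation pattern can be sketched like this: hash the ConfigMap data and write the hash into the pod template annotations, so any config change alters the pod template and the Deployment rolls. The annotation key checksum/config is a common convention (for example in Helm charts), not something Kubernetes requires; function names and sample data are hypothetical.

```python
import hashlib
import json

# Hypothetical helpers for the checksum-annotation pattern.

def config_checksum(configmap_data):
    """Stable SHA-256 over the ConfigMap's data, independent of key order."""
    canonical = json.dumps(configmap_data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def annotate_pod_template(deployment, checksum):
    """Set the checksum on spec.template.metadata.annotations (mutates and returns)."""
    annotations = (
        deployment.setdefault("spec", {})
        .setdefault("template", {})
        .setdefault("metadata", {})
        .setdefault("annotations", {})
    )
    annotations["checksum/config"] = checksum  # assumed key, by convention
    return deployment


old = config_checksum({"LOG_LEVEL": "info", "TIMEOUT": "30"})
new = config_checksum({"LOG_LEVEL": "debug", "TIMEOUT": "30"})
print(old != new)  # True: a changed value yields a different annotation

deployment = annotate_pod_template({"kind": "Deployment"}, new)
print(deployment["spec"]["template"]["metadata"]["annotations"]["checksum/config"] == new)  # True
```

The annotation must sit on the pod template, not on the Deployment's own metadata; only a pod template change creates a new ReplicaSet.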

✅ Q9 — LoadBalancer service pending EXTERNAL-IP

Question: Service type LoadBalancer shows pending.

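Answer: Run kubectl describe svc <name> and read the Events. Check that the AWS Load Balancer Controller (or the in-tree cloud provider integration) is running and look at its logs. Verify subnets are tagged for ELB (kubernetes.io/role/elb for internet-facing, kubernetes.io/role/internal-elb for internal). Also check service annotations and the controller's IAM permissions.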
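
The subnet tagging check can be sketched as a filter over subnet tags. It assumes the role tags documented by AWS for load balancer subnet discovery (kubernetes.io/role/elb and kubernetes.io/role/internal-elb with value "1"); the function name, subnet IDs, and data shape are hypothetical.

```python
# Hypothetical helper: which subnets carry the ELB role tag for this scheme?

def subnets_tagged_for_elb(subnets, internal=False):
    role_tag = "kubernetes.io/role/internal-elb" if internal else "kubernetes.io/role/elb"
    # Assumes the tag value "1", as shown in the AWS documentation.
    return [s["id"] for s in subnets if s.get("tags", {}).get(role_tag) == "1"]


# Illustrative data, e.g. assembled from `aws ec2 describe-subnets` output.
subnets = [
    {"id": "subnet-aaa", "tags": {"kubernetes.io/role/elb": "1"}},
    {"id": "subnet-bbb", "tags": {"kubernetes.io/role/internal-elb": "1"}},
    {"id": "subnet-ccc", "tags": {}},
]

print(subnets_tagged_for_elb(subnets))                 # ['subnet-aaa']
print(subnets_tagged_for_elb(subnets, internal=True))  # ['subnet-bbb']
```

An empty result for the scheme the Service asks for is a likely reason for EXTERNAL-IP staying pending.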

✅ Q10 — Pod using too much memory

Question: How do you confirm and control it?

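Answer: Use kubectl top pod <name> --containers to confirm usage. Check requests and limits in the pod spec. With no limit, a pod can consume node memory until the node hits memory pressure and evicts pods. Set memory requests and limits; a container that exceeds its limit is OOMKilled instead of starving the node.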
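
Comparing usage against the limit needs both values in the same unit. A minimal sketch, assuming the common Kubernetes memory suffixes (Ki/Mi/Gi/Ti are powers of 1024, k/M/G/T are powers of 1000); the function names are hypothetical, and rarer forms such as exponent notation are not handled.

```python
# Hypothetical helpers: convert memory quantities and report usage vs. limit.

UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def to_bytes(quantity):
    """Convert strings like '512Mi', '1G' or '128974848' to bytes."""
    # Two-letter binary suffixes are checked before one-letter decimal ones.
    for suffix in ("Ki", "Mi", "Gi", "Ti", "k", "M", "G", "T"):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * UNITS[suffix])
    return int(quantity)


def percent_of_limit(usage, limit):
    """Memory usage as a percentage of the container's limit."""
    return round(100.0 * to_bytes(usage) / to_bytes(limit), 1)


print(to_bytes("512Mi"))                   # 536870912
print(percent_of_limit("256Mi", "512Mi"))  # 50.0
```

A container sitting close to 100% of its limit is a candidate for the next OOMKilled restart.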

Karpenter and Cluster Autoscaler R2