EKS
Deployments

πŸš€ 1️⃣ Types of Deployment Strategies in EKS / Kubernetes

(Kubernetes supports rolling & recreate natively β€” canary & blue-green are implemented using patterns/tools.)


βœ… Rolling Update (Default)

What it is: Pods are replaced gradually β€” new pods come up while old pods are terminated step by step.

How it works: Controlled by maxUnavailable and maxSurge.
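
A minimal sketch of the strategy block (names like payments-api are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments-api        # illustrative name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: payments
  strategy:
    type: RollingUpdate     # the default strategy
    rollingUpdate:
      maxUnavailable: 1     # at most 1 pod below desired count during rollout
      maxSurge: 1           # at most 1 extra pod above desired count
  template:
    metadata:
      labels:
        app: payments
    spec:
      containers:
        - name: api
          image: payments-api:1.2.3   # illustrative image
```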

Use case: Default for stateless microservices.

Pros:

  • zero downtime possible
  • simple
  • built-in
  • resource efficient

Cons:

  • mixed old/new versions during rollout
  • risky if backward compatibility not maintained

Interview line: Rolling update is my default for stateless services with backward-compatible releases.


πŸ”΅πŸŸ’ Blue-Green Deployment

What it is: Two full environments β€” blue (old) and green (new). Traffic switches all at once.

How in EKS: Two deployments + service selector switch or LB switch.
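
A sketch of the service-selector switch, assuming blue and green Deployments labeled with a version label:

```yaml
# Service initially routes to blue; flipping the selector cuts over all traffic at once.
apiVersion: v1
kind: Service
metadata:
  name: payments            # illustrative name
spec:
  selector:
    app: payments
    version: blue           # change to "green" to switch; change back for instant rollback
  ports:
    - port: 80
      targetPort: 8080
```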

Use case:

  • high-risk fintech releases
  • schema-impacting changes
  • need instant rollback

Pros:

  • instant rollback
  • no mixed versions
  • clean validation

Cons:

  • double resource cost
  • environment sync needed

🐀 Canary Deployment

What it is: Release to small % of traffic first β†’ observe β†’ expand.

How in EKS:

  • Argo Rollouts
  • service mesh
  • weighted LB routing
  • ingress weighted rules
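
With Argo Rollouts, the Rollout resource replaces the Deployment and declares the canary steps. A minimal sketch (names and weights are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: payments-api        # illustrative name
spec:
  replicas: 5
  selector:
    matchLabels:
      app: payments
  strategy:
    canary:
      steps:
        - setWeight: 10               # send 10% of traffic to the new version
        - pause: {duration: 5m}       # observe metrics before expanding
        - setWeight: 50
        - pause: {duration: 5m}       # full promotion after the last step
  template:
    metadata:
      labels:
        app: payments
    spec:
      containers:
        - name: api
          image: payments-api:1.2.3   # illustrative image
```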

Use case:

  • critical APIs
  • user-facing flows
  • performance-risk releases

Pros:

  • lowest risk
  • metrics-driven rollout
  • early failure detection

Cons:

  • tooling needed
  • routing complexity

Senior interview line: For customer-facing fintech APIs, I prefer canary with automated metric checks.


πŸ’₯ Recreate

What it is: All old pods killed β†’ then new pods created.
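
In a Deployment spec this is a one-line strategy change:

```yaml
spec:
  strategy:
    type: Recreate   # terminate all old pods before creating any new ones
```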

Use case:

  • stateful singleton
  • batch job
  • incompatible versions

Cons:

  • downtime guaranteed
  • rarely used for APIs

πŸ›‘ 2️⃣ Pod Disruption Budget (PDB)


βœ… What It Is

PDB defines the minimum number of pods that must remain available during voluntary disruptions.


βœ… Protects Against

  • node drain
  • cluster upgrade
  • autoscaler scale-down
  • manual eviction

βœ… Example Use Case

If service has 5 replicas and needs at least 3 alive:

minAvailable: 3

Upgrade won’t evict below 3.
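
The full object might look like this (the selector label is illustrative):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payments-pdb        # illustrative name
spec:
  minAvailable: 3           # voluntary evictions blocked if they would drop below 3
  selector:
    matchLabels:
      app: payments         # must match the pods the Deployment manages
```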


⚠️ Important Interview Point

PDB does NOT protect against:

  • pod crash
  • node crash
  • OOM kill

Only voluntary disruptions.


❌ Common Mistake

Setting minAvailable equal to the replica count leaves zero eviction headroom → node drains and upgrades block completely.


🩺 3️⃣ Readiness vs Liveness Probe

Interviewers love this.


βœ… Readiness Probe β€” β€œCan I receive traffic?”

If fails β†’ pod removed from Service endpoints.

Use case:

  • app started but DB not ready
  • warmup phase
  • dependency checks

Does NOT restart pod.


βœ… Liveness Probe β€” β€œShould I restart pod?”

If fails β†’ kubelet restarts container.

Use case:

  • deadlock
  • stuck thread
  • hung process
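
Both probes sit on the container spec. A sketch, assuming hypothetical /ready and /healthz endpoints:

```yaml
containers:
  - name: api
    image: payments-api:1.2.3       # illustrative image
    readinessProbe:
      httpGet:
        path: /ready                # hypothetical endpoint that also checks dependencies
        port: 8080
      periodSeconds: 5              # failure only removes the pod from Service endpoints
    livenessProbe:
      httpGet:
        path: /healthz              # hypothetical lightweight endpoint (no dependency checks)
        port: 8080
      initialDelaySeconds: 15       # give the app time to start before restart checks begin
      periodSeconds: 10             # failure → kubelet restarts the container
```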

🧠 Senior Interview Line

Readiness controls traffic flow; liveness controls container restart.


❌ Common Mistake

Using the same endpoint for both → a slow dependency that should only pull the pod out of rotation also fails liveness → restart loops.


❀️ 4️⃣ Pod Affinity vs Anti-Affinity


βœ… Pod Affinity

Schedule pods together.

Use case:

  • app + cache
  • tightly coupled services
  • low latency pairing

🚫 Pod Anti-Affinity

Schedule pods apart.

Use case:

  • spread replicas across nodes/AZ
  • high availability

Example: Don’t place same app replicas on same node.
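
A sketch of that rule in the pod spec (the app label is illustrative; swap the topologyKey for topology.kubernetes.io/zone to spread across AZs instead of nodes):

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: payments                      # pods to stay away from
        topologyKey: kubernetes.io/hostname    # at most one such pod per node
```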


⚠️ Tradeoff

Required anti-affinity can leave pods unschedulable if the cluster is small; preferred rules degrade gracefully instead.


πŸ–₯ 5️⃣ Node Selector vs Node Affinity


βœ… Node Selector (Simple)

Match exact label.

nodeSelector:
  disktype: ssd

Simple but rigid.


βœ… Node Affinity (Advanced)

Supports:

  • required rules
  • preferred rules
  • expressions

More flexible.
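
A sketch combining a required rule with a preferred one (the disktype label matches the nodeSelector example above; the zone value is hypothetical):

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:   # hard rule: must match
      nodeSelectorTerms:
        - matchExpressions:
            - key: disktype
              operator: In
              values: [ssd]
    preferredDuringSchedulingIgnoredDuringExecution:  # soft rule: scheduler tries
      - weight: 50
        preference:
          matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values: [us-east-1a]                    # hypothetical preferred zone
```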


🧠 Interview Line

Node selector is exact match; node affinity supports expressive scheduling logic.


☣️ 6️⃣ Taints & Tolerations


βœ… What They Do

Taints repel pods. Tolerations allow pods onto tainted nodes.


βœ… Use Cases

  • dedicated GPU nodes
  • system node pools
  • fintech sensitive workloads
  • spot node pools

Example: Only ML pods tolerate GPU taint.
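
A sketch, assuming the node is tainted with gpu=true:NoSchedule:

```yaml
# Taint applied to the node first, e.g.:
#   kubectl taint nodes gpu-node-1 gpu=true:NoSchedule
# Toleration in the ML pod spec that allows scheduling onto that node:
tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
```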


⚠️ Senior Note

Taint = repel.
Affinity = attract.

Say that β€” interviewers like it.


πŸ“ˆ 7️⃣ Kubernetes Autoscaling Types


πŸ”Ό HPA β€” Horizontal Pod Autoscaler


βœ… Scales

Number of pods.

Based on:

  • CPU
  • memory
  • custom metrics

Best for stateless services.
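
A minimal autoscaling/v2 sketch targeting a hypothetical Deployment:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: payments-hpa          # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payments-api        # illustrative target
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70%
```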


βž• Pros

  • fast
  • common
  • safe
  • works with CA/Karpenter

πŸ“ VPA β€” Vertical Pod Autoscaler


βœ… Scales

Pod resource requests/limits.


⚠️ Important

Applying new requests usually restarts pods. Don't combine with HPA on the same resource metric (e.g. CPU) — they fight each other.

Best for:

  • stateful
  • batch
  • memory-bound apps
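
A sketch, assuming the VPA add-on is installed in the cluster (it is not part of core Kubernetes):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: batch-vpa             # illustrative name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payments-batch      # illustrative target
  updatePolicy:
    updateMode: "Auto"        # VPA may evict pods to apply new requests
```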

🧱 Cluster Autoscaler (or Karpenter)


βœ… Scales

Nodes β€” not pods.

Adds/removes nodes based on pending pods.


βœ… Works With

HPA β†’ creates pods CA/Karpenter β†’ adds nodes


🧠 Interview Line

HPA scales pods, Cluster Autoscaler scales nodes β€” they work together.


🧠 Senior One-Shot Summary Answer

If interviewer asks open:

For safe deployments I use rolling by default, canary or blue-green for high-risk releases. PDB protects availability during voluntary disruption. Readiness controls traffic, liveness controls restart. Affinity/anti-affinity and taints/tolerations control placement and isolation. For scaling, HPA handles pod count, VPA handles resource sizing, and Cluster Autoscaler or Karpenter handles node capacity.


πŸ’¬ Need a Quick Summary?

Hey! Don't have time to read everything? I get it. 😊
Click below and I'll give you the main points and what matters most on this page.
Takes about 5 seconds β€’ Uses Perplexity AI