EKS: EBS-Backed PVCs

Good. Now we’re talking about something that actually breaks in production if you don’t understand it properly.

You’re on EKS, so this is AWS EBS-backed storage. Let’s go layer by layer, then we’ll apply it to real scenarios like MySQL, Kafka, Redis.


🔥 First: Core Concepts (Stop Memorizing, Understand the Flow)

When a Pod needs storage:

  1. Pod → requests PVC
  2. PVC → uses StorageClass
  3. StorageClass → provisions EBS volume
  4. EBS → becomes PV
  5. PV → attaches to node in specific AZ
  6. Pod → must run on a node in that same AZ

That AZ part is where most people screw up.


🧱 Components Explained (EKS Context)

1️⃣ StorageClass

Defines:

  • EBS type (gp3, io2, etc.)
  • Reclaim policy
  • VolumeBindingMode
  • Parameters like IOPS

Example (a typical gp3 class for EKS — note the out-of-the-box default on EKS is actually gp2):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-sc
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3
reclaimPolicy: Delete

Key setting:

🔥 volumeBindingMode

Two options:

Immediate (BAD for multi-AZ clusters)

Volume created immediately when PVC is created. Problem:

  • It picks a random AZ.
  • Pod might schedule in different AZ.
  • Pod stuck in Pending forever.

WaitForFirstConsumer (Correct for EKS)

Volume is created only after Pod is scheduled. This ensures:

  • Pod scheduled to a node in AZ X
  • EBS created in AZ X
  • Everything aligns

In EKS multi-AZ cluster → ALWAYS use WaitForFirstConsumer.


2️⃣ PVC (PersistentVolumeClaim)

App asks for storage:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3-sc
  resources:
    requests:
      storage: 20Gi

ReadWriteOnce = can attach to only one node at a time (EBS limitation).


3️⃣ PV (PersistentVolume)

You usually don’t create PVs manually — the EBS CSI driver provisions them dynamically when a PVC is created.

Each PV represents an actual EBS volume.


⚠️ CRITICAL: EBS Is AZ Bound

EBS volumes:

  • Exist in ONE AZ
  • Can attach to ONE node at a time
  • Cannot move across AZs

If your pod moves to another node in the same AZ → volume detach + re-attach required.
If your pod lands in a different AZ → it cannot mount the volume at all.
If the AZ dies → your pod, and its data, are dead.

This matters massively for databases.
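Under the hood, the AZ pinning is explicit: the CSI-provisioned PV carries a node-affinity rule on the zone topology key. A sketch of what such a PV looks like (fields abridged; the name, volume ID, and zone are placeholders):

```yaml
# Sketch of a dynamically provisioned EBS-backed PV (abridged)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-0a1b2c3d            # placeholder name assigned by the provisioner
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: vol-0123456789abcdef0   # placeholder: the real EBS volume ID
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.ebs.csi.aws.com/zone
              operator: In
              values:
                - ap-south-1a   # the volume's AZ; pods using this PV must land here
```

That `nodeAffinity` block is what forces the scheduler to keep the pod in the volume's AZ.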


🔥 Scenario 1 — MySQL Primary + 3 Read Replicas (EKS StatefulSet)

Let’s design this properly.


Architecture

  • StatefulSet: mysql
  • Replicas: 4 (1 primary + 3 replicas)
  • Each pod needs its own volume
  • Each volume AZ-bound

Why StatefulSet?

Because:

  • Stable network identity
  • Stable PVC per pod
  • Ordered startup

StatefulSet VolumeClaimTemplate

volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: gp3-sc
      resources:
        requests:
          storage: 50Gi

This creates:

| Pod | PVC | EBS Volume | AZ |
|---|---|---|---|
| mysql-0 | mysql-data-mysql-0 | vol-xxx | ap-south-1a |
| mysql-1 | mysql-data-mysql-1 | vol-yyy | ap-south-1b |
| mysql-2 | mysql-data-mysql-2 | vol-zzz | ap-south-1c |
| mysql-3 | mysql-data-mysql-3 | vol-abc | ap-south-1a |

Scheduler spreads pods across AZs.

Each pod gets its own EBS.
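One piece the StatefulSet also needs for its stable network identity is a headless Service (its name must match the StatefulSet's `serviceName`). A minimal sketch, assuming the StatefulSet uses `serviceName: mysql` and labels pods `app: mysql`:

```yaml
# Headless Service: gives each pod a stable DNS name
# (mysql-0.mysql, mysql-1.mysql, ...)
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None        # headless: no virtual IP, per-pod DNS records instead
  selector:
    app: mysql           # assumed pod label
  ports:
    - name: mysql
      port: 3306
```

Replicas point at the primary via its stable DNS name, not a pod IP.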


What Happens If Node Dies?

If node in 1a dies:

  • Kubernetes reschedules mysql-0
  • Must schedule in 1a
  • Because volume exists in 1a

If no node available in 1a → pod Pending forever

This is why:

  • You MUST have worker nodes in all AZs
  • You MUST use WaitForFirstConsumer

⚠️ Hard Truth: EKS + EBS Is NOT HA Across AZ

EBS does NOT replicate across AZ.

Your HA depends on:

  • MySQL replication
  • Not EBS replication

If AZ dies:

  • Primary in that AZ dies
  • You must promote replica in other AZ manually or via operator

🔥 Scenario 2 — 2 MySQL Masters

Now it gets tricky.

EBS still:

  • One volume per pod
  • One AZ per volume

Each master:

  • Has independent EBS
  • Replicates via MySQL clustering

Storage is NOT shared. It’s replicated at database layer.

Never try to share EBS between pods. It won’t work.


🔥 Scenario 3 — Kafka Cluster (3 Brokers)

Kafka absolutely requires:

  • One volume per broker
  • High IOPS
  • Consistent latency

You’d point each broker’s volumeClaimTemplate at a gp3-backed class:

storageClassName: gp3-sc   # a StorageClass whose parameters set type: gp3

Better option:

  • Use io2 for production
  • Or gp3 with provisioned IOPS
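A hedged sketch of an io2 class for brokers (the class name and IOPS figure are illustrative; the EBS CSI driver takes `iops` as a string parameter):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: io2-kafka          # illustrative name
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain      # keep broker data if the PVC is deleted
parameters:
  type: io2
  iops: "4000"             # illustrative; size for your broker throughput
```

Same shape as the gp3 class, just swapping the volume type and pinning IOPS.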

Each broker:

  • Own EBS
  • AZ bound
  • Kafka handles replication

Again: Storage is local per broker. Replication happens at Kafka level.

If broker in AZ A dies:

  • Partition replicas in B and C take over

But: If entire AZ dies → you lose that broker’s data copy

That’s why Kafka replication factor = 3 minimum.


🔥 Scenario 4 — Redis

If it’s:

Standalone Redis:

Single pod + EBS

Bad HA story.

Redis Sentinel / Cluster:

Each pod:

  • Own PVC
  • Own EBS
  • Replication at Redis level

Same AZ constraints apply.


🔥 Very Important: Pod Rescheduling Behavior

When a pod restarts:

  1. Kubernetes finds existing PVC
  2. PVC bound to specific PV
  3. PV bound to EBS
  4. EBS tied to AZ
  5. Pod must land in same AZ

Scheduler enforces topology constraint automatically.

If you don’t have node capacity in that AZ: Pod stuck forever.

That’s why: AutoScaling Groups must exist in all AZs.
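The common pattern is one node group per AZ, so the cluster autoscaler can scale capacity in exactly the AZ a stuck pod's volume lives in. A sketch of an eksctl config along those lines (cluster name, region, and sizes are illustrative):

```yaml
# Sketch: one managed node group per AZ (names/sizes illustrative)
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: prod
  region: ap-south-1
managedNodeGroups:
  - name: ng-1a
    availabilityZones: ["ap-south-1a"]
    minSize: 1
    maxSize: 3
  - name: ng-1b
    availabilityZones: ["ap-south-1b"]
    minSize: 1
    maxSize: 3
  - name: ng-1c
    availabilityZones: ["ap-south-1c"]
    minSize: 1
    maxSize: 3
```

A single node group spanning all three AZs is worse here: the autoscaler can't guarantee the new node appears in the AZ the volume needs.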


🔥 Reclaim Policy

In StorageClass:

Delete

When PVC deleted → EBS deleted

Dangerous for production DB.

Retain

PVC deleted → EBS remains

Better for production databases.
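Note that `reclaimPolicy` on the StorageClass only applies to newly provisioned volumes. For a PV that already exists, you flip the policy on the PV object itself:

```yaml
# Fragment: set directly on an existing PV's spec
spec:
  persistentVolumeReclaimPolicy: Retain
```

Worth doing on any already-running production database volume before you touch its PVCs.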


🔥 When EBS Is the Wrong Choice

EBS:

  • Single AZ
  • RWO only
  • Block storage

If you need:

  • Shared filesystem
  • Multi-pod write
  • Cross-AZ mounting

Use:

  • EFS (RWX)
  • FSx
  • S3 (object storage)
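If you do need RWX on EKS, the EFS CSI driver is the usual route. A hedged sketch of an EFS StorageClass (the file-system ID is a placeholder you'd replace with your own):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap            # dynamic provisioning via EFS access points
  fileSystemId: fs-0123456789abcdef0  # placeholder: your EFS file system ID
  directoryPerms: "700"
```

Because EFS is a regional NFS filesystem, PVCs against this class can use ReadWriteMany and mount from any AZ — the trade-off is latency, not topology.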

🔥 Production Advice (No Sugarcoating)

If you're running:

  • Production MySQL
  • Production Kafka
  • Production Redis

On EKS with raw StatefulSets…

You’re reinventing the wheel.

Better:

  • Use RDS for MySQL
  • Use MSK for Kafka
  • Use ElastiCache for Redis

Unless:

  • You absolutely need Kubernetes-native control
  • You accept operational complexity

Running stateful workloads in EKS with EBS is doable. But you need to understand failure domains deeply.


🎯 Summary

| Component | Purpose |
|---|---|
| StorageClass | Defines how volumes are created |
| PVC | App's request for storage |
| PV | Actual volume object |
| EBS | Real AWS disk |
| WaitForFirstConsumer | Prevents AZ mismatch |
| RWO | Only one node can attach |

If you want next level:

I can explain:

  • What happens during node drain
  • How volume detach/attach timing works
  • How topologySpreadConstraints interact with storage
  • How to design truly HA MySQL on EKS
  • Or how to debug “pod stuck in Pending due to volume node affinity conflict”

Tell me which direction you want.

