EKS
Karpenter and Cluster Autoscaler

🚀 Karpenter: Deep Interview Guide (EKS)

🔹 What is Karpenter?

Karpenter is a Kubernetes-native node provisioner that launches nodes directly from pod requirements. On EKS it watches for unschedulable pods and creates right-sized EC2 instances as soon as it sees them; each node still has to boot and join the cluster before the pod can run. It does not depend on Auto Scaling Groups, which usually makes it faster and more flexible than Cluster Autoscaler.


🔹 Why Karpenter Was Created

Cluster Autoscaler scales whole node groups rather than provisioning for individual pods, so new capacity only arrives in predefined shapes. That causes delays and waste. Karpenter instead picks instance types that match the CPU, memory, and scheduling constraints of the pending pods, which reduces cost and startup time.


🔹 How Karpenter Works (Real Flow)

  1. Pod becomes unschedulable (the scheduler sets PodScheduled=False with reason Unschedulable)
  2. Karpenter watches for these pending pods
  3. It calculates the resources and constraints they require
  4. Selects the best-fitting EC2 instance type from the allowed list
  5. Launches the instance directly via the EC2 API
  6. Node joins the cluster → pod is scheduled

No ASG is involved in the decision loop.
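
As a rough illustration of steps 3 to 5, the sketch below sums the requests of the pending pods and picks the cheapest allowed instance type that fits. The catalog, the prices, and the `pick_instance` helper are assumptions made up for this example. Real Karpenter also accounts for daemonsets, system overhead, and scheduling constraints, and it may launch several nodes instead of one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpu: float
    memory_gib: float
    hourly_usd: float  # illustrative price, not real AWS pricing


# Hypothetical allowed list; in Karpenter this is bounded by NodePool requirements.
ALLOWED = [
    InstanceType("m5.large", 2, 8, 0.10),
    InstanceType("m5.xlarge", 4, 16, 0.20),
    InstanceType("c5.2xlarge", 8, 16, 0.34),
    InstanceType("r5.2xlarge", 8, 64, 0.50),
]


def pick_instance(pending_pods, allowed=ALLOWED):
    """Return the cheapest allowed type that fits the summed pod requests, or None."""
    need_cpu = sum(p["cpu"] for p in pending_pods)
    need_mem = sum(p["memory_gib"] for p in pending_pods)
    fits = [t for t in allowed if t.vcpu >= need_cpu and t.memory_gib >= need_mem]
    return min(fits, key=lambda t: t.hourly_usd) if fits else None


pods = [{"cpu": 1.5, "memory_gib": 4}, {"cpu": 2, "memory_gib": 6}]
choice = pick_instance(pods)
print(choice.name if choice else "no single type fits")  # prints m5.xlarge
```

The two pods need 3.5 vCPU and 10 GiB in total, so the 2-vCPU type is ruled out and the cheapest remaining type wins.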


🔹 Karpenter Core Objects

NodePool (Provisioner in older API versions): defines constraints like:

  • instance types
  • spot/on-demand
  • zones
  • CPU/memory limits
  • labels/taints

NodeClass (EC2NodeClass on AWS; AWSNodeTemplate in older API versions): defines infra config:

  • subnets
  • security groups
  • AMI family
  • IAM role
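
A minimal sketch of how the two objects fit together, written as Python dicts and printed as JSON (which `kubectl apply -f` accepts). The field names follow the `karpenter.sh/v1` and `karpenter.k8s.aws/v1` APIs as I understand them; older releases use different fields, so check the docs for your installed version. The object name `default`, the zones, the taint, the tag value `my-cluster`, and the role name are placeholders invented for this example.

```python
import json

# Placeholder values (assumptions): "default", "my-cluster", the zones,
# the taint, and the IAM role name do not come from any real cluster.
node_pool = {
    "apiVersion": "karpenter.sh/v1",
    "kind": "NodePool",
    "metadata": {"name": "default"},
    "spec": {
        "template": {
            "spec": {
                # Link to the infra config object defined below.
                "nodeClassRef": {
                    "group": "karpenter.k8s.aws",
                    "kind": "EC2NodeClass",
                    "name": "default",
                },
                "requirements": [
                    {"key": "karpenter.sh/capacity-type", "operator": "In",
                     "values": ["spot", "on-demand"]},
                    {"key": "karpenter.k8s.aws/instance-category", "operator": "In",
                     "values": ["c", "m", "r"]},
                    {"key": "topology.kubernetes.io/zone", "operator": "In",
                     "values": ["us-east-1a", "us-east-1b"]},
                ],
                "taints": [
                    {"key": "workload", "value": "batch", "effect": "NoSchedule"},
                ],
            }
        },
        # Hard ceiling on the total capacity this NodePool may provision.
        "limits": {"cpu": "200", "memory": "800Gi"},
    },
}

ec2_node_class = {
    "apiVersion": "karpenter.k8s.aws/v1",
    "kind": "EC2NodeClass",
    "metadata": {"name": "default"},
    "spec": {
        "role": "KarpenterNodeRole-my-cluster",
        "amiSelectorTerms": [{"alias": "al2023@latest"}],
        "subnetSelectorTerms": [{"tags": {"karpenter.sh/discovery": "my-cluster"}}],
        "securityGroupSelectorTerms": [{"tags": {"karpenter.sh/discovery": "my-cluster"}}],
    },
}

for manifest in (node_pool, ec2_node_class):
    print(json.dumps(manifest, indent=2))
```

The split mirrors the lists above: scheduling rules and limits live in the NodePool, while subnets, security groups, AMI, and IAM role live in the node class it references.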

🔹 Karpenter Production Use Cases

  • Spiky workloads
  • Mixed instance fleets
  • Spot-heavy clusters
  • Batch + ML workloads
  • Cost-optimized scaling
  • Fast burst scaling

Bad fit: very static predictable workloads.


🔹 Karpenter Cost Optimization Features

  • Instance type flexibility
  • Spot-first strategies
  • Right-sized nodes
  • Consolidation (removes underutilized nodes)
  • Bin-packing pods efficiently

Cluster Autoscaler can remove underutilized nodes, but it cannot swap them for cheaper instance types, so it cannot reach this level of packing.
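
To make consolidation concrete, here is a simplified sketch of the core question: can every pod on a candidate node fit into the free capacity of the other nodes? The node and pod dictionaries and the `can_consolidate` helper are assumptions for illustration. Karpenter's real logic also considers disruption budgets, PDBs, scheduling constraints, and replacing a node with a cheaper one.

```python
def can_consolidate(candidate, others):
    """True if all pods on `candidate` fit into spare capacity on `others`.

    Uses first-fit with the largest pods placed first. This is only a heuristic.
    """
    free = [
        {
            "cpu": n["cpu"] - sum(p["cpu"] for p in n["pods"]),
            "mem": n["mem"] - sum(p["mem"] for p in n["pods"]),
        }
        for n in others
    ]
    largest_first = sorted(
        candidate["pods"], key=lambda p: (p["cpu"], p["mem"]), reverse=True
    )
    for pod in largest_first:
        for slot in free:
            if slot["cpu"] >= pod["cpu"] and slot["mem"] >= pod["mem"]:
                slot["cpu"] -= pod["cpu"]
                slot["mem"] -= pod["mem"]
                break
        else:
            return False  # no node has room for this pod
    return True


node_a = {"cpu": 4, "mem": 16, "pods": [{"cpu": 1, "mem": 2}]}
node_b = {"cpu": 4, "mem": 16, "pods": [{"cpu": 2, "mem": 4}]}
node_c = {"cpu": 4, "mem": 16, "pods": [{"cpu": 3.5, "mem": 12}]}

print(can_consolidate(node_a, [node_b, node_c]))  # True: node_b has room
print(can_consolidate(node_c, [node_a, node_b]))  # False: 3.5 CPU fits nowhere
```

The check depends entirely on pod requests, which is why consolidation works poorly when requests are missing or inflated.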


🔹 Karpenter Risks / Operational Notes

  • Needs correct constraints or it may launch expensive instances
  • Must define resource limits in the NodePool (Provisioner in older versions)
  • Spot interruption handling must be configured (on AWS, an SQS interruption queue)
  • More powerful = more dangerous if misconfigured

βš–οΈ Cluster Autoscaler β€” Deep Interview Guide


🔹 What is Cluster Autoscaler?

Cluster Autoscaler scales existing node groups (ASGs) when pods are unschedulable. It raises or lowers each ASG's desired capacity. It cannot pick instance types beyond what each node group already defines.


🔹 How Cluster Autoscaler Works

  1. Detects unschedulable pods
  2. Simulates scheduling against each node group
  3. Picks a matching group
  4. Increases that ASG's desired capacity
  5. Waits for the node to join

Scale-up speed depends on the ASG and its launch template, plus node boot time.
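
The loop above can be sketched as a toy simulation. The group definitions and the `scale_up` helper are hypothetical, and picking the first matching group is a simplification: the real Cluster Autoscaler chooses between candidate groups with a configurable expander (for example least-waste or priority).

```python
def scale_up(node_groups, pod):
    """Bump desired capacity on the first group that fits the pod and has headroom.

    Returns the chosen group name, or None if no group can help.
    """
    for group in node_groups:
        fits = group["node_cpu"] >= pod["cpu"] and group["node_mem"] >= pod["mem"]
        if fits and group["desired"] < group["max"]:
            group["desired"] += 1  # the ASG then launches one more node
            return group["name"]
    return None


groups = [
    {"name": "general", "node_cpu": 4, "node_mem": 16, "desired": 3, "max": 3},
    {"name": "large", "node_cpu": 16, "node_mem": 64, "desired": 1, "max": 5},
]

chosen = scale_up(groups, {"cpu": 2, "mem": 4})
print(chosen)  # prints large
```

Here the small group is already at its maximum, so a 2-CPU pod triggers a 16-CPU node. That is the kind of waste that fixed node shapes can cause.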


🔹 Cluster Autoscaler Strengths

  • Simple model
  • Stable and mature
  • Easy to reason about
  • Good for predictable workloads
  • Tight ASG integration

🔹 Cluster Autoscaler Limitations

  • Only scales predefined node groups
  • Cannot choose new instance types dynamically
  • Slower reaction time (every scale-up goes through the ASG)
  • More wasted capacity (node sizes are fixed per group)
  • Harder spot optimization

🥊 Karpenter vs Cluster Autoscaler (Interview Table)

Area              | Karpenter          | Cluster Autoscaler
------------------+--------------------+---------------------
Provisioning      | Direct EC2 launch  | ASG scale
Speed             | Faster             | Slower
Instance choice   | Dynamic            | Fixed per node group
Cost optimization | Strong             | Limited
Spot handling     | Advanced           | Basic
Complexity        | Higher             | Lower
Best for          | Dynamic workloads  | Stable workloads

Interview punchline: CA scales groups. Karpenter scales nodes.


🧱 EKS Node Groups: Full Interview Guide



🔹 What is an EKS Node Group?

A node group is a set of worker nodes managed together and backed by an Auto Scaling Group. All nodes in the group share configuration such as the AMI, instance type(s), and IAM role.


🔹 Types of Node Groups

✅ Managed Node Group

AWS manages the node lifecycle and upgrades. Easier operations. Recommended default.

✅ Self-Managed Node Group

You manage the ASG, AMI, and bootstrap yourself. More control, more work.

✅ Fargate Profile

Serverless: no EC2 nodes to manage. Pods that match the profile run on AWS-managed Fargate capacity, one pod per Fargate node.


🔹 Managed Node Group Features

  • AMI updates rolled out by EKS once you trigger them
  • Rolling node upgrades
  • Integrated with EKS APIs
  • Health checks
  • Easier lifecycle management

🔹 When to Use Multiple Node Groups

  • Spot vs On-Demand split
  • GPU workloads
  • Memory-heavy apps
  • Team isolation
  • Different taints/labels
  • Different instance families

🔹 Spot Node Groups Best Practice

Use a mixed-instance ASG (several instance types per group). Add taints to spot nodes, and add tolerations only to workloads that can survive an interruption. Never let critical pods land on spot by accident.
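
The taint and toleration rule can be sketched as below, following the standard Kubernetes matching semantics. The taint key `capacity-type=spot` is a hypothetical choice for this example; use whatever key your node groups actually apply.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    # An empty toleration effect matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    if not toleration.get("key"):
        # An empty key with operator Exists matches every taint.
        return operator == "Exists"
    if toleration["key"] != taint["key"]:
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def can_schedule(pod_tolerations, node_taints):
    """A pod may land on a node only if every hard taint is tolerated."""
    hard = [t for t in node_taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in pod_tolerations) for taint in hard)


# Hypothetical taint applied to every node in the spot node group.
spot_taint = {"key": "capacity-type", "value": "spot", "effect": "NoSchedule"}

batch_tolerations = [
    {"key": "capacity-type", "operator": "Equal", "value": "spot", "effect": "NoSchedule"}
]
critical_tolerations = []  # critical pods carry no spot toleration

print(can_schedule(batch_tolerations, [spot_taint]))     # True
print(can_schedule(critical_tolerations, [spot_taint]))  # False
```

Because the critical pod has no matching toleration, the scheduler keeps it off spot nodes without any extra configuration on the pod itself.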


🔹 Node Group Upgrade Best Practice

Never upgrade production nodes in place blindly. Create a new node group → cordon/drain the old one → let workloads migrate → delete the old group. A blue/green node group strategy is the safest.
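
A small sketch that prints the command sequence for the old-group side of a blue/green swap. It assumes the new node group already exists and is Ready, and that the nodes carry the `eks.amazonaws.com/nodegroup` label that EKS managed node groups apply. The cluster and group names are placeholders, and the `blue_green_commands` helper is hypothetical. The commands are only printed, not executed, so review them before running anything.

```python
def blue_green_commands(cluster, old_group, label="eks.amazonaws.com/nodegroup"):
    """Return the ordered shell commands to retire `old_group` (nothing is executed)."""
    selector = f"{label}={old_group}"
    return [
        # 1. Stop new pods from landing on the old nodes.
        f"kubectl cordon -l {selector}",
        # 2. Evict pods so they reschedule onto the new group (PDBs are respected).
        f"kubectl drain -l {selector} --ignore-daemonsets --delete-emptydir-data",
        # 3. Verify that nothing is stuck Pending before deleting capacity.
        "kubectl get pods -A --field-selector=status.phase=Pending",
        # 4. Remove the old group once workloads are healthy.
        f"aws eks delete-nodegroup --cluster-name {cluster} --nodegroup-name {old_group}",
    ]


commands = blue_green_commands("prod-cluster", "ng-old")
for cmd in commands:
    print(cmd)
```

Keeping the delete as a separate last step leaves a rollback path: uncordon the old nodes if the new group misbehaves.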


🔹 Node Group vs Karpenter Mental Model

Node group = fixed capacity pool
Karpenter = on-demand capacity builder

Node groups are supply-driven. Karpenter is demand-driven.


Below are 10 high-value Karpenter interview questions with practical 3-5 line answers, aimed at senior DevOps / EKS interviews rather than marketing-level material.


🚀 Karpenter: Top 10 Important Interview Questions


✅ 1. What problem does Karpenter solve in EKS?

Karpenter solves slow and inefficient node scaling in Kubernetes by provisioning nodes directly based on pending pod requirements. Instead of scaling fixed node groups, it launches right-sized EC2 instances on demand. This reduces scheduling delay and unused capacity. It is designed for dynamic, bursty workloads.


✅ 2. How is Karpenter different from Cluster Autoscaler at core level?

Cluster Autoscaler scales Auto Scaling Groups. Karpenter launches EC2 instances directly via AWS APIs without depending on ASGs. CA chooses from predefined node groups, while Karpenter selects instance types dynamically. Karpenter is demand-driven; CA is group-driven.


✅ 3. How does Karpenter decide which instance type to launch?

Karpenter reads the resource requests and scheduling constraints of unschedulable pods. It evaluates CPU, memory, architecture, zones, capacity type (spot/on-demand), and allowed instance families. Then it picks a low-cost instance type that fits. The candidate set is bounded by the NodePool (formerly Provisioner) requirements.


✅ 4. What are the main Karpenter custom resources?

The main objects are NodePool (Provisioner in older versions) and EC2NodeClass (AWSNodeTemplate in older versions). NodePool defines scheduling and capacity rules like instance types, zones, limits, and taints. EC2NodeClass defines AWS infra config like subnets, security groups, AMI, and IAM role. Together they control how nodes are created.


✅ 5. How does Karpenter handle spot instances in production?

When a NodePool allows both capacity types, Karpenter prefers spot and falls back to on-demand. It can choose from multiple spot instance types to reduce interruption risk. With an interruption queue configured, it reacts to interruption notices and drains nodes before termination. You should still use PDBs and priorities for safety.


✅ 6. What is Karpenter consolidation?

Consolidation is a cost optimization feature where Karpenter removes underutilized nodes or replaces them with cheaper ones. It reschedules pods onto fewer nodes when possible, which reduces cloud cost automatically. It works best when pod requests are properly defined, because requests drive the packing decision.


✅ 7. What are common misconfigurations that make Karpenter dangerous?

Not setting resource limits in NodePool can lead to unlimited node creation. Allowing all instance types may launch very expensive machines. Missing taints can mix critical and batch workloads. No budget or disruption controls can cause aggressive node churn.
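
These checks can be turned into a small lint pass over a NodePool manifest loaded as a Python dict. This is a sketch, not an official tool: the `lint_nodepool` helper is hypothetical, and the field paths and label keys assume the `karpenter.sh/v1` NodePool layout, so adjust them for other API versions.

```python
# Requirement keys that narrow which instance types may be launched (assumed set).
INSTANCE_KEYS = {
    "node.kubernetes.io/instance-type",
    "karpenter.k8s.aws/instance-family",
    "karpenter.k8s.aws/instance-category",
}


def lint_nodepool(nodepool):
    """Return a list of human-readable warnings for risky NodePool settings."""
    spec = nodepool.get("spec", {})
    template = spec.get("template", {}).get("spec", {})
    warnings = []
    if not spec.get("limits"):
        warnings.append("no limits: total provisioned capacity is unbounded")
    keys = {req.get("key") for req in template.get("requirements", [])}
    if not keys & INSTANCE_KEYS:
        warnings.append("no instance requirements: any instance type may be launched")
    if not template.get("taints"):
        warnings.append("no taints: critical and batch pods may share these nodes")
    if not spec.get("disruption", {}).get("budgets"):
        warnings.append("no explicit disruption budgets: node churn is left to defaults")
    return warnings


risky = {"spec": {"template": {"spec": {"requirements": []}}}}
for warning in lint_nodepool(risky):
    print("WARN:", warning)
```

Running a check like this in CI before applying manifests is one way to catch an unbounded NodePool before it reaches production.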


✅ 8. How does Karpenter interact with Kubernetes scheduler?

Karpenter does not replace the scheduler. The scheduler first tries to place pods and marks those it cannot place as unschedulable. Karpenter watches for these pods and provisions capacity for them. Once the node joins, the scheduler places the pod normally.
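
The "unschedulable" signal is a pod status condition set by the scheduler. The sketch below checks for it on pod objects represented as plain dicts, in the shape returned by `kubectl get pods -o json`. The sample pods and the `is_unschedulable` helper are made up for illustration.

```python
def is_unschedulable(pod):
    """True if the scheduler has marked this pod as unschedulable."""
    for cond in pod.get("status", {}).get("conditions", []):
        if (
            cond.get("type") == "PodScheduled"
            and cond.get("status") == "False"
            and cond.get("reason") == "Unschedulable"
        ):
            return True
    return False


# Hypothetical pods for the example.
pods = [
    {
        "metadata": {"name": "api-1"},
        "status": {"conditions": [{"type": "PodScheduled", "status": "True"}]},
    },
    {
        "metadata": {"name": "batch-7"},
        "status": {
            "conditions": [
                {"type": "PodScheduled", "status": "False", "reason": "Unschedulable"}
            ]
        },
    },
]

pending = [p["metadata"]["name"] for p in pods if is_unschedulable(p)]
print(pending)  # prints ['batch-7']
```

Only pods carrying this condition are candidates for new capacity; pods that are merely starting up or pulling images are ignored.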


✅ 9. When should you NOT use Karpenter?

Avoid Karpenter if workloads are very stable and predictable; node groups are simpler there. Also avoid it when strict instance standardization is required. Teams without strong Kubernetes experience may misconfigure it and cause cost spikes.


✅ 10. Can Karpenter and Cluster Autoscaler run together?

Technically yes, but it is usually not recommended for the same workloads, because the two can conflict in scaling behavior. Some teams keep CA for fixed system node groups and use Karpenter for dynamic workloads. Clear separation is required.


