Why is persistence important for stateful applications in Kubernetes? #
- Pod Storage is Ephemeral: Data written to a container's filesystem is lost when the Pod is restarted, deleted, or rescheduled to another node
- Retain Important Data Outside the Pod: Store databases, logs, and user files in persistent storage to avoid data loss
- Needed for Stateful Apps: Required for apps like MySQL, PostgreSQL, and Kafka that depend on consistent storage
- Enables Backups and Disaster Recovery: Persistent storage supports snapshots and recovery strategies to restore lost or corrupted data (see the snapshot sketch after this list)
- Ensures High Availability and Reliability: Keeps data available through node failures and Pod rescheduling
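As a minimal sketch of the snapshot capability mentioned above: the VolumeSnapshot API is standard, but this assumes the cluster has the external snapshot controller and CRDs installed and a CSI driver that supports snapshots; the snapshot class and PVC names are placeholders.
# Snapshot an existing PVC (requires snapshot controller + CSI snapshot support)
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: data-pvc-snapshot
spec:
  volumeSnapshotClassName: csi-snapclass  # placeholder snapshot class name
  source:
    persistentVolumeClaimName: data-pvc   # the PVC to snapshot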
What is the need for Volumes in Kubernetes? #
- Why Use Kubernetes Volumes?: They let containers in a Pod access and share data through a filesystem that outlives individual container restarts
- Persist Container Data: Avoids data loss when containers crash or restart by keeping data outside the container's writable layer
- Support Multi-Container Pods: Lets containers within the same Pod share files easily using a common volume (see the sketch after this list)
- Enable Shared Storage Across Pods: Volumes like NFS help share data between Pods, even on different nodes
- Use Configs and Secrets as Files: Automatically load configuration files or credentials using ConfigMaps and Secrets
- Provide Temporary Working Space: Use emptyDir volumes for scratch space that lives as long as the Pod does
- Mount Read-Only Data: Share static content from a container image without modifying it
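A minimal sketch of the multi-container case: two containers in one Pod share an emptyDir volume, one writing and one reading. The image, names, and paths are illustrative, not a production setup.
apiVersion: v1
kind: Pod
metadata:
  name: shared-scratch-pod
spec:
  containers:
  - name: writer
    image: busybox
    # Append a timestamp to the shared file every 5 seconds
    command: [ "sh", "-c", "while true; do date >> /shared/log.txt; sleep 5; done" ]
    volumeMounts:
    - name: shared
      mountPath: /shared
  - name: reader
    image: busybox
    # Follow the file written by the other container
    command: [ "sh", "-c", "touch /shared/log.txt; tail -f /shared/log.txt" ]
    volumeMounts:
    - name: shared
      mountPath: /shared
  volumes:
  - name: shared
    emptyDir: {}  # Deleted when the Pod is removed from the node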
Give examples of different types of volumes in Kubernetes #
- emptyDir: Temporary volume that provides scratch space for a Pod; data is deleted when the Pod is removed from the node
- hostPath: Mounts a file or directory from the host node’s filesystem into the Pod
- nfs: Mounts a shared NFS (Network File System) volume; allows multiple Pods across nodes to access the same files
- configMap: Mounts configuration key-value pairs into the container as files
- persistentVolumeClaim: Mounts persistent storage requested via a PersistentVolumeClaim; Kubernetes binds the claim to a suitable PersistentVolume
# Sample Kubernetes Pod using different volume types
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo-pod
spec:
  containers:
  - name: app-container
    image: busybox
    command: [ "sleep", "3600" ]
    volumeMounts:
    # Mount ConfigMap as files
    - name: config-vol
      mountPath: /etc/config
    # Mount temporary scratch space
    - name: temp-vol
      mountPath: /tmp/scratch
    # Mount host path directory
    - name: host-vol
      mountPath: /mnt/host
    # Mount NFS volume
    - name: nfs-vol
      mountPath: /mnt/nfs
    # Mount persistent volume claim
    - name: pvc-vol
      mountPath: /mnt/data
  volumes:
  # 1. configMap volume
  - name: config-vol
    configMap:
      name: app-config
      # ConfigMap must exist in the same namespace
  # 2. emptyDir volume (lives as long as the Pod)
  - name: temp-vol
    emptyDir: {}  # Default: node's disk is used
    # Optionally use memory:
    # emptyDir:
    #   medium: Memory
  # 3. hostPath volume
  - name: host-vol
    hostPath:
      path: /var/log
      type: Directory
  # 4. nfs volume
  - name: nfs-vol
    nfs:
      server: 10.0.0.12
      path: /shared-data
      # NFS server must allow client access
  # 5. persistentVolumeClaim
  - name: pvc-vol
    persistentVolumeClaim:
      claimName: data-pvc
      # The PVC must exist before Pod creation
---
# Sample ConfigMap (used above)
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  app.properties: |
    mode=production
    max_connections=50
---
# Sample PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard
What is Static Volume Provisioning? #
- Define PersistentVolumes Manually: In static provisioning, cluster admins pre-create PersistentVolumes (PVs) with fixed size, storage type, and access modes
- Claim Storage with PVCs: Users create PersistentVolumeClaims (PVCs) to request storage; Kubernetes matches the PVC to a suitable pre-created PV
- Use Matching Rules for Binding: Kubernetes binds PVC to PV based on criteria like storage size, access mode, etc.
- No Automation for Volume Creation: Unlike dynamic provisioning, the PV must already exist—Kubernetes does not create the volume automatically
- Suitable for Controlled Environments: Best used when storage is managed externally (e.g., existing NFS shares) and admins want full control over provisioning
# Static Volume Provisioning Example
# ----------------------------------
# Admin creates the PersistentVolume (PV) manually
# User creates a PersistentVolumeClaim (PVC)
# Kubernetes binds PVC to a matching PV

# Step 1: PersistentVolume (PV)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-pv
spec:
  capacity:
    storage: 1Gi              # Size of volume
  accessModes:
  - ReadWriteOnce             # One node read/write
  persistentVolumeReclaimPolicy: Retain
  # ReclaimPolicy options:
  # - Retain: Keep data after PVC deleted
  # - Delete: Remove volume with PVC
  storageClassName: manual
  hostPath:
    path: "/mnt/data"
    type: DirectoryOrCreate
  # For real use, replace hostPath with e.g.:
  # nfs:
  #   server: 10.0.0.10
  #   path: /shared-nfs-path
---
# Step 2: PersistentVolumeClaim (PVC)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: static-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: manual
  # Must match storageClassName in PV
---
# Step 3: Pod using the PVC
apiVersion: v1
kind: Pod
metadata:
  name: pvc-pod
spec:
  containers:
  - name: app
    image: busybox
    command: [ "sleep", "3600" ]
    volumeMounts:
    - mountPath: "/usr/share/data"
      name: storage-vol
  volumes:
  - name: storage-vol
    persistentVolumeClaim:
      claimName: static-pvc
What is Dynamic Volume Provisioning? #
- Provision Storage Automatically: Dynamic provisioning creates storage volumes on demand when users create PersistentVolumeClaims
- Avoid Manual Volume Creation: Eliminates the need for admins to manually create PersistentVolumes or interact with storage providers
- Use StorageClass for Provisioning Rules: Admins define StorageClass objects that specify which provisioner to use and its parameters
- Offer Multiple Storage Flavors: Allows clusters to support different types of storage (e.g., fast SSD, standard HDD, cloud volumes) using different StorageClass configurations
- Simplify Storage for Users: Users only need to create a PVC; Kubernetes handles the rest using the defined StorageClass configuration
# Dynamic Volume Provisioning Example
# -----------------------------------
# Admin creates StorageClass once
# Users create PVCs referencing the StorageClass
# Kubernetes dynamically provisions a PV

# Step 1: StorageClass (admin-created, using a CSI provisioner)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage
provisioner: ebs.csi.aws.com  # ✅ Recommended driver for AWS EBS
# For other providers:
# - GCP: pd.csi.storage.gke.io
# - Azure: disk.csi.azure.com
parameters:
  type: gp3     # AWS EBS volume type
  fsType: ext4  # Filesystem to use
reclaimPolicy: Delete
# - Retain: keep data after PVC is deleted
# - Delete: auto-remove volume with PVC
# - Recycle: legacy; not supported by most CSI drivers
volumeBindingMode: WaitForFirstConsumer
# - Immediate: bind volume as soon as PVC is created
# - WaitForFirstConsumer: bind only when Pod is scheduled
---
# Step 2: PersistentVolumeClaim (user-created)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: fast-storage
  # Must match StorageClass name
---
# Step 3: Pod using the PVC
apiVersion: v1
kind: Pod
metadata:
  name: dynamic-pod
spec:
  containers:
  - name: app
    image: busybox
    command: [ "sleep", "3600" ]
    volumeMounts:
    - mountPath: "/data"
      name: dynamic-vol
  volumes:
  - name: dynamic-vol
    persistentVolumeClaim:
      claimName: dynamic-pvc
Kubernetes Object - PersistentVolumeClaim (PVC) #
- Request Storage: Allows users to request specific storage resources without needing to know the underlying storage details
- Works with Both Static and Dynamic Provisioning: PVCs are flexible and can bind to pre-created PersistentVolumes (static) or trigger automatic provisioning (dynamic) via a StorageClass
# WHAT: Requests persistent storage from the cluster
# WHY: Allows an app to store logs that survive
#      restarts or rescheduling of Pods
# WHEN: Use when data (like logs) must be retained
# WHERE: Defined in the same namespace as the Pod
# HOW: Matches with a compatible StorageClass or PV
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: log-pvc
  # PVC name used by the Pod to reference this storage
spec:
  accessModes:
  - ReadWriteOnce
  # WHAT: Access policy for the volume
  # - ReadWriteOnce: only one node can write
  # - ReadOnlyMany: multiple nodes can read
  # - ReadWriteMany: multiple nodes can read/write
  resources:
    requests:
      storage: 1Gi
      # WHAT: Storage size requested (min guaranteed)
      # WHY: App needs this much space for logs
      # HOW: Cluster finds/provisions matching volume
  storageClassName: fast-ssd
  # WHAT: Tells Kubernetes which StorageClass to use
  # WHY: Enables dynamic provisioning of the volume
  # VARIATION:
  # - Omit this field to use the default StorageClass
  # - Use storageClassName: manual to bind to a
  #   pre-created PersistentVolume (static provisioning);
  #   the PV's storageClassName must also be manual
What is the role of a StorageClass in dynamic provisioning? #
- Dynamic Provisioning: Defines how storage is dynamically provisioned, abstracting storage backend details
- Reusability: Enables multiple PVCs to use the same StorageClass for consistent storage provisioning
# WHAT: Defines a class for dynamic volume creation
# WHY: PVCs use it to auto-provision storage
# WHERE: Cluster-wide resource (not namespaced)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
    # Makes this the default StorageClass
    # PVCs without storageClassName will use this one
    # Only one default StorageClass is allowed per cluster
provisioner: ebs.csi.aws.com
# ✅ CSI-based AWS EBS provisioner (recommended)
# For GCP: pd.csi.storage.gke.io
# For Azure: disk.csi.azure.com
parameters:
  type: gp3
  # AWS EBS volume type
  # GCP: pd-standard, pd-ssd
  # Azure: Standard_LRS, Premium_LRS
  fsType: ext4
  # File system to format the volume with
  # Options: ext4 (default), xfs, btrfs, ntfs (depends on OS image)
reclaimPolicy: Delete
# WHAT: Behavior after PVC is deleted
# WHY: To automatically delete the volume and free up resources
# OPTIONS:
# - Retain: keep volume and data after PVC deletion
# - Delete: delete volume (typical for cloud-managed disks)
# - Recycle: deprecated legacy mode (not supported in CSI)
volumeBindingMode: WaitForFirstConsumer
# WHAT: Controls when the volume is created and bound
# WHY: Ensures zone-aware volume placement (especially in multi-zone clusters)
# OPTIONS:
# - Immediate: create volume as soon as PVC is created
# - WaitForFirstConsumer: create volume only after Pod is scheduled (recommended)
PV, PVC and StorageClass - Troubleshooting #
Scenario: What happens if a PVC requests more storage than any available PV offers?
- PVC Stays Unbound: Kubernetes can't find a matching PV with sufficient capacity, so the claim remains in `Pending` state
- Pod Stuck in Pending: If a Pod depends on the PVC, it will also stay in `Pending` until the claim is resolved
- No Auto-Resize or Retry: Kubernetes doesn't auto-adjust existing PVs
- Fix by Provisioning a Larger PV: Admin must create a compatible PV or enable dynamic provisioning via a `StorageClass`
- Verify with `kubectl describe pvc`: Check for capacity mismatch and related events when troubleshooting
Scenario: You’ve defined a PVC and deployed a Pod using it, but the Pod is stuck in Pending. What could be the issue and how do you debug it?
- No Available PV: There may be no static PV with matching specs, or dynamic provisioning may not be enabled
- Missing StorageClass: If the PVC references a `StorageClass` that doesn't exist, provisioning fails silently
- Check Events and Describe: Use `kubectl describe pod <pod-name>` and `kubectl describe pvc <pvc-name>` to inspect the reasons (see the sketch below)
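A hedged sketch of that debugging flow; the resource names are placeholders:
kubectl get pvc my-pvc        # STATUS should be Bound, not Pending
kubectl describe pvc my-pvc   # Events show why binding/provisioning failed
kubectl get pv                # Is there a PV with matching size/mode/class?
kubectl get storageclass      # Does the referenced StorageClass exist?
kubectl describe pod my-pod   # Scheduling events point back at the PVC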
Why would you prefer to use StatefulSet over a Deployment for running a distributed message platform like Kafka? #
- Kafka is a Distributed Messaging Platform: Apache Kafka is used for building real-time, event-driven systems
- Each Broker Has a Persistent Identity: A Kafka broker is a server that stores data and serves client requests; each broker registers itself in the cluster with a unique ID and hostname and maintains local storage for topics and partitions
- Stable DNS Is Critical for Communication: Kafka uses broker hostnames (like `kafka-0.kafka-headless.default.svc.cluster.local`) for internal communication and client access
- Clients and Brokers Rely on Fixed Hostnames: Without consistent DNS names, clients can't reconnect and brokers can't find each other
- Deployment Fails to Guarantee DNS Stability: Pods created by a Deployment get random names (e.g., `kafka-xyz123`), breaking Kafka's expectations
- StatefulSet Solves These Problems: It gives each Pod a fixed hostname and its own persistent volume (e.g., `kafka-0` always gets the same PVC)
- Ensures Predictable, Reliable Cluster Behavior: Required for stable broker identity, safe storage, and smooth recovery in Kafka deployments
Can you explain a StatefulSet with an example? #
- Stable Identity: Assigns each Pod a unique, persistent identity that remains consistent across rescheduling
- Persistent Storage: Ensures each Pod has its own `PersistentVolume`, maintaining data across restarts
- Ordered Deployment: Deploys and scales Pods in a specific order (pod-0, pod-1, …), crucial for certain stateful applications
- Headless Service: Often used with headless Services to manage the network identities of Pods
# WHAT: Defines a StatefulSet for a Kafka-like app
# WHY: Each Pod needs a stable name and its own volume
# WHEN: Use when app relies on identity and persistent state
# WHERE: Pods get hostnames like kafka-0, kafka-1, etc.
# HOW: Each Pod gets a PVC and starts in order
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka-headless
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka
        image: bitnami/kafka:latest
        ports:
        - containerPort: 9092
        volumeMounts:
        - name: kafka-storage
          mountPath: /bitnami/kafka
  volumeClaimTemplates:
  - metadata:
      name: kafka-storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
      storageClassName: fast-ssd
---
# WHAT: Headless service for stable Pod DNS
# WHY: Allows kafka-0, kafka-1, etc. to resolve by name
# HOW: Used with StatefulSet for stable networking
apiVersion: v1
kind: Service
metadata:
  name: kafka-headless
spec:
  # Disables allocation of a cluster IP for the service.
  # Creates DNS records for individual Pods,
  # allowing direct pod-to-pod communication
  # based on their DNS names
  clusterIP: None
  selector:
    app: kafka
  ports:
  - port: 9092
    name: kafka
- Each Pod is Named Predictably: `kafka-0`, `kafka-1`, `kafka-2`
- Each Pod Gets Its Own Volume: PVCs named `kafka-storage-kafka-0`, etc.
- Stable DNS with Headless Service: Required for Kafka's broker-to-broker connections
- Startup and Shutdown Are Ordered: Helps Kafka avoid cluster confusion or split-brain (see the commands below)
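A quick way to verify those behaviors, assuming the manifests above were applied as-is:
kubectl get pods -l app=kafka   # kafka-0, kafka-1, kafka-2
kubectl get pvc                 # kafka-storage-kafka-0, kafka-storage-kafka-1, ...
# Each broker is reachable at a stable DNS name of the form:
#   kafka-0.kafka-headless.<namespace>.svc.cluster.local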
How is a Secret different from a ConfigMap, and how are they mounted in a pod? #
- Separate Sensitive from Non-Sensitive Data: Use Secrets for passwords, tokens, and keys; use ConfigMaps for configs like URLs and log levels
- Encoded vs Plain Storage: Secrets are base64-encoded (encoding, not encryption) and treated as sensitive; ConfigMaps store plain text
- Access-Controlled and Safer: Secrets are stored in etcd with stricter access rules; safer for confidential info
- Same Mounting Options: Both can be mounted as files or exposed as environment variables in Pods
- Example Use Case: Use a Secret for a database password and a ConfigMap for the database hostname in the same Pod
# WHAT: Stores sensitive data securely
# WHY: Used for passwords, tokens, keys
# HOW: Encoded in base64; access-controlled
# WHERE: Created in the Pod's namespace
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  db-password: cGFzc3dvcmQxMjM=
  # "password123" base64-encoded:
  # echo -n "password123" | base64
---
# WHAT: Stores non-sensitive app config
# WHY: Used for URLs, ports, feature flags
# HOW: Stored as plain text key-value pairs
# WHERE: Created in the Pod's namespace
apiVersion: v1
kind: ConfigMap
metadata:
  name: db-config
data:
  db-host: my-database.svc.cluster.local
  db-port: "5432"
  db-name: myappdb
  db-user: appuser
---
# WHAT: Pod that uses Secret + ConfigMap
# WHY: Mounts password + host into container
# WHEN: App reads them from /app/config/
# WHERE: Values mounted as individual files
apiVersion: v1
kind: Pod
metadata:
  name: db-client
spec:
  containers:
  - name: app
    image: busybox
    command:
    - /bin/sh
    - -c
    - >
      echo DB_HOST=$(cat /app/config/db-host);
      echo DB_PASSWORD=$(cat /app/config/db-password);
      sleep 3600;
    volumeMounts:
    - name: db-host-volume
      mountPath: /app/config/db-host
      subPath: db-host
    - name: db-pass-volume
      mountPath: /app/config/db-password
      subPath: db-password
  volumes:
  - name: db-host-volume
    configMap:
      name: db-config
  - name: db-pass-volume
    secret:
      secretName: db-secret
# ------------------------------------------
# VARIATION: Use env instead of file mounts
# (goes under the container spec)
# env:
# - name: DB_HOST
#   valueFrom:
#     configMapKeyRef:
#       name: db-config
#       key: db-host
# - name: DB_PASSWORD
#   valueFrom:
#     secretKeyRef:
#       name: db-secret
#       key: db-password
# ------------------------------------------
Why are changes to ConfigMaps and Secrets not visible immediately? #
- Not Live Reloaded by Default: Updates to ConfigMaps and Secrets are not immediately reflected in running Pods
- No Built-in Watch Mechanism: Kubernetes does not trigger Pod updates automatically on changes to ConfigMaps or Secrets
- Restart Needed for Environment Variables: If used as environment variables, changes only apply after Pod restart
- Volume Mounts Sync with Delay: When mounted as files, updates are polled by kubelet (typically every 1–2 minutes) and not delivered instantly
- Use External Tools for Auto-Restart: Tools like Reloader, or checksum/hash annotations used with Helm and Kustomize, can automate Pod restarts on changes (see the sketch below)
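Two common workarounds, as a hedged sketch: a manual rollout restart (built into kubectl), and the Stakater Reloader annotation, which follows that tool's documented convention; the Deployment name is a placeholder.
# Option 1: manually restart Pods so they pick up new values
kubectl rollout restart deployment my-app

# Option 2: annotate the workload so Reloader restarts it automatically
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    reloader.stakater.com/auto: "true"  # Reloader triggers a rolling restart
                                        # when referenced ConfigMaps/Secrets change
# ...rest of the Deployment spec unchanged...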