What is CRI? #
- Understanding the Need for CRI (Container Runtime Interface): Kubernetes needs to run containers, but there are many container runtimes (Docker, containerd, CRI-O), so Kubernetes needs a consistent way to talk to any of them.
- What is CRI: CRI is a standard interface that allows Kubernetes to communicate with different container runtimes in a pluggable way, without being tightly coupled to one specific runtime.
- (Use Case) Enable Runtime Flexibility: Lets Kubernetes switch between container runtimes (like Docker to containerd) without changing Kubernetes components
- (Advantage) Encourages Ecosystem Innovation: CRI allows runtime developers to innovate independently while still being Kubernetes-compatible
- Most Popular CRI Implementations:
- containerd: A lightweight, industry-standard container runtime created by Docker and donated to the CNCF; it is now the default runtime in many Kubernetes distributions and managed services such as GKE and EKS
- CRI-O: A Kubernetes-native runtime developed by the CNCF community to support Open Container Initiative (OCI) standards; used in OpenShift and other enterprise Kubernetes platforms
- Docker (via dockershim): Previously used in Kubernetes but deprecated in version 1.20 and removed in 1.24; Docker required a compatibility shim (dockershim) because it did not implement CRI natively (a shim is a small software layer that connects Kubernetes to a container runtime that does not natively support the Container Runtime Interface)
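As a quick check, you can confirm which CRI runtime each node is using straight from the Kubernetes API (a minimal sketch; <node-name> is a placeholder):

```
# The CONTAINER-RUNTIME column shows the runtime and its version (e.g. containerd://1.7.x)
kubectl get nodes -o wide

# Or read the runtime reported by a single node
kubectl get node <node-name> -o jsonpath='{.status.nodeInfo.containerRuntimeVersion}'
```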
What is CNI? #
- Networking: Network plugins provide essential features for Kubernetes clusters:
- Assign IP to Each Pod: Every Pod gets a unique IP so it can communicate like a real machine
- Enable Pod-to-Pod Communication: Supports communication between Pods, even if they're on different Nodes
- Support Service Discovery: Works with CoreDNS and kube-proxy to route traffic to the right Pod using service names
- Allow External Access: Enables outbound internet access and supports exposing Pods using LoadBalancers, NodePorts, or Ingress
- Control Network Policies: Implements rules that define which Pods can talk to which other Pods
- CNI: Container Network Interface — a standard interface between container runtimes and network plugins
- Implementations: Flannel, Calico, Cilium, Weave, etc.
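The network policies mentioned above are plain Kubernetes objects that the CNI plugin enforces. A minimal sketch, assuming illustrative app=frontend and app=backend labels and a CNI plugin that supports NetworkPolicy (e.g., Calico or Cilium):

```yaml
# Allow only Pods labeled app=frontend to reach Pods labeled app=backend on TCP 8080
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```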
What is CSI? #
- Understanding Need For Persistent Storage: Data inside a Pod is lost when the Pod is restarted, deleted, or moved to another node
- What is Persistent Storage: Storage that outlives the Pod, so databases, logs, and user files are retained and data loss is avoided
- Needed for Stateful Apps: Required for apps like MySQL, PostgreSQL, and Kafka
- What is CSI (Container Storage Interface): A standard plugin interface that enables Kubernetes to work with different storage systems (cloud or on-prem) in a consistent and pluggable way
- Decouples Storage from Kubernetes Core: Keeps Kubernetes lean and modular by offloading storage integration to CSI-compliant drivers
- Integrate External Storage Systems Easily: Works with AWS EBS, GCP PD, Azure Disk, NFS, etc. — no need to modify Kubernetes core
- AWS EBS CSI Driver: Manages Amazon EBS volumes for Kubernetes
- AWS EFS CSI Driver: Manages AWS Elastic File System for Kubernetes
- Azure Disk CSI Driver: Manages Azure Disk volumes for Kubernetes
- GCP Persistent Disk CSI Driver: Manages Google Compute Engine Persistent Disk volumes
- NFS CSI Driver: Provides NFS storage to Kubernetes
- Enables Backups and Restore: Supports snapshots, cloning, and volume resizing
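A minimal sketch of how a CSI driver is consumed, assuming the AWS EBS CSI driver is installed (the class name, volume type, and size below are illustrative):

```yaml
# StorageClass backed by the AWS EBS CSI driver
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
---
# PersistentVolumeClaim that a stateful app (e.g., MySQL) can mount
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-gp3
  resources:
    requests:
      storage: 20Gi
```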
How do you set up a highly available Kubernetes cluster? #
- Prevent Single Point of Failure: A highly available (HA) Kubernetes setup ensures the cluster stays functional even if some nodes or components fail
- Use Multiple Control Plane Nodes: Deploy 3 or more master nodes (API Server, Controller Manager, Scheduler) across availability zones
- Configure an External Load Balancer: Direct traffic to healthy API servers across master nodes using an external load balancer
- Deploy etcd in HA Mode: Use an odd number (usually 3 or 5) of etcd members so the cluster can keep quorum when a member fails
- Distribute Worker Nodes Across Zones: Spread worker nodes across failure domains to maintain app availability during node failures
- Enable Persistent Volumes in Multiple Zones: Ensure storage classes use multi-zone or regional disks for resilient storage
- Secure Communication Between Components: Use TLS certificates, RBAC, and encryption at rest for reliable and secure communication
- Monitor and Backup Regularly: Use observability tools and take regular etcd backups to recover quickly from outages or data loss
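With kubeadm, the key HA step is pointing every node at the external load balancer via --control-plane-endpoint. A minimal sketch (the DNS name, token, hash, and key are placeholders):

```
# First control-plane node: use the load balancer address and upload the shared certificates
kubeadm init \
  --control-plane-endpoint "k8s-api.example.com:6443" \
  --upload-certs

# Remaining control-plane nodes: join with the --control-plane flag printed by the init step
kubeadm join k8s-api.example.com:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane --certificate-key <key>
```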
List Some Important Controllers in Kubernetes #
- Node Controller: Detects and responds when nodes go down or become unreachable
- ReplicaSet Controller: Keeps a stable set of replica Pods running at all times
- Deployment Controller: Manages rolling updates and rollbacks
- StatefulSet Controller: Manages stateful applications with stable network identities and persistent storage
- DaemonSet Controller: Ensures a copy of a Pod runs on all (or selected) nodes
- Service Controller: Manages service objects and configures load balancers
- Horizontal Pod AutoScaler Controller: Automatically adjusts the number of Pods based on CPU/memory usage or custom metrics
- Namespace Controller: Manages cleanup and lifecycle of Kubernetes namespaces
- Ingress Controller: Manages external HTTP(S) traffic and routes it to services based on rules (though not part of core K8s, it's a commonly used controller)
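As an example of one of these controllers acting on a resource you define, a HorizontalPodAutoscaler like the sketch below (names and thresholds are illustrative) tells the HPA controller to scale a Deployment between 2 and 10 replicas based on average CPU:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```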
What is an Operator? #
- Understanding Need for Application Lifecycle Automation: Kubernetes manages deployments and scaling, but struggles with day-2 operations like backups, version upgrades, and recovery for complex applications such as databases (e.g., MongoDB) or messaging systems (e.g., Kafka)
- What are Kubernetes Operators: Kubernetes Operators extend the platform by automating the full lifecycle of an application — install, upgrade, failure recovery, and more
- Bring Human Operational Knowledge into Kubernetes: Just like a human operator knows how to upgrade or recover a database, a Kubernetes Operator does the same using code
- Manage Stateful Applications: Operators are perfect for managing apps that need persistent storage, or other special features (e.g., Kafka, MongoDB)
- Automate Upgrades and Patching: Helps apply rolling updates or version upgrades without downtime
- Simplify Backups and Disaster Recovery: Automatically schedule backups and restore from backups when failures happen
- Popular Real-World Operators:
- Prometheus Operator – for monitoring stack
- MongoDB Operator – for automated DB provisioning
- Kafka Operator – for Kafka clusters in Kubernetes
- Example: Setting up MongoDB using MongoDB Operator in Kubernetes
- Step 1: Install the MongoDB Kubernetes Operator: `kubectl apply -k github.com/mongodb/mongodb-kubernetes-operator/config/default`
- Step 2: Create a namespace: `kubectl create namespace mongodb`
- Step 3: Deploy a MongoDB Custom Resource using the Custom Resource Definition (CRD): `kubectl apply -f mongodb.yaml -n mongodb`
```yaml
# mongodb.yaml
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: my-mongodb
  namespace: mongodb
spec:
  members: 3
  type: ReplicaSet
  version: "6.0.6"
```
Controllers vs Operators #
| Aspect | Controllers | Operators |
| --- | --- | --- |
| Purpose | Controllers in Kubernetes continuously monitor and adjust built-in resources like Pods and Deployments to match the desired state | Operators extend Kubernetes by adding intelligence to manage complex apps like databases, messaging systems, and monitoring tools |
| What They Manage | Native Kubernetes resources like Pods and Deployments | Complex, stateful apps like MongoDB, Kafka, or Prometheus |
| How They Are Built | Built into the Kubernetes platform | Custom-built using Kubernetes APIs |
| Level of Intelligence | Simple reconciliation: create/update/delete Pods | Full lifecycle: install, configure, upgrade, backup |
| Use of Custom Resources | Do not require Custom Resource Definitions | Depend on CRDs to define app-specific objects |
| Examples | ReplicaSet and Deployment Controllers | Prometheus Operator, MongoDB Operator |
What is the need for Helm? #
- Understanding Need For Kubernetes Package Management: Managing complex applications with many YAML files (Deployments, Services, ConfigMaps, Secrets) is error-prone
- What is Helm: A package manager for Kubernetes that bundles multiple related YAML files into a single reusable unit called a Helm chart
- Simplifies Installation and Upgrades: Lets you install or upgrade an entire app (like MySQL or Prometheus) with a single command
- Promotes Reusability and Standardization: Helm charts can be reused across dev, QA, and production environments — just change a few values
- Reduces Duplication with Templates: Helm uses templates and variable substitution to avoid copying the same YAML again and again
- Widely Adopted for Ecosystem Apps: Most popular tools in the cloud-native world (like ArgoCD, Grafana, Redis) offer official Helm charts for quick and consistent installs
```yaml
# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.app.name }}
  labels:
    app: {{ .Values.app.name }}
spec:
  replicas: {{ .Values.app.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Values.app.name }}
  template:
    metadata:
      labels:
        app: {{ .Values.app.name }}
    spec:
      containers:
        - name: {{ .Values.app.name }}
          image: "{{ .Values.app.image }}:{{ .Values.app.tag }}"
          ports:
            - containerPort: {{ .Values.app.port }}
          resources:
            limits:
              memory: {{ .Values.app.resources.memory }}
              cpu: {{ .Values.app.resources.cpu }}
```

```yaml
# values.yaml
app:
  name: myapp
  image: nginx
  tag: latest
  port: 80
  replicaCount: 2
  resources:
    memory: "128Mi"
    cpu: "500m"
```

```yaml
# Chart.yaml
apiVersion: v2
name: myapp
description: A simple Helm chart for Kubernetes
type: application
version: 0.1.0
appVersion: "1.0"
```
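With the chart in place, a few standard Helm commands cover the whole workflow (the chart directory name ./myapp is assumed from the files above):

```
# Render the templates locally to inspect the generated manifests
helm template myapp ./myapp

# Install the chart as a release, then upgrade it with an overridden value
helm install myapp ./myapp
helm upgrade myapp ./myapp --set app.replicaCount=3
```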
What is Kustomize? #
- Helm Can Be Complex: Creating charts with templating can be overkill for simple applications
- BUT Customizing Kubernetes YAMLs is Needed: Repeating YAML files across environments (dev, qa, staging, prod) leads to duplication and errors
- What is Kustomize: Kustomize is a built-in Kubernetes tool that lets you customize YAMLs without modifying the original files — using overlays, patches, and configuration layers
- Simplifies Environment-Specific Deployments: Lets you reuse a base configuration and apply changes for different environments (e.g., different replica counts, image tags, config maps)
- Avoids Duplication with Overlays: Uses a layering system — base config stays the same, and only differences are applied as overlays (defined in patch files)
- No Templating Language Required: Works directly on YAML files — no new logic to learn
- Built Into kubectl: Comes bundled with `kubectl` — no extra installation required
A typical layout is a shared base plus per-environment overlays (the base/ and overlays/dev/ directory names below are illustrative):

```yaml
# base/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-container
          image: nginx:1.21
          ports:
            - containerPort: 80
          envFrom:
            - secretRef:
                name: app-secret
```

```yaml
# base/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 80
```

```yaml
# base/kustomization.yaml
resources:
  - deployment.yaml
  - service.yaml
```

```yaml
# overlays/dev/replica-patch.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
```

```
# overlays/dev/secret.env
USERNAME=devuser
PASSWORD=devpass123
```

```yaml
# overlays/dev/kustomization.yaml
resources:
  - ../../base
patchesStrategicMerge:
  - replica-patch.yaml
secretGenerator:
  - name: app-secret
    envs:
      - secret.env
    type: Opaque
```
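You can then preview or apply the overlay with kubectl's built-in Kustomize support (directory name as assumed above):

```
# Preview the merged manifests for the dev overlay
kubectl kustomize overlays/dev

# Apply the overlay to the cluster
kubectl apply -k overlays/dev
```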
Compare Kustomize vs Helm #
| Aspect | Kustomize | Helm |
| --- | --- | --- |
| What It Solves | Customize Kubernetes YAMLs for different environments without duplication | Package, install, and manage complex Kubernetes apps |
| How It Works | Overlays and patches on base YAML files | Templates and reusable charts with values substitution |
| Learning Curve | Easier — works directly with YAML | Moderate — uses Go templating, values, and charts |
| How to Run | Native to kubectl — no extra tool needed | Needs the helm CLI |
| Templating Features | No templating — strict YAML-only patches | Rich templating using conditionals, loops, and variables |
| When to Use | When you want simple customization across environments | When you want to deploy complex apps |
How can you implement GitOps with Argo CD in Kubernetes? #
- What is Argo CD: A declarative GitOps continuous delivery tool for Kubernetes that automatically syncs cluster state with the desired state defined in Git repositories
- Step 1 – Developer Commits to Git: A developer makes changes to Kubernetes manifests (or Helm/Kustomize configs) and commits them to a Git repository
- Step 2 – Git Repository is Watched by Argo CD: Argo CD continuously watches the configured Git repository and the specified path for changes
- Step 3 – Argo CD Detects Git Changes: When changes are detected (e.g., new deployment version), Argo CD compares the desired state from Git with the live state in the cluster
- Step 4 – Argo CD Shows Out-of-Sync Status: If differences are found, Argo CD marks the application as “OutOfSync” and displays the diff via its CLI or Web UI
- Step 5 – Sync Triggered Automatically or Manually: Based on the synchronization policy defined:
- If set to auto-sync, Argo CD applies the new manifests immediately
- If set to manual sync, a user must click "Sync" or use CLI to apply the change
- Step 6 – Kubernetes Resources are Applied: Argo CD uses `kubectl`-like behavior to apply the updated manifests to the target cluster
- Step 7 – Reconciliation and Health Checks: Argo CD continuously monitors the live state and performs health checks on resources (e.g., waiting for deployments to become available)
- Step 8 – Application Marked as Healthy and Synced: Once all changes are successfully applied and resources are healthy, Argo CD marks the app as “Synced” and “Healthy”
- Step 9 – Audit Trail is Maintained in Git: Since all changes go through Git, the full history of deployments, rollbacks, and configuration changes is stored in the Git log
- Step 10 – Optional Rollback if Needed: If issues occur, the user can rollback to a previous Git commit and let Argo CD sync the cluster back to that known good state
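The flow above is driven by an Argo CD Application resource. A minimal sketch, where the repository URL, path, and namespaces are placeholders for your own setup:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/k8s-manifests.git
    targetRevision: main
    path: apps/my-app
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual drift in the cluster
```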