What is CRI? #


  • Understanding the Need for CRI (Container Runtime Interface): Kubernetes needs to run containers, but there are many container runtimes (Docker, containerd, CRI-O). Kubernetes needed a consistent way to talk to any container runtime.
  • What is CRI: CRI is a standard interface that allows Kubernetes to communicate with different container runtimes in a pluggable way, without being tightly coupled to one specific runtime.
  • (Use Case) Enable Runtime Flexibility: Lets Kubernetes switch between container runtimes (like Docker to containerd) without changing Kubernetes components
  • (Advantage) Encourages Ecosystem Innovation: CRI allows runtime developers to innovate independently while still being Kubernetes-compatible
  • Most Popular CRI Implementations:
    • containerd: A lightweight, industry-standard container runtime created by Docker and donated to CNCF; it is now the default runtime in many Kubernetes distributions like GKE and EKS
    • CRI-O: A Kubernetes-native runtime developed by the CNCF community to support Open Container Initiative (OCI) standards; used in OpenShift and other enterprise Kubernetes platforms
    • Docker (via dockershim): Previously used in Kubernetes but deprecated in version 1.20 and removed in version 1.24; Docker Engine required a compatibility shim (dockershim) because it did not implement CRI natively (a shim is a small software layer that connects Kubernetes to a container runtime that does not natively support the Container Runtime Interface)
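To make the "pluggable" part concrete, here is a minimal sketch of how tooling is pointed at a CRI runtime over its gRPC socket; the socket paths below are the common defaults for containerd and CRI-O and may differ on your installation:

# FILE: /etc/crictl.yaml (illustrative)
# WHAT: Points crictl (the CRI debugging CLI) at the runtime's CRI socket
# WHY:  The kubelet talks to the container runtime over this same endpoint
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false

# NOTE:
# For CRI-O the socket is typically unix:///var/run/crio/crio.sock
# The kubelet is pointed at a runtime with a flag such as:
# --container-runtime-endpoint=unix:///run/containerd/containerd.sock
# Switching runtimes is largely a matter of changing this endpoint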

What is CNI? #


  • Networking: Network plugins provide essential features for Kubernetes clusters:
    • Assign IP to Each Pod: Every Pod gets a unique IP so it can communicate like a real machine
    • Enable Pod-to-Pod Communication: Supports communication between Pods, even if they're on different Nodes
    • Support Service Discovery: Works with CoreDNS and kube-proxy to route traffic to the right Pod using service names
    • Allow External Access: Enables outbound internet access and supports exposing Pods using LoadBalancers, NodePorts, or Ingress
    • Control Network Policies: Implements rules that define which Pods can talk to which other Pods (see the example after this list)
  • CNI: Container Network Interface — a standard interface between container runtimes and network plugins
    • Implementations: Flannel, Calico, Cilium, Weave, etc
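As an illustration of the network-policy bullet above, here is a minimal NetworkPolicy; the app: frontend and app: backend labels are assumptions, and the CNI plugin in use (for example Calico or Cilium) is what actually enforces the rule:

# FILE: allow-frontend.yaml (illustrative names)
# WHAT: Allows only Pods labelled app: frontend to reach Pods labelled app: backend on port 8080
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  podSelector:
    matchLabels:
      app: backend          # the Pods this policy protects
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend     # only frontend Pods may connect
    ports:
    - protocol: TCP
      port: 8080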

What is CSI? #


  • Understanding Need For Persistent Storage: Data inside a Pod is lost when the Pod is restarted, deleted, or moved to another node
  • What is Persistent Storage: Storage that outlives an individual Pod, so databases, logs, and user files survive restarts, rescheduling, and node failures
  • Needed for Stateful Apps: Required for apps like MySQL, PostgreSQL, and Kafka
  • What is CSI (Container Storage Interface): A standard plugin interface that enables Kubernetes to work with different storage systems (cloud or on-prem) in a consistent and pluggable way
  • Decouples Storage from Kubernetes Core: Keeps Kubernetes lean and modular by offloading storage integration to CSI-compliant drivers
  • Integrate External Storage Systems Easily: Works with AWS EBS, GCP PD, Azure Disk, NFS, etc. — no need to modify Kubernetes core
    • AWS EBS CSI Driver: Manages Amazon EBS volumes for Kubernetes
    • AWS EFS CSI Driver: Manages AWS Elastic File System for Kubernetes
    • Azure Disk CSI Driver: Manages Azure Disk volumes for Kubernetes
    • GCP Persistent Disk CSI Driver: Manages Google Compute Engine Persistent Disk volumes
    • NFS CSI Driver: Provides NFS storage to Kubernetes
  • Enables Backups and Restore: Supports snapshots, cloning, and volume resizing
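A small sketch of how a CSI driver is consumed, assuming the AWS EBS CSI driver is already installed in the cluster; the class and claim names are illustrative:

# FILE: storage.yaml (sketch)
# WHAT: A StorageClass backed by the EBS CSI driver plus a claim that uses it
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3
provisioner: ebs.csi.aws.com        # the CSI driver acts as the provisioner
parameters:
  type: gp3                         # EBS volume type
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true          # enables the volume resizing mentioned above
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ebs-gp3
  resources:
    requests:
      storage: 10Gi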

How do you set up a highly available Kubernetes cluster? #


  • Prevent Single Point of Failure: A highly available (HA) Kubernetes setup ensures the cluster stays functional even if some nodes or components fail
  • Use Multiple Control Plane Nodes: Deploy 3 or more master nodes (API Server, Controller Manager, Scheduler) across availability zones
  • Configure an External Load Balancer: Direct traffic to healthy API servers across master nodes using an external load balancer
  • Deploy ETCD in HA Mode: Use an odd number (usually 3 or 5) of etcd members
  • Distribute Worker Nodes Across Zones: Spread worker nodes across failure domains to maintain app availability during node failures
  • Enable Persistent Volumes in Multiple Zones: Ensure storage classes use multi-zone or regional disks for resilient storage
  • Secure Communication Between Components: Use TLS certificates, RBAC, and encryption at rest for reliable and secure communication
  • Monitor and Backup Regularly: Use observability tools and take regular etcd backups to recover quickly from outages or data loss
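One common way to wire this up is kubeadm with a load balancer in front of the API servers; the sketch below assumes a hypothetical load balancer DNS name k8s-api.example.com and an illustrative Kubernetes version:

# FILE: kubeadm-config.yaml (minimal sketch)
# WHAT: Points every control plane node and every client at the load balancer
#       instead of a single API server
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.29.0                          # example version
controlPlaneEndpoint: "k8s-api.example.com:6443"    # external load balancer in front of the API servers

# NOTE:
# First control plane node:   kubeadm init --config kubeadm-config.yaml --upload-certs
# Additional control planes:  kubeadm join k8s-api.example.com:6443 --control-plane --certificate-key <key> ...
# With stacked etcd, each control plane node also runs an etcd member (3 nodes gives a 3-member quorum)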

List Some Important Controllers in Kubernetes #


  • Node Controller: Detects and responds when nodes go down or become unreachable
  • ReplicaSet: Maintains a stable set of replica Pods running at all times
  • Deployment Controller: Manages rolling updates and rollbacks
  • StatefulSet Controller: Manages stateful applications with stable network identities and persistent storage
  • DaemonSet Controller: Ensures a copy of a Pod runs on all (or selected) nodes
  • Service Controller: Manages service objects and configures load balancers
  • Horizontal Pod Autoscaler Controller: Automatically adjusts the number of Pods based on CPU/memory usage or custom metrics (see the manifest after this list)
  • Namespace Controller: Manages cleanup and lifecycle of Kubernetes namespaces
  • Ingress Controller: Manages external HTTP(S) traffic and routes it to services based on rules (though not part of core K8s, it's a commonly used controller)
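For the Horizontal Pod Autoscaler entry above, a minimal manifest looks like this; it assumes a Deployment named my-app exists and that metrics-server is installed in the cluster:

# FILE: hpa.yaml (illustrative)
# WHAT: The HPA controller scales the Deployment between 2 and 10 replicas
#       to keep average CPU utilization around 70%
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70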

What is an Operator? #


  • Understanding Need for Application Lifecycle Automation: Kubernetes manages deployments and scaling, but struggles with day-2 operations like backups, version upgrades, and recovery for complex applications such as databases (MongoDB, for example) or messaging systems (Kafka, for example)
  • What are Kubernetes Operators: Kubernetes Operators extend the platform by automating the full lifecycle of an application — install, upgrade, failure recovery, and more
  • Bring Human Operational Knowledge into Kubernetes: Just like a human operator knows how to upgrade or recover a database, a Kubernetes Operator does the same using code
  • Manage Stateful Applications: Operators are perfect for managing apps that need persistent storage, or other special features (e.g., Kafka, MongoDB)
  • Automate Upgrades and Patching: Helps apply rolling updates or version upgrades without downtime
  • Simplify Backups and Disaster Recovery: Automatically schedule backups and restore from backups when failures happen
  • Popular Real-World Operators:
    • Prometheus Operator – for monitoring stack
    • MongoDB Operator – for automated DB provisioning
    • Kafka Operator – for Kafka clusters in Kubernetes
  • Example: Setting up MongoDB using MongoDB Operator in Kubernetes
    • Step 1: Install the MongoDB Kubernetes Operator (kubectl apply -k github.com/mongodb/mongodb-kubernetes-operator/config/default)
    • Step 2: Create a namespace (kubectl create namespace mongodb)
    • Step 3: Deploy a MongoDB Custom Resource using Custom Resource Definition (CRD) (kubectl apply -f mongodb.yaml -n mongodb)
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: my-mongodb
  namespace: mongodb
spec:
  members: 3
  type: ReplicaSet
  version: "6.0.6"

# AND A LOT OF OTHER DETAILS FOLLOW!

Controllers vs Operators #


| Aspect | Controllers | Operators |
|---|---|---|
| Purpose | Controllers in Kubernetes continuously monitor and adjust built-in resources like Pods and Deployments to match the desired state | Operators extend Kubernetes by adding intelligence to manage complex apps like databases, messaging systems, and monitoring tools |
| What They Manage | Native Kubernetes resources like Pods and Deployments | Complex, stateful apps like MongoDB, Kafka, or Prometheus |
| How They Are Built | Built into the Kubernetes platform | Custom-built using Kubernetes APIs |
| Level of Intelligence | Simple reconciliation: create/update/delete Pods | Full lifecycle: install, configure, upgrade, backup |
| Use of Custom Resources | Do not require Custom Resource Definitions | Depend on CRDs to define app-specific objects |
| Examples | ReplicaSet and Deployment Controllers | Prometheus Operator, MongoDB Operator |

What is the need for Helm? #


  • Understanding Need For Kubernetes Package Management: Managing complex applications with many YAML files (Deployments, Services, ConfigMaps, Secrets) is error-prone
  • What is Helm: A package manager for Kubernetes that bundles multiple related YAML files into a single reusable unit called a Helm chart
  • Simplifies Installation and Upgrades: Lets you install or upgrade an entire app (like MySQL or Prometheus) with a single command
  • Promotes Reusability and Standardization: Helm charts can be reused across dev, QA, and production environments — just change a few values
  • Reduces Duplication with Templates: Helm uses templates and variable substitution to avoid copying the same YAML again and again
  • Widely Adopted for Ecosystem Apps: Most popular tools in the cloud-native world (like ArgoCD, Grafana, Redis) offer official Helm charts for quick and consistent installs
# WHAT: Example Helm Template for a Kubernetes Deployment
# HOW:  Variables defined in values.yaml are injected into templates

# FILE: templates/deployment.yaml
# This file defines the Kubernetes Deployment
# using Helm template expressions like {{ .Values.* }}

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.app.name }}
  labels:
    app: {{ .Values.app.name }}
spec:
  replicas: {{ .Values.app.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Values.app.name }}
  template:
    metadata:
      labels:
        app: {{ .Values.app.name }}
    spec:
      containers:
      - name: {{ .Values.app.name }}
        image: "{{ .Values.app.image }}:{{ .Values.app.tag }}"
        ports:
        - containerPort: {{ .Values.app.port }}
        resources:
          limits:
            memory: {{ .Values.app.resources.memory }}
            cpu: {{ .Values.app.resources.cpu }}

---

# FILE: values.yaml
# This file provides values injected into the template
# Default values can be overridden at install time
app:
  name: myapp
  image: nginx
  tag: latest
  port: 80
  replicaCount: 2
  resources:
    memory: "128Mi"
    cpu: "500m"

---

# FILE: Chart.yaml
# Helm metadata file describing this chart
apiVersion: v2
name: myapp
description: A simple Helm chart for Kubernetes
type: application
version: 0.1.0
appVersion: "1.0"

# NOTE:
# You install this chart using:
# helm install myapp ./mychart
#
# You can override values with:
# helm install myapp ./mychart -f custom-values.yaml
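Since upgrades are called out as a key benefit, here are a few commonly used release lifecycle commands; the release name myapp from the note above is assumed:

# Upgrade the release after changing values or templates:
# helm upgrade myapp ./mychart
#
# Roll back to a previous revision if the upgrade misbehaves:
# helm rollback myapp 1
#
# Inspect releases and their revision history:
# helm list
# helm history myapp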

What is Kustomize? #


# my-kustomize-app/
# ├── base/
# │   ├── deployment.yaml
# │   ├── service.yaml
# │   └── kustomization.yaml
# └── overlays/
#     └── dev/
#         ├── kustomization.yaml
#         ├── replica-patch.yaml
#         └── secret.env
  • Helm can be complex: Creating charts with templating can be overkill for simple applications
  • But Customizing Kubernetes YAMLs is Still Needed: Repeating YAML files across environments (dev, QA, staging, prod) leads to duplication and errors
  • What is Kustomize: Kustomize is a built-in Kubernetes tool that lets you customize YAMLs without modifying the original files — using overlays, patches, and configuration layers
  • Simplifies Environment-Specific Deployments: Lets you reuse a base configuration and apply changes for different environments (e.g., different replica counts, image tags, config maps)
  • Avoids Duplication with Overlays: Uses a layering system — base config stays the same, and only differences are applied as overlays (defined in patch files)
  • No Templating Language Required: Works directly on YAML files — no new logic to learn
  • Built Into kubectl: Comes bundled with kubectl — no extra installation required
# ------------------------
# FILE: base/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-container
        image: nginx:1.21
        ports:
        - containerPort: 80
        envFrom:
        - secretRef:
            name: app-secret
            # WHAT: Mounts the generated secret
            # WHY: Passes environment variables securely

# ------------------------
# FILE: base/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 80

# ------------------------
# FILE: base/kustomization.yaml
resources:
  - deployment.yaml
  - service.yaml

# ------------------------
# FILE: overlays/dev/replica-patch.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  # WHY: Dev needs fewer replicas than prod

# ------------------------
# FILE: overlays/dev/secret.env
# WHAT: Defines secret key-value pairs
# WHY: Used by secretGenerator to create a Secret
# WHERE: This file must be in the same folder as kustomization.yaml
USERNAME=devuser
PASSWORD=devpass123

# ------------------------
# FILE: overlays/dev/kustomization.yaml
resources:
  - ../../base

patchesStrategicMerge:
  - replica-patch.yaml

secretGenerator:
- name: app-secret
  envs:
    - secret.env
  type: Opaque
  # WHAT: Generates a Secret named `app-secret`
  # WHY: Holds credentials securely in Kubernetes
  # WHEN: Applied automatically on each `kubectl apply -k`
  # TYPE:
  # - Opaque (default): Generic key-value secret
  # - kubernetes.io/dockerconfigjson: for Docker auth
  # - kubernetes.io/tls: for TLS certs

# VARIATION:
# You can inline literals instead of a file:
# secretGenerator:
# - name: app-secret
#   literals:
#     - USERNAME=devuser
#     - PASSWORD=devpass123
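To complete the example, this is how the dev overlay is rendered and applied using kubectl's built-in Kustomize support:

# NOTE:
# Preview the rendered manifests without applying them:
# kubectl kustomize overlays/dev
#
# Apply the dev overlay (base resources + patches + generated Secret):
# kubectl apply -k overlays/dev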

Compare Kustomize vs Helm #


| Aspect | Kustomize | Helm |
|---|---|---|
| What It Solves | Customize Kubernetes YAMLs for different environments without duplication | Package, install, and manage complex Kubernetes apps |
| How It Works | Overlays and patches on base YAML files | Templates and reusable charts with values substitution |
| Learning Curve | Easier: works directly with YAML | Moderate: uses Go templating, values, and charts |
| How to Run? | Native to kubectl, no extra tool needed | Needs the helm CLI |
| Templating Features | No templating; strict YAML-only patches | Rich templating using conditionals, loops, variables |
| When to Use | When you want simple customization across environments | When you want to deploy and manage complex apps |

How can you implement GitOps with Argo CD in Kubernetes? #


  • What is Argo CD: A declarative GitOps continuous delivery tool for Kubernetes that automatically syncs cluster state with the desired state defined in Git repositories
  • Step 1 – Developer Commits to Git: A developer makes changes to Kubernetes manifests (or Helm/Kustomize configs) and commits them to a Git repository
  • Step 2 – Git Repository is Watched by Argo CD: Argo CD continuously watches the configured Git repository and the specified path for changes
  • Step 3 – Argo CD Detects Git Changes: When changes are detected (e.g., new deployment version), Argo CD compares the desired state from Git with the live state in the cluster
  • Step 4 – Argo CD Shows Out-of-Sync Status: If differences are found, Argo CD marks the application as “OutOfSync” and displays the diff via its CLI or Web UI
  • Step 5 – Sync Triggered Automatically or Manually: Based on the synchronization policy defined:
    • If set to auto-sync, Argo CD applies the new manifests immediately
    • If set to manual sync, a user must click "Sync" or use CLI to apply the change
  • Step 6 – Kubernetes Resources are Applied: Argo CD uses kubectl-like behavior to apply the updated manifests to the target cluster
  • Step 7 – Reconciliation and Health Checks: Argo CD continuously monitors the live state and performs health checks on resources (e.g., waiting for deployments to become available)
  • Step 8 – Application Marked as Healthy and Synced: Once all changes are successfully applied and resources are healthy, Argo CD marks the app as “Synced” and “Healthy”
  • Step 9 – Audit Trail is Maintained in Git: Since all changes go through Git, the full history of deployments, rollbacks, and configuration changes is stored in the Git log
  • Step 10 – Optional Rollback if Needed: If issues occur, the user can rollback to a previous Git commit and let Argo CD sync the cluster back to that known good state
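The flow above is usually captured declaratively in an Argo CD Application resource; the following is a minimal sketch in which the repository URL, path, and target namespace are assumptions:

# FILE: my-app-application.yaml (sketch)
# WHAT: Declares an Argo CD Application that keeps the cluster in sync with a Git path
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd                          # namespace where Argo CD is installed
spec:
  project: default
  source:
    repoURL: https://github.com/example/k8s-manifests.git
    targetRevision: main
    path: apps/my-app                        # folder with manifests (or a Helm chart / Kustomize overlay)
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD itself runs in
    namespace: my-app
  syncPolicy:
    automated:                               # corresponds to the auto-sync option in Step 5
      prune: true                            # delete resources that were removed from Git
      selfHeal: true                         # revert manual drift in the cluster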