DevOps Fundamentals

What is DevOps?

Scenario: Developers build features. Operations manage stability. But they work in silos → delays, finger-pointing. Can we build and run software together, better?
DevOps: A culture and set of practices that bring Development and Operations teams together to deliver software faster, safer, and continuously.

No Strict Definition

Amazon Web Services (AWS): "DevOps is the combination of cultural philosophies, practices, and tools that increases an organization’s ability to deliver applications and services at high velocity"
A Survey of DevOps Concepts and Challenges - L. Leite: "DevOps is a collaborative and multidisciplinary effort within an organization to automate continuous delivery of new software versions, while guaranteeing their correctness and reliability"
The DevOps Handbook - Gene Kim, Patrick Debois, et al.: "DevOps is the outcome of applying the most trusted principles from the domain of physical manufacturing and leadership to the IT value stream. DevOps relies on bodies of knowledge from Lean, Theory of Constraints, the Toyota Production System, resilience engineering, learning organizations, safety culture, human factors, and many others. The result is world-class reliability, stability, and security at ever lower cost and effort; and accelerated flow and reliability through the technology value stream, including Product Management, Development, QA, IT Operations, and Infosec."

DevOps is about Collaboration

Dev + Ops = DevOps: Teams work together across the software lifecycle
Shared Goals: Build fast, release often, stay stable
Cultural Shift: Break down walls between teams

DevOps is about Quick Feedback

Business can’t wait: Waiting weeks or months for feedback delays product improvements
Defects shouldn't wait: Developers should be notified about code quality issues or unit test failures immediately
Continuous Improvement: Quick feedback loops help refine features and fix things faster

DevOps is about Automation

Manual = Slow + Error-Prone: Repeating steps by hand increases chances of mistakes
Not Scalable: What works for 1 server won’t work for 100
Automation Brings:
- Speed – Do things quickly
- Reliability – Same steps, every time (less chance of errors)

Key DevOps Practices

Version Control: All code (infra + app) tracked in Git
CI (Continuous Integration): Check code quality, run unit tests, and run other automated checks immediately after code is committed
CD (Continuous Delivery/Deployment): Automatically deploy code to your environments
Infrastructure as Code (IaC): Use code to create your resources
Observability: Understand what's happening inside a system (Metrics + Logs + Traces)

What is Continuous Integration (CI) and Continuous Deployment (CD)?

Continuous Integration (CI):
- Developers merge code changes often
- Automated systems build and test code every time
- Bugs are caught early
Continuous Deployment (CD):
- Code that passes all tests is deployed automatically
- No need to wait for a big release day
Continuous Delivery: Code is always ready to go live, but someone approves it
Continuous Deployment: Code goes live automatically after passing tests
General Tools: GitHub Actions, Jenkins, Argo CD
Cloud: AWS CodePipeline, AWS CodeBuild, AWS CodeDeploy, Azure DevOps, Google Cloud Build, Google Cloud Deploy
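
The difference between Continuous Delivery and Continuous Deployment can be sketched in a few lines of Python. This is a toy model, not how any of the tools above are implemented: the names `Stage`, `run_pipeline`, `auto_deploy`, and `approved` are hypothetical and exist only for illustration. Stages run in order, the pipeline stops at the first failure, and a passing build either goes live automatically or waits for a human approval.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    run: Callable[[], bool]  # returns True when the stage succeeds


def run_pipeline(
    stages: List[Stage], auto_deploy: bool, approved: bool = False
) -> Tuple[str, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns the outcome and the names of the stages that were executed.
    """
    executed: List[str] = []
    for stage in stages:
        executed.append(stage.name)
        if not stage.run():
            # Quick feedback: later stages are skipped once one fails.
            return "failed", executed
    # Every check passed, so the build is ready to go live.
    if auto_deploy or approved:
        return "deployed", executed
    return "awaiting approval", executed


stages = [Stage("build", lambda: True), Stage("unit tests", lambda: True)]
print(run_pipeline(stages, auto_deploy=True))   # Continuous Deployment
print(run_pipeline(stages, auto_deploy=False))  # Continuous Delivery
```

With `auto_deploy=False` the pipeline ends in "awaiting approval" until someone passes `approved=True`, which mirrors the manual approval step in Continuous Delivery.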

What is Infrastructure as Code (IaC)?

Why Infrastructure as Code (IaC)?

Scenario: Imagine setting up infrastructure (servers, databases, and networks) by hand each time you create a new environment. It is slow, inconsistent, and error-prone. Can we automate and repeat this reliably?
IaC: Write code to provision and configure infrastructure – just like you write code for applications.

What is Infrastructure as Code?

Definition: A practice of creating and configuring infrastructure (like servers, networks, databases) using machine-readable definition files.
Core Idea: Treat infrastructure setup just like application code – store it in Git, version it, review it, and automate its deployment.
Goal:
- Automate environment setup
- Eliminate manual steps
- Ensure identical environments in Dev, QA, and Prod
Types:
- Infrastructure Provisioning – to create infra
- Configuration Management – to install and manage software on that infra

1: Infrastructure Provisioning

What It Does: Creates cloud resources such as networks, virtual machines, storage buckets, load balancers, and databases using code
Tools: Terraform (multi-cloud using HCL), Pulumi (multi-cloud using general-purpose programming languages), AWS CloudFormation, Azure Bicep, Google Cloud Deployment Manager
Benefits: Create a complete environment in minutes with one command
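
Provisioning tools are typically declarative: you describe the resources you want, and the tool works out what to create, change, or delete. The Python sketch below illustrates that idea only. It is not the algorithm of Terraform or any other tool listed above, and the `plan` function and the resource dictionaries are hypothetical.

```python
from typing import Dict, List, Tuple

Resource = Dict[str, str]  # simplified: a resource is a bag of settings


def plan(
    desired: Dict[str, Resource], current: Dict[str, Resource]
) -> List[Tuple[str, str]]:
    """Compare desired and current state and list the actions needed."""
    actions: List[Tuple[str, str]] = []
    for name, spec in sorted(desired.items()):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != spec:
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("delete", name))
    return actions


desired = {
    "web-vm": {"size": "medium"},
    "app-db": {"engine": "postgres"},
}
current = {
    "web-vm": {"size": "small"},
    "old-bucket": {"region": "eu"},
}
for action, name in plan(desired, current):
    print(action, name)
```

Running the same plan against an environment that already matches the desired state produces no actions, which is what makes the process repeatable.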

2: Configuration Management

What It Does: Automates installation and configuration of software on servers (Install software, set timezone, update OS, configure app settings)
Tools: Ansible, Chef, Puppet
Benefits: Apply consistent configuration across 10s, 100s, or 1000s of servers
IaC = Provisioning + Configuration
Automate everything – from servers to software
Repeatable and scalable infrastructure – just like code
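
The configuration-management half relies on idempotence: applying the same configuration twice leaves the server unchanged the second time. A minimal Python sketch of that property follows. The `apply_config` function and the settings shown are hypothetical, and real tools such as Ansible, Chef, and Puppet do far more than compare dictionaries.

```python
from typing import Dict, List


def apply_config(server: Dict[str, str], desired: Dict[str, str]) -> List[str]:
    """Bring one server to the desired settings and return what changed."""
    changed: List[str] = []
    for key, value in desired.items():
        if server.get(key) != value:
            server[key] = value  # only touch settings that differ
            changed.append(key)
    return changed


desired = {"timezone": "UTC", "nginx": "installed"}
servers = [
    {"timezone": "PST"},
    {"timezone": "UTC", "nginx": "installed"},
]
for server in servers:
    print(apply_config(server, desired))
```

The same loop works for 2 servers or 2,000, and a second run reports no changes because every server already matches.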

What is Standardization? How do containers and container orchestration enable Standardization?

What is Standardization?

Scenario: Imagine using a different build and deployment process for every application because each one is written in a different programming language. Can we create one consistent process everywhere?
Standardization: Creating uniform processes, tools, and environments so that apps are deployed and run the same way across all stages – development, testing, and production.

How Containers Enable Standardization

Self-Contained Units: Containers package the app + dependencies + config together (Contain everything that an application needs to run!)
Same Image Everywhere: Dev, QA, Prod – run the same container
Portable: Runs on any system with a container runtime

Example

You build a Java app with specific versions of Java + libraries
Package it in a Docker container
Run the exact same container on your laptop, test server, and cloud
Result: Works the same everywhere → standardized deployment
What's more: You can use the same process for Python or Node.js applications as well. Build a container image and deploy it wherever you want.
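
Container images are identified by a content digest, so "the same image" can be checked rather than assumed. The Python sketch below models that idea in a simplified way. The manifest layout and the `image_digest` function are assumptions made for illustration and do not reproduce the real image format used by Docker or other runtimes.

```python
import hashlib
import json
from typing import Dict, List


def image_digest(
    app: str, runtime: str, libraries: List[str], config: Dict[str, str]
) -> str:
    """Hash everything the app needs to run into one identifier."""
    manifest = {
        "app": app,
        "runtime": runtime,
        "libraries": sorted(libraries),
        "config": config,
    }
    # Serialize deterministically so equal contents give equal digests.
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return "sha256:" + hashlib.sha256(canonical).hexdigest()


laptop = image_digest("orders-1.0.jar", "java-21", ["libA-2.1", "libB-3.0"], {"PORT": "8080"})
cloud = image_digest("orders-1.0.jar", "java-21", ["libB-3.0", "libA-2.1"], {"PORT": "8080"})
print(laptop == cloud)
```

If the digest on the laptop matches the digest in the cloud, both environments run the same app, dependencies, and configuration. Changing any one of them, such as the Java version, produces a different digest.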

What is Container Orchestration?

Scenario: You have dozens of containers running your app – across multiple servers. How do you scale them automatically? How do you find out if one of the containers is failing?
Kubernetes: An open-source platform that orchestrates (manages) containers – it automates deployment, scaling, and healing of containerized applications.

Why Container Orchestration/Kubernetes?

Manual Scaling is Hard: Kubernetes scales apps automatically
Resilience Needed: Kubernetes restarts crashed containers
Run Anywhere: Works locally, on-prem, and on all major clouds
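
Scaling and healing both come down to one repeated step: compare the desired number of healthy containers with what is actually running, then close the gap. The Python sketch below shows that step in isolation. The `reconcile` function and the status strings are hypothetical, and this is not Kubernetes code.

```python
from typing import List


def reconcile(desired_replicas: int, containers: List[str]) -> List[str]:
    """Return the container list after one reconciliation pass.

    Each entry in `containers` is a status: "running" or "crashed".
    """
    healthy = [status for status in containers if status == "running"]
    missing = desired_replicas - len(healthy)
    if missing > 0:
        # Healing and scaling up: start replacements for crashed or absent replicas.
        healthy.extend(["running"] * missing)
    elif missing < 0:
        # Scaling down: keep only as many replicas as requested.
        healthy = healthy[:desired_replicas]
    return healthy


print(reconcile(3, ["running", "crashed", "running"]))  # crashed replica is replaced
print(reconcile(5, ["running", "running", "running"]))  # scaled up to 5
```

An orchestrator runs a pass like this continuously, so a crashed container is replaced without anyone having to notice it first.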

Containers + Container Orchestration

Containers = Standard App Format
Orchestration = Standard Runtime & Operations
Together, they ensure consistency, reliability, and efficiency across the software delivery lifecycle.

What is Observability?

Why Observability?

Scenario: Your app is running slowly or returning errors. But you don’t know where or why. You need visibility across all systems.
Observability: The ability to understand what’s happening inside your system just by looking at external outputs like logs, metrics, and traces.

3 Pillars of Observability

1. Logs

What They Are: Text records of events (e.g., errors, warnings)
Use: Helps answer what happened
Example: “PaymentService failed: connection timeout”

2. Metrics

What They Are: Numeric values tracked over time
Use: Helps answer how the system is performing
Example: “CPU usage = 80%”, “Latency = 220ms”

3. Traces

What They Are: The journey of a request through multiple services
Use: Helps answer where the slowdown or failure is
Example: Trace shows 2.5s delay in OrderService during checkout
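
To make the three pillars concrete, the Python sketch below has one request handler emit all three signals using only the standard library. The in-memory lists, the `handle_checkout` function, and the metric name `checkout_latency_ms` are hypothetical. A real system would send this data through an instrumentation library to a backend rather than store it in Python lists.

```python
import time
from typing import Dict, List

logs: List[str] = []                   # text records of events
metrics: Dict[str, List[float]] = {}   # numeric values tracked over time
spans: List[dict] = []                 # one entry per step of a request


def handle_checkout(trace_id: str, fail: bool = False) -> None:
    """Handle one request and record a log, a metric, and a trace span."""
    start = time.perf_counter()
    if fail:
        # Log: what happened
        logs.append(f"trace={trace_id} PaymentService failed: connection timeout")
    duration_ms = (time.perf_counter() - start) * 1000
    # Metric: how the system is performing
    metrics.setdefault("checkout_latency_ms", []).append(duration_ms)
    # Trace span: where in the request path the time was spent
    spans.append(
        {"trace_id": trace_id, "service": "OrderService", "duration_ms": duration_ms}
    )


handle_checkout("req-1")
handle_checkout("req-2", fail=True)
print(len(logs), len(metrics["checkout_latency_ms"]), len(spans))
```

Putting the same `trace_id` in the log line and the span is what lets you jump from an error message to the exact request that produced it.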

Observability vs Monitoring

Monitoring = Alerts for known issues
Observability = Deep visibility to explore unknown issues
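
One way to picture the difference, as a rough Python sketch with hypothetical function names and event fields: monitoring checks a question that was decided in advance, while observability lets you slice the same data by whichever attribute turns out to matter.

```python
from collections import Counter
from typing import Dict, List


def cpu_alert(cpu_percent: float, threshold: float = 80.0) -> bool:
    """Monitoring: a fixed check for a known issue."""
    return cpu_percent >= threshold


def errors_by(events: List[Dict[str, str]], field: str) -> Counter:
    """Observability: group errors by any field chosen at investigation time."""
    return Counter(
        event.get(field, "unknown") for event in events if event["status"] == "error"
    )


events = [
    {"status": "error", "region": "eu", "version": "2.1"},
    {"status": "ok", "region": "us", "version": "2.0"},
    {"status": "error", "region": "eu", "version": "2.0"},
]
print(cpu_alert(85.0))
print(errors_by(events, "region"))
print(errors_by(events, "version"))
```

The alert only answers the question it was written for. Grouping by `region` or `version` was not planned ahead, which is the kind of exploration unknown issues require.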

Remember

Observability = Deep Insight into System Behavior
Helps teams build faster, detect earlier, and fix smarter
Tools/Services:
- OpenTelemetry: Standard way to collect all three pillars
- Metrics: Prometheus (open-source, pull-based), Grafana (visualization), CloudWatch (AWS), Azure Monitor, Google Cloud Monitoring
- Logging: ELK Stack (Elasticsearch, Logstash, Kibana), Loki (Grafana), AWS CloudWatch Logs, Azure Log Analytics, Google Cloud Logging
- Tracing: Jaeger (open-source), Zipkin (lightweight), Grafana Tempo, AWS X-Ray, Azure Application Insights, Google Cloud Trace

Can DevOps Be Done Without Cloud?

Yes – DevOps is Cloud-Friendly, Not Cloud-Dependent: DevOps is a culture + process, not tied to where your infrastructure runs. You can do DevOps with or without cloud.