Why is Version Control Essential? #


$ git log --pretty=format:"%h (%an) %s" -- app.py
9e7c2f5 (Ethan) Add logging to greet()
a3d1cbe (Deepa) Fix typo in greeting
6b2e9a1 (Charlie) Refactor: move greeting to main()
f1a72bb (Bob) Add greeting function
d0a1b22 (Alice) Initial version of app.py

What is Version Control?

  • Version Control Systems (VCS): Tools that track changes to files over time, allowing developers to manage code efficiently and collaborate without losing each other’s changes
  • Purpose: Ensures that code history is maintained, and developers can work together without overwriting each other’s work

Why is Version Control Essential?

  • Collaboration: Multiple developers can work on the same project without overwriting each other’s work
  • History Tracking: Easily view, compare, and restore previous versions of code
  • Branching: Developers can experiment with new features in separate branches without affecting the main code base
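
The "view, compare, and restore" point can be sketched with a throwaway repository. This is a minimal illustration, assuming Git and a POSIX shell are available; the temporary directory, the file contents, and the "Demo User" identity are placeholders, not part of the original material.

```shell
set -e

# Throwaway repository in a temporary directory (hypothetical demo setup)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

# Two versions of the same file, one commit each
echo "print('Hello')" > app.py
git add app.py
git commit -q -m "Initial version of app.py"
echo "print('Hello, world')" > app.py
git commit -q -am "Change greeting"

# View: list the commits that touched app.py
git log --oneline -- app.py

# Compare: show what changed between the previous and the current commit
git diff HEAD~1 HEAD -- app.py

# Restore: bring the previous version of app.py back into the working directory
git checkout HEAD~1 -- app.py
cat app.py
```

The restore step only changes the file on disk and in the staging area; the two commits stay in the history, so the newer version can be brought back the same way.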

How Does Git’s Distributed Version Control Improve Collaboration and Efficiency? #


Types of VCS:

  • Centralized Version Control Systems (CVCS): A single central server stores all versions of the code (e.g., SVN, CVS)
  • Distributed Version Control Systems (DVCS): Every developer has a complete copy of the repository (e.g., Git, Mercurial)

Comparison

| Aspect | Centralized VCS (CVCS) | Distributed VCS (DVCS) |
|---|---|---|
| Repository Location | Single central server | Each developer has a complete copy |
| Offline Work (checking history, etc.) | Limited | Fully supported |
| Single Point of Failure | Yes (central server outage) | No (any copy can restore the project) |
| Backup | Central server must be backed up | Each clone serves as a backup |
| Performance | Network-dependent | Local operations are faster |

Step by Step: Using a DVCS while Offline

  • (Step 1) Cloning the Repository: git clone https://github.com/in28minutes/devops-master-class.git. Automatically creates and checks out the repository’s default branch (main in this example) locally
  • (Step 2) Go Offline Immediately After Cloning: You can disconnect from the internet and still do a lot offline
  • (Step 3) Make a Change to a File: Edit a file like README.md or kubernetes/02-kubernetes-for-beginners.md
  • (Step 4) Stage and Commit the Change Locally: (git add README.md, git commit -m "Updated README with offline changes")
  • (Step 5) Compare With Remote: git diff origin/main (compares your local work with the last known state of the remote branch; plain git diff would only show unstaged changes)
  • (Step 6) View History of a File: git log README.md, git blame README.md (helps you understand who changed what and when, all offline)
  • (Step 7) Switch to Another Local Branch: git checkout feature-branch-a (and play with it)
  • (Advantage) Supports A Lot of Features Offline: You can commit (locally step by step), compare, inspect, and undo changes entirely without internet
  • (Advantage) Sync Later When Online: Use git push origin main when you're back online to upload all your local changes
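
The steps above can be sketched end to end without any network access. This is a hedged illustration, not the course's own script: a bare repository on local disk stands in for GitHub, and the paths, the "Demo User" identity, and the seed commit are all assumptions made for the demo.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for the hosted repository (assumption: a bare repo on local disk
# plays the role of GitHub so the sketch needs no network access)
git init -q "$work/seed"
git -C "$work/seed" config user.name "Demo User"
git -C "$work/seed" config user.email "demo@example.com"
echo "# Demo project" > "$work/seed/README.md"
git -C "$work/seed" add README.md
git -C "$work/seed" commit -q -m "Initial commit"
git -C "$work/seed" branch -M main
git clone -q --bare "$work/seed" "$work/remote.git"

# Step 1: clone; everything up to the final push needs no connection
git clone -q "$work/remote.git" "$work/dev"
cd "$work/dev"
git config user.name "Demo User"
git config user.email "demo@example.com"

# Steps 3-4: edit, stage, and commit locally
echo "Edited while offline" >> README.md
git add README.md
git commit -q -m "Updated README with offline changes"

# Step 5: compare with the last known state of the remote branch
git diff origin/main
git log --oneline origin/main..main   # commits that are not pushed yet

# Step 6: inspect the history of a file
git log --oneline -- README.md
git blame README.md

# Step 7: create and switch to a local branch, then return to main
git checkout -q -b feature-branch-a
git checkout -q main

# Back online: upload the local commits
git push -q origin main
```

Only the clone and the final push talk to the remote; every command in between reads from the local .git folder.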

How Does Git’s Distributed Version Control Improve Collaboration and Efficiency?

  • Complete History for Everyone: Every developer’s local copy includes the full project history, making offline work possible
  • Includes All Branches: Local clones contain all remote branches that were fetched, allowing developers to switch, create, or merge branches offline
  • Faster Operations: Most Git commands (like commit, diff) are local, providing fast performance
  • No Single Point of Failure: If the main server goes down, any local copy can be used to restore the project
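
The "no single point of failure" point can be illustrated by deleting a stand-in server and rebuilding it from a developer's clone. This is a sketch under stated assumptions: the directory names (project, server.git, dev, restored.git) and the identity are hypothetical, and the "server" is just a bare repository on local disk.

```shell
set -e
work="$(mktemp -d)"

# Build a small project with three commits (hypothetical content)
git init -q "$work/project"
git -C "$work/project" config user.name "Demo User"
git -C "$work/project" config user.email "demo@example.com"
for n in 1 2 3; do
  echo "change $n" >> "$work/project/notes.txt"
  git -C "$work/project" add notes.txt
  git -C "$work/project" commit -q -m "Change $n"
done
git -C "$work/project" branch -M main

# The "central" server and one developer clone
git clone -q --bare "$work/project" "$work/server.git"
git clone -q "$work/server.git" "$work/dev"

# Simulate losing the server
rm -rf "$work/server.git"

# Rebuild a server-side repository from the developer's clone
git clone -q --bare "$work/dev" "$work/restored.git"
git -C "$work/restored.git" log --oneline main
```

The restored repository contains the branches that exist locally in the clone it was built from; branches a developer never checked out would have to come from another clone.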

Why is Git’s Snapshot System a Game-Changer? #


What are Snapshot-Based Systems?

  • Snapshot Storage: Git captures the entire state of your project at each commit, like a photograph
  • Complete Versions: Instead of storing just the changes (deltas), Git stores a complete version of changed project files
  • Efficient Storage: Unchanged files are linked to previous versions rather than duplicated, saving space

How is it different from Delta-Based Systems (SVN)?

  • Delta-Based Storage: Only stores the differences between file versions
  • More Processing: Requires more computing to reconstruct previous versions
  • Complex Branching: Changes are applied sequentially, making branching more complex

An Example of a Snapshot-Based System

  • COMMIT 1
    • File Content:
      Welcome
      Version 1
    • Git Storage: Stores a full snapshot of the file as-is
  • COMMIT 2
    • File Content:
      Welcome
      Version 2
    • Git Storage: Stores a new full snapshot of the modified file
  • COMMIT 3
    • File Content:
      Hello
      Version 3
    • Git Storage: Stores yet another full snapshot of the file (Git optimizes behind the scenes using compression and shared objects)
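
The three commits above can be reproduced in a scratch repository to see the snapshot model at work. This is a minimal sketch, assuming Git and a POSIX shell; file.txt mirrors the example, while stable.txt, the temporary directory, and the identity are hypothetical additions used to show that unchanged files are shared between commits.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

# COMMIT 1
printf 'Welcome\nVersion 1\n' > file.txt
echo "never changes" > stable.txt         # hypothetical unchanged file
git add file.txt stable.txt
git commit -q -m "Commit 1"

# COMMIT 2
printf 'Welcome\nVersion 2\n' > file.txt
git commit -q -am "Commit 2"

# COMMIT 3
printf 'Hello\nVersion 3\n' > file.txt
git commit -q -am "Commit 3"

# Any version can be printed directly from its commit
git show HEAD~2:file.txt
git show HEAD:file.txt

# Object IDs of file.txt in each commit: three different blobs
git rev-parse HEAD~2:file.txt HEAD~1:file.txt HEAD:file.txt

# Object IDs of stable.txt in the first and last commit: the same blob, stored once
git rev-parse HEAD~2:stable.txt HEAD:stable.txt
```

Each commit refers to a full copy of file.txt, while all three commits point at a single stored copy of stable.txt.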

An Example of a Delta-Based System

  • COMMIT 1
    • File Content:
      Welcome
      Version 1
    • SVN Storage: Stores the complete file for the initial version
  • COMMIT 2
    • Delta Stored: Change line 2 → Version 2
    • SVN Storage: Only the difference from version 1 is stored
  • COMMIT 3
    • Delta Stored: Change line 1 → Hello; change line 2 → Version 3
    • SVN Storage: Stores only the differences (deltas) from previous version

Why is Git’s Snapshot System a Game-Changer?

  • Speed: Quickly access any commit without calculating differences
  • Reliability: Full file versions are always available
  • Simplified Merging: Merging is faster because complete versions are available

A Short History of Git #


History of Git

  • The Need: In 2005, the Linux kernel development team required a powerful, distributed version control system
  • The Problem: They had been using BitKeeper, a proprietary version control tool, and lost free access to it in 2005
  • The Solution: Linus Torvalds, the creator of Linux, developed Git to solve these problems
  • Git is Popular Because of its Key Design Principles:
    • Speed: Git was designed to be fast, even for large projects
    • Security: Changes are securely tracked and verifiable
    • Decentralization: Developers can work independently, without relying on a central server
  • 2005: Git was released as an open-source project
  • 2008: GitHub launched, making Git widely accessible for collaborative projects (Git is a version control tool; GitHub is a cloud-based platform for hosting and managing Git repositories)
  • 2010s: Git adoption surged as open-source communities and companies shifted away from Subversion (SVN) and other centralized tools
  • 2018: Microsoft acquired GitHub, further boosting Git’s presence in the enterprise market
  • Today: Git is the standard for version control in software development

Key Commercial Products Built Around Git

  • GitHub: Hosted Git repositories with collaboration, issue tracking, CI/CD, and security tools
  • GitLab: Git repository hosting with integrated DevOps lifecycle tools (CI/CD, container registry, monitoring)
  • Bitbucket (by Atlassian): Git hosting with Jira and Trello integration
  • Azure Repos: Microsoft's Git repository service integrated with Azure DevOps
  • AWS CodeCommit: Git-based repository service hosted by Amazon Web Services
  • Cloud Source Repositories (CSR): Google Cloud's Git-compatible source control service for storing and managing code in private Git repositories


Important Version Control Systems Before Git

| Tool | Type | Strengths | Weaknesses | Typical Use Case |
|---|---|---|---|---|
| CVS (Concurrent Versions System) | Centralized (CVCS) | Simple and lightweight | Limited merge support, prone to errors in large projects | Used by early open-source projects |
| Subversion (SVN) | Centralized (CVCS) | User-friendly with basic features | Network-dependent for most operations, performance degrades with large repos or many branches | Formerly popular for corporate projects |
| Perforce | Centralized (CVCS) | Excellent performance with large codebases and binary files (like game art and assets) | Complex setup, resource-intensive | Common in game development |

Comparing Git vs SVN

| Feature | Git | Subversion (SVN) |
|---|---|---|
| Type | Distributed (DVCS) | Centralized (CVCS) |
| Commit Model | Snapshots | Deltas |
| Offline Work | Fully supported | Limited |
| Merging | Fast and efficient | Can be slow |
| Branching | Lightweight and fast | Branching is slower |
| Used By | Most enterprise and open-source projects | Some legacy enterprise projects |
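
The "lightweight and fast" branching entry can be illustrated in a scratch repository: creating a branch only adds a new name that points at an existing commit. This is a hedged sketch; the branch name feature-x, the file app.txt, and the identity are hypothetical.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

echo "base" > app.txt
git add app.txt
git commit -q -m "Base commit"
git branch -M main

# Creating a branch copies no files: both names resolve to the same commit
git branch feature-x
git rev-parse main feature-x

# Work on the branch; main is unaffected until the merge
git checkout -q feature-x
echo "experiment" >> app.txt
git commit -q -am "Try an experiment"
git rev-parse main feature-x

# Merge back; main has no new commits of its own, so Git fast-forwards it
git checkout -q main
git merge -q feature-x
```

After the merge both branch names point at the same commit again, and app.txt on main contains the experimental line.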

What is GitHub? #


  • Platform on Top of Git: GitHub is a web-based platform built on top of Git that enhances what Git can do by adding collaboration and project management features
  • Collaboration & Hosting Made Easy: Designed to host Git repositories and provide powerful tools for team collaboration, code review, and project tracking. Think of it as Git++
  • Open Source Friendly: Widely used for hosting open-source projects and connecting with the global developer community
  • Public and Private Repositories: Choose between public repos for sharing with the world or private repos for secure, team-only collaboration
  • User-Friendly Interface for Beginners: Offers a simple and intuitive UI for those who are new to Git, making common tasks easier to perform
  • Pull Requests and Code Reviews: Enables developers to propose changes and collaborate through peer reviews before merging to main branches
  • Built-In Issue Tracking: Manage bugs, feature requests, and team tasks using GitHub’s issue system
  • Team Discussions and Knowledge Sharing: Use the Discussions tab as a dedicated space for questions, ideas, and decisions, all in one place
  • GitHub Actions for Automation: Set up workflows to automatically build, test, and deploy your code whenever changes are pushed
  • GitHub Pages for Web Hosting: Host static websites straight from your repository, ideal for portfolios, documentation, or demos

Git vs GitHub

| Feature | Git | GitHub |
|---|---|---|
| Version Control | Distributed version control for local version management | Cloud-based platform for hosting Git repositories and making them accessible to all team members |
| Project Management | No built-in project management tools | Issue tracking, milestones, and project boards |
| CI/CD Integration | Requires manual setup of external CI/CD tools | Integrated support for GitHub Actions to automate workflows |
| Community | No direct community interaction | Community engagement through stars, forks, and discussions |

What is a Git Repository? #


  • Scenario: Imagine building a project where multiple developers work on the same code, often at the same time. Without a Git repository, it’s hard to track changes, avoid overwriting each other’s work, or recover from mistakes.
  • Git repository: A Git repository stores all your code and its history, like a time machine for your project
    • Typically, each of your projects gets its own Git repository
  • Tracks Every Change with Context: Git records who made each change, when they made it, and why (through the commit message), enabling complete change management
  • Enables Safe Experimentation: Try new ideas locally without fear. If anything breaks, you can always go back to a previous version
  • Local Repository for Private Development: You work on your own machine in a fully functional Git environment, even without internet (git init)
  • Remote Repository for Team Collaboration: A shared repository helps your team work together on the same codebase (create a repository on GitHub and git clone it to your local machine)
  • Simple Process for Seamless Collaboration: Work locally, commit changes, and push to the remote repo; teammates pull updates to stay in sync

Practical Example: Create a New Git Repo and Push to GitHub

The goal of this workflow is to set up version control for your project using Git and connect it to GitHub

# STEP 1: Create a new local Git repository
git init
# WHAT: Initializes a new Git repo in current folder
# WHEN: You are starting a project from scratch

# STEP 2: Create a new file or modify files
echo "Hello Git" > hello.txt
# WHAT: Creates a file with sample content

# STEP 3: Check the status of the repo
git status
# WHAT: Shows untracked/modified/staged files
# WHY: Helps verify what will be committed

# STEP 4: Add file(s) to the staging area
git add hello.txt
# VARIATION:
# git add .   → stages all changes under the current directory
# git add -A  → stages all changes in the whole repository (new, modified, deleted)

# STEP 5: Commit staged files
git commit -m "Initial commit"
# WHAT: Creates a snapshot of your staged changes
# BEST PRACTICE: Use short, clear commit messages

# STEP 6: Create a new GitHub repository
# Visit https://github.com/new
# NAME: my-first-git-repo (example)
# Keep the repo EMPTY (no README/.gitignore)

# STEP 7: Connect local repo to GitHub
git remote add origin \
https://github.com/youruser/my-first-git-repo.git
# WHAT: Adds a remote named 'origin' pointing to GitHub
# WHY: So you can push/pull code to/from GitHub

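# STEP 8: Push code to GitHub
git branch -M main
# WHAT: Renames the current branch to 'main'
# WHY: Depending on your Git configuration, git init may have created 'master'
git push -u origin main
# WHAT: Pushes 'main' branch to 'origin' remote
# WHY: Publishes your local code to GitHub
# -u: Sets 'origin main' as default for future pushes
# Future pushes → just run: git push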

How Git Organizes and Stores Your Code #


  • Working Directory – Where Files Live and Evolve: This is your active project folder where you create and edit files. It reflects your latest code and uncommitted changes
  • Index (Staging Area) – Prepare Files for Commit: When you run git add, Git stores a snapshot of the file in the staging area, ready to be committed
  • Local Object Database – Stores Full Commit History: Once you run git commit, Git creates objects like blobs, trees, and commits and stores them inside the hidden .git/objects folder. This is your full version history.
  • Remote Object Database – Shared Copy in the Cloud: When you run git push, Git sends your commits to a remote repository like GitHub, GitLab, or Bitbucket so others can collaborate
  • Each Store Has a Specific Role: The working directory holds current files, the index stages selected changes, the local object database saves commit history, and the remote database enables sharing
  • Everything Happens Inside .git Folder: Your entire Git project metadata, commit history, references, and configuration are stored here, with no external database needed
  • Remote Store Mirrors Local State for Collaboration: Remote repositories hold the same Git objects, enabling team members to pull and push changes seamlessly
  • Best Practice: Understanding where objects are stored helps troubleshoot issues with commits, staging, or syncing with remote
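
The working directory, index, and local object database can be inspected directly in a scratch repository. This is a minimal sketch, assuming Git and a POSIX shell; the temporary directory, hello.txt, and the identity are placeholders. The remote object database is left out because it needs a hosted repository.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

# Working directory: the file exists on disk, the index is still empty
echo "Hello Git" > hello.txt
git ls-files --stage

# Index (staging area): hello.txt is now listed with the ID of its content
git add hello.txt
git ls-files --stage

# Local object database: the commit adds a tree and a commit object
git commit -q -m "Initial commit"
git cat-file --batch-all-objects --batch-check

# Everything lives inside the hidden .git folder
ls .git
git count-objects -v
```

The object listing shows one blob (the file content), one tree (the directory listing), and one commit, each identified by its hash.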

How a File Moves Through Git: From Untracked to Pushed #


  • Flow for a new file
    • Start as Untracked – File Not Known to Git Yet: When you create a new file, Git does not track it until you explicitly add it
    • Staged – Ready to Be Committed: Running git add makes the file tracked and places it in the staging area, marking it as ready for the next commit
    • Committed – Saved in Local Clone: Use git commit to take a snapshot of staged files; they are now part of your local Git history inside the .git folder
    • Pushed – Synced to Remote Repository: Use git push to send your committed changes to a remote like GitHub or GitLab for backup and collaboration
    • Easy to Visualize Flow: File states flow like this: Untracked → Tracked and Staged → Committed → Pushed
  • Flow for an already committed file
    • Unmodified – No Changes Since Last Commit: The file remains in a stable state; nothing new has been changed after the last commit
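    • Modified – You Made Changes, But Didn’t Stage Yet: When you edit a file, Git marks it as modified but does not include it in the next commit unless you stage it
    • Staged – Change Marked for the Next Commit: Run git add to place the modified file in the staging area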
    • Committed – Change Saved in Local Clone: Use git commit to take a snapshot of staged files; they are now part of your local Git history inside the .git folder
    • Pushed – Synced to Remote Repository: Use git push to send your committed changes to a remote like GitHub or GitLab for backup and collaboration
    • Easy to Visualize Flow: File states flow like this: Modified → Staged → Committed → Pushed
  • Foundation: This lifecycle is the foundation of Git’s version control. It allows developers to track, manage, and share changes confidently and consistently across teams and time
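
The file states described above can be observed with git status --short after each step. This is a minimal sketch, assuming Git and a POSIX shell; notes.txt, the temporary directory, and the identity are placeholders. The push step is left out because it needs a remote repository.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

# Flow for a new file
echo "v1" > notes.txt
git status --short      # untracked: "?? notes.txt"
git add notes.txt
git status --short      # staged as a new file: "A  notes.txt"
git commit -q -m "Add notes.txt"
git status --short      # unmodified: no output

# Flow for an already committed file
echo "v2" >> notes.txt
git status --short      # modified, not staged: " M notes.txt"
git add notes.txt
git status --short      # modified and staged: "M  notes.txt"
git commit -q -m "Update notes.txt"
git status --short      # unmodified again: no output
```

In the two-character status code, the first column describes the staging area and the second the working directory, which is why the "M" moves from the right to the left once the change is staged.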

What happens in the background when you commit code? (git commit) #


  • Record Staged Changes in the Local Object Database: File contents are stored as blob objects in the .git/objects folder (git add already writes them when you stage), and the commit makes them part of your version history
  • Create a Tree Object for the Snapshot: Git builds a tree structure representing the exact state of your files at the time of the commit
  • Generate a Commit Object with Metadata: Git adds information like author name, timestamp, and commit message to describe what the change was about
  • Link Commit to the Snapshot (Tree): The commit object points to the tree object, creating a traceable connection between the message and the actual files
  • Assign a Unique Commit ID: Every commit is identified by a hash of its contents (SHA-1 by default), which can be used to identify, share, or roll back to that point
  • Chain Commits Together with a History Pointer: Each commit includes a reference to its parent commit (a merge commit has more than one), forming a chain that holds the full history of your project
  • Why It Matters: Commits help you track progress, roll back mistakes, and understand the evolution of your code over time
    • Git’s power comes from using commits as the foundation for structure, traceability, and control
    • Branches and tags point to commits, so without commits there is nothing to track or navigate
    • Use commits to checkout, tag, revert, or explore any version
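
The objects described above can be printed with git cat-file. This is a minimal sketch, assuming Git and a POSIX shell; the temporary directory, hello.txt, and the identity are placeholders.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

echo "Hello Git" > hello.txt
git add hello.txt
git commit -q -m "First commit"
echo "More text" >> hello.txt
git commit -q -am "Second commit"

# Commit object: tree pointer, parent pointer, author, committer, message
git cat-file -p HEAD

# Tree object: the snapshot, listing each file with the ID of its blob
git cat-file -p 'HEAD^{tree}'

# Object types behind the two names used above
git cat-file -t HEAD
git cat-file -t 'HEAD^{tree}'

# The parent line of the latest commit is the ID of the commit before it
git rev-parse HEAD~1
```

The first commit has no parent line; every later commit carries one, which is what links the history into a chain.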