Git & Version Control // Megha Bose

Prerequisite:

Command Line & Unix Philosophy

Overview

Git is a distributed version control system that models history as a directed acyclic graph of immutable, content-addressed commits. Every commit points to its parent(s), a tree of blobs, an author, and a timestamp - and is identified by the SHA-1 hash of all that content. Branching is nearly free: a branch is just a named pointer to a commit. Merging reconciles divergent histories; rebasing replays commits onto a new base for a linear history.

Problem it solves: Coordinating changes from multiple developers without overwriting each other’s work, and tracking every change with author, timestamp, and message for auditing and rollback.
Alternatives: Mercurial (similar model, different UX); SVN/Perforce (centralised, better for large binary assets); Fossil (includes issue tracker and wiki).
Pros: Universally adopted; powerful branching model; GitHub/GitLab ecosystem; works offline; SHA-based integrity verification.
Cons: History rewriting (rebase, amend) causes problems in shared branches; large binary files bloat repos (use Git LFS); steep learning curve for conflict resolution.

The Git Data Model

Git stores every version of every file as an object in .git/objects/. The three core object types are listed below (annotated tags are a fourth):

Blob - the raw content of a file (no filename, no metadata)
Tree - a directory listing mapping names to blob or tree SHA-1 hashes
Commit - a pointer to a root tree, a pointer to parent commit(s), author/committer metadata, and a message

This content-addressable structure means identical content is stored once across the entire history, and corruption is detectable whenever an object is read or checked with git fsck, because the damaged content no longer matches its hash.
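To make content addressing concrete, here is a minimal Python sketch of how a blob ID is derived in a SHA-1 repository: Git hashes a short header followed by the raw bytes. The function name git_blob_id is made up for illustration, and the sketch skips the zlib compression Git applies when writing the object to disk.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # The hashed bytes are "blob <size>\0" followed by the file content.
    # The filename is not part of the input, which is why identical
    # content is stored only once.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Same result as: printf 'hello\n' | git hash-object --stdin
print(git_blob_id(b"hello\n"))  # ce013625030ba8dba906f756967f9e9ca394464a
```

Changing a single byte gives a different ID, which is how a mismatch between content and name reveals corruption.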

The three zones of a Git project:

working tree  →  staging area (index)  →  repository (.git/)
    edit             git add                 git commit
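The three zones can be sketched as a toy Python model. ToyRepo and its methods are hypothetical names, not Git's implementation; the point is only that a commit snapshots the index, not the working tree.

```python
class ToyRepo:
    """Toy model of the working tree, the index, and the repository."""

    def __init__(self):
        self.working_tree = {}  # path -> content currently on disk
        self.index = {}         # path -> content staged for the next commit
        self.commits = []       # committed snapshots, oldest first

    def edit(self, path, content):
        # Editing only touches the working tree.
        self.working_tree[path] = content

    def add(self, path):
        # "git add" copies the current content into the index.
        self.index[path] = self.working_tree[path]

    def commit(self, message):
        # "git commit" records a snapshot of the index.
        self.commits.append({"message": message, "snapshot": dict(self.index)})


repo = ToyRepo()
repo.edit("file.py", "v1")
repo.add("file.py")
repo.edit("file.py", "v2")  # edited again after staging, not re-added
repo.commit("feat: add user auth")
print(repo.commits[-1]["snapshot"])  # {'file.py': 'v1'}
```

The second edit stays in the working tree until it is staged with another add.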

Core Commands

git init                    # create a new repo in current directory
git clone https://...       # copy a remote repo locally
git status                  # show modified, staged, untracked files
git add file.py             # stage a specific file
git add -p                  # interactively stage chunks (highly recommended)
git commit -m "feat: add user auth"
git push origin main        # push local commits to remote
git pull                    # fetch + merge (or fetch + rebase with --rebase)
git fetch origin            # download remote commits without merging

Branching and Merging

git branch feature/login    # create branch
git switch feature/login    # switch to it (Git 2.23+; covers the branch-switching half of checkout)
git switch -c feature/login # create and switch in one command

git merge feature/login     # merge into current branch
git rebase main             # replay current branch commits on top of main

Fast-forward merge happens when the current branch (the one being merged into) has no new commits since the branch point - Git just moves the pointer forward, no merge commit needed.

3-way merge is used when both branches have diverged - Git finds the common ancestor and combines the diffs, producing a merge commit with two parents.
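The choice between the two merge types comes down to an ancestry check on the commit graph. A minimal sketch, assuming the graph is given as a dict from commit name to parent names (the names A to D and the function names are illustrative):

```python
def ancestors(parents, commit):
    """Return every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge_kind(parents, current, other):
    """Classify what merging `other` into `current` would require."""
    if other in ancestors(parents, current):
        return "already up to date"
    if current in ancestors(parents, other):
        return "fast-forward"  # current has nothing new: just move the pointer
    return "three-way"         # both sides have commits the other lacks


# A <- B <- C (feature) and B <- D (a later commit on main)
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}
print(merge_kind(history, "B", "C"))  # fast-forward
print(merge_kind(history, "D", "C"))  # three-way
```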

Rebase vs merge: rebase rewrites commit history for a linear graph, making git log easier to read; merge preserves the true history of when and where work was done. Never rebase commits already pushed to a shared branch.
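Why rebasing shared commits is dangerous follows from the data model: a commit's ID depends on its parent, so replaying the same changes onto a new base produces new commits. The sketch below uses a simplified ID (a hash of the parent ID and message only) as a stand-in for Git's real commit hash, which also covers the tree, author, and timestamps.

```python
import hashlib


def commit_id(parent_id, message):
    """Simplified commit ID: depends on the parent, so it changes with the base."""
    return hashlib.sha1(f"{parent_id}\n{message}".encode("utf-8")).hexdigest()


def replay(messages, base_id):
    """Replay a series of commits on top of base_id and return their new IDs."""
    ids, parent = [], base_id
    for message in messages:
        parent = commit_id(parent, message)
        ids.append(parent)
    return ids


messages = ["feat: login form", "feat: validation"]
before = replay(messages, "old-base")
after = replay(messages, "new-base")
# Same messages, different base: every ID differs.
print([a != b for a, b in zip(before, after)])  # [True, True]
```

Anyone who already pulled the old commits now holds history that no longer matches the branch.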

Merge Conflicts

When both branches change the same lines in different ways, Git cannot auto-resolve and writes conflict markers into the file:

<<<<<<< HEAD
return user.email.lower()
=======
return user.email.strip().lower()
>>>>>>> feature/login

Edit the file to the correct version, remove the markers, then:

git add file.py
git commit   # complete the merge
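The rule behind a conflict can be sketched with a line-by-line three-way merge. This is a toy version that assumes all three files have the same number of lines; real Git aligns hunks with a diff algorithm first. The base line in the usage example is a hypothetical common ancestor, not something shown above.

```python
def merge_line(base, ours, theirs):
    """Merge one line three ways; return None when both sides changed it differently."""
    if ours == theirs:
        return ours      # both sides agree
    if ours == base:
        return theirs    # only their side changed it
    if theirs == base:
        return ours      # only our side changed it
    return None          # conflict


def three_way_merge(base, ours, theirs):
    """Return (merged_lines, conflicting_line_indexes)."""
    if not len(base) == len(ours) == len(theirs):
        raise ValueError("toy merge needs files of equal length")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        line = merge_line(b, o, t)
        if line is None:
            conflicts.append(i)
            merged.extend(["<<<<<<< ours", o, "=======", t, ">>>>>>> theirs"])
        else:
            merged.append(line)
    return merged, conflicts


base = ["return user.email"]  # hypothetical common ancestor
ours = ["return user.email.lower()"]
theirs = ["return user.email.strip().lower()"]
print(three_way_merge(base, ours, theirs)[1])  # [0]
```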

Stash, Tags, and Log

git stash               # shelve uncommitted changes to tracked files (add -u to include untracked files)
git stash pop           # reapply the most recent stash and drop it from the list
git stash list          # see all stashes

git tag v1.0.0          # lightweight tag
git tag -a v1.0.0 -m "Release 1.0.0"  # annotated tag (preferred)
git push origin --tags

git log --oneline --graph --all   # visual branch history
git log --author="Megha" --since="2 weeks ago"

The `.gitignore` File

List patterns for untracked files Git should ignore. Files that are already tracked stay tracked until removed with git rm --cached:

# Python
__pycache__/
*.pyc
.venv/

# Environment
.env
*.env.local

# Build output
dist/
build/
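How these patterns select paths can be approximated with Python's fnmatch. This is only a sketch: it handles plain name patterns and trailing-slash directory patterns, not negation (!), ** or leading-slash anchoring, and it assumes every path passed in names a file. is_ignored is a made-up helper name.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

PATTERNS = ["__pycache__/", "*.pyc", ".venv/", ".env", "*.env.local", "dist/", "build/"]


def is_ignored(path, patterns=PATTERNS):
    """Approximate whether a file path matches any of the ignore patterns."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: matches only directory names, at any depth.
            if any(fnmatchcase(part, pattern[:-1]) for part in parts[:-1]):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            # Pattern without a slash: matches a name at any depth.
            return True
    return False


print(is_ignored("app/__pycache__/mod.cpython-311.pyc"))  # True
print(is_ignored("src/main.py"))                          # False
```

For real repositories, git check-ignore -v reports which pattern matches a given path.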

Undo Operations

# Undo the last commit but keep changes staged
git reset --soft HEAD~1

# Undo the last commit, unstage the changes (default)
git reset --mixed HEAD~1

# Undo the last commit and discard the changes entirely (also drops uncommitted changes to tracked files)
git reset --hard HEAD~1

# Safely undo a commit by creating a new revert commit (safe for shared branches)
git revert HEAD

# Discard changes to a specific file in the working tree
git restore file.py
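The three reset modes differ only in how far the rollback reaches. A toy Python model, assuming history is a list of snapshots and that the index and working tree are plain dicts (all names here are illustrative):

```python
def reset_to_parent(state, mode):
    """Toy model of `git reset --<mode> HEAD~1`.

    state has three keys: "history" (snapshots, newest last),
    "index" and "worktree" (dicts of path -> content).
    """
    if mode not in ("soft", "mixed", "hard"):
        raise ValueError(f"unknown mode: {mode}")
    history = state["history"][:-1]  # every mode moves the branch back one commit
    target = history[-1]
    index = dict(state["index"])
    worktree = dict(state["worktree"])
    if mode in ("mixed", "hard"):
        index = dict(target)         # unstage: the index matches the new HEAD
    if mode == "hard":
        worktree = dict(target)      # discard: the working tree matches too
    return {"history": history, "index": index, "worktree": worktree}


state = {
    "history": [{"file.py": "v1"}, {"file.py": "v2"}],
    "index": {"file.py": "v2"},
    "worktree": {"file.py": "v2"},
}
print(reset_to_parent(state, "mixed")["index"])     # {'file.py': 'v1'}
print(reset_to_parent(state, "mixed")["worktree"])  # {'file.py': 'v2'}
```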

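The reflog is your safety net - it records, in your local repository only, every position HEAD has been at. Entries eventually expire (by default after 30 days for commits no longer reachable from any ref, 90 days otherwise):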

git reflog          # see the history of HEAD movements
git reset --hard HEAD@{3}  # jump back to 3 moves ago
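The HEAD@{n} notation reads naturally if the reflog is pictured as a list with the newest position first. A toy sketch with hypothetical commit names c1 to c3:

```python
class Head:
    """Toy HEAD that remembers every position it has pointed at."""

    def __init__(self, commit):
        self.reflog = [commit]  # index 0 is the current position, like HEAD@{0}

    def move(self, commit):
        # Commits, checkouts, and resets all record a new entry at the front.
        self.reflog.insert(0, commit)

    def at(self, n):
        """Return the commit HEAD pointed at n moves ago, like HEAD@{n}."""
        return self.reflog[n]


head = Head("c1")
head.move("c2")
head.move("c3")
head.move("c1")    # e.g. a hard reset back to c1
print(head.at(1))  # c3
```

No branch points at c3 after the reset, yet the reflog still names it, so it can be recovered.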

Common Workflows

Feature branch workflow (most common):

git switch main && git pull
git switch -c feature/new-thing
# ... make commits ...
git push -u origin feature/new-thing
# open a pull request, get review, merge to main

Gitflow adds dedicated develop, release/*, and hotfix/* branches - good for software with versioned releases.

Trunk-based development keeps branches short-lived (hours to a day) and merges frequently to main, relying on feature flags for incomplete work. Favoured by high-velocity teams with strong CI.

Examples

Typical feature branch session:

git switch main && git pull origin main
git switch -c feat/user-export

# ... write code ...
git add -p                         # stage only relevant chunks
git commit -m "feat: export users to CSV"

git push -u origin feat/user-export
# open PR → review → merge
git switch main && git pull
git branch -d feat/user-export     # clean up local branch

Fixing a bad commit with interactive rebase:

git log --oneline
# e4f5a6b fix: remove debug print   <-- squash this into the feat commit
# a1b2c3d feat: add export
# 9d8e7f6 feat: user profile page

git rebase -i HEAD~2
# In editor (commits are listed oldest first): change 'pick' to 'squash' on the fix commit
# Write the merged commit message, save, done
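In the rebase todo list, commits appear oldest first and squash folds a commit into the one listed just above it, so the first entry can never be a squash. A small sketch of that rule, handling only pick and squash (apply_todo is a hypothetical helper, not part of Git):

```python
def apply_todo(todo):
    """Apply a list of (action, message) pairs, oldest first; return resulting messages."""
    result = []
    for action, message in todo:
        if action == "pick":
            result.append(message)
        elif action == "squash":
            if not result:
                raise ValueError("cannot squash without a previous commit")
            # Fold this commit into the previous one and combine the messages.
            result[-1] = result[-1] + "\n\n" + message
        else:
            raise ValueError(f"unsupported action: {action}")
    return result


todo = [("pick", "feat: add export"), ("squash", "fix: remove debug print")]
print(len(apply_todo(todo)))  # 1
```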

Cherry-pick - apply a specific commit to another branch:

git switch main
git cherry-pick a1b2c3d    # apply just that commit's changes to main
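Conceptually, cherry-pick computes what a commit changed relative to its parent and applies that change to the current branch. A simplified sketch that works on whole files and ignores conflicts (Git itself performs a three-way merge here); snapshots are modelled as dicts from path to content:

```python
def diff(parent, commit):
    """Return paths whose content changed; None marks a deleted path."""
    paths = set(parent) | set(commit)
    return {p: commit.get(p) for p in paths if parent.get(p) != commit.get(p)}


def cherry_pick(target, parent, commit):
    """Apply the change introduced by `commit` (relative to `parent`) onto `target`."""
    result = dict(target)
    for path, content in diff(parent, commit).items():
        if content is None:
            result.pop(path, None)  # the commit deleted this file
        else:
            result[path] = content  # the commit added or modified this file
    return result


parent = {"app.py": "v1", "export.py": "v1"}
commit = {"app.py": "v1", "export.py": "v2"}  # the commit only touched export.py
main = {"app.py": "v9", "export.py": "v1"}
print(cherry_pick(main, parent, commit) == {"app.py": "v9", "export.py": "v2"})  # True
```

Only the picked commit's own change lands on main; earlier work on the source branch is not brought along.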

Understanding Git's data model - blobs, trees, commits, and the reflog - makes it far less mysterious. Branches are cheap pointers; commits are immutable; the reflog means almost nothing you have committed is truly lost. Build these mental models and the commands follow naturally.

Read Next:

Networking: IP, TCP, HTTP