Git & Version Control
Overview
Git is a distributed version control system that models history as a directed acyclic graph of immutable, content-addressed commits. Each commit records its parent commit(s), a root tree describing the files, author and committer details with timestamps, and a message, and is identified by a hash of that content (SHA-1 by default). Branching is nearly free: a branch is just a named pointer to a commit. Merging reconciles divergent histories; rebasing replays commits onto a new base to produce a linear history.
- Problem it solves: Coordinating changes from multiple developers without overwriting each other’s work, and tracking every change with author, timestamp, and message for auditing and rollback.
- Alternatives: Mercurial (similar model, different UX); SVN/Perforce (centralised, better for large binary assets); Fossil (includes issue tracker and wiki).
- Pros: Universally adopted; powerful branching model; GitHub/GitLab ecosystem; works offline; SHA-based integrity verification.
- Cons: History rewriting (rebase, amend) causes problems in shared branches; large binary files bloat repos (use Git LFS); steep learning curve for conflict resolution.
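The idea that a branch is just a named pointer can be made concrete with a toy model. The sketch below is not how Git is implemented; `Commit`, `commits`, `branches`, and the short IDs are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a commit never changes once created
class Commit:
    id: str          # hypothetical short ID; real Git uses a content hash
    parents: tuple   # IDs of parent commits; empty for the root commit
    message: str


# Toy history: c1 <- c2 <- c3
commits = {
    "c1": Commit("c1", (), "initial commit"),
    "c2": Commit("c2", ("c1",), "add login form"),
    "c3": Commit("c3", ("c2",), "validate email"),
}

# A branch is only a name mapped to a commit ID.
branches = {"main": "c3"}

# Creating a branch copies one ID; no commits are duplicated.
branches["feature/login"] = branches["main"]

# Committing on the new branch adds one object and moves one pointer.
commits["c4"] = Commit("c4", (branches["feature/login"],), "add logout")
branches["feature/login"] = "c4"


def history(commit_id):
    """Follow first parents back to the root, newest first."""
    out = []
    while commit_id is not None:
        out.append(commit_id)
        parents = commits[commit_id].parents
        commit_id = parents[0] if parents else None
    return out


print(history(branches["main"]))           # ['c3', 'c2', 'c1']
print(history(branches["feature/login"]))  # ['c4', 'c3', 'c2', 'c1']
```

Both branches share c1 through c3; only the pointer and the one new commit distinguish them.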
The Git Data Model
Git stores file contents and history as objects in .git/objects/. The three core object types are:
- Blob - the raw content of a file (no filename, no metadata)
- Tree - a directory listing mapping names to blob or tree SHA-1 hashes
- Commit - a pointer to a root tree, a pointer to parent commit(s), author/committer metadata, and a message
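This content-addressable structure means identical content is stored once across the entire history, and corruption is detectable: an object whose content no longer matches its hash fails verification when Git checks it (for example with git fsck).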
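Content addressing can be reproduced outside Git. The sketch below computes the ID of a blob the way Git does in a SHA-1 repository: it hashes a small header (`blob`, the byte length, a NUL byte) followed by the raw content. The function name `blob_id` is an assumption made for this example.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Blob object ID in a SHA-1 repository: hash of header plus content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


a = blob_id(b"hello\n")
b = blob_id(b"hello\n")   # the same bytes in another file or another commit
c = blob_id(b"hello!\n")  # one byte added

print(a)       # ce013625030ba8dba906f756967f9e9ca394464a
print(a == b)  # True: identical content always maps to the same object
print(a == c)  # False: any change produces a different ID
```

The first value should match what `git hash-object` reports for a file containing `hello` followed by a newline. Note that the filename plays no part in the hash, which is why a blob carries no name.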
The three zones of a Git project:
working tree (edit) → git add → staging area (index) → git commit → repository (.git/)
Core Commands
git init # create a new repo in current directory
git clone https://... # copy a remote repo locally
git status # show modified, staged, untracked files
git add file.py # stage a specific file
git add -p # interactively stage chunks (highly recommended)
git commit -m "feat: add user auth"
git push origin main # push local commits to remote
git pull # fetch + merge (or fetch + rebase with --rebase)
git fetch origin # download remote commits without merging
Branching and Merging
git branch feature/login # create branch
git switch feature/login # switch to it (newer, branch-focused replacement for git checkout)
git switch -c feature/login # or: create and switch in one command (replaces the two lines above)
git merge feature/login # merge into current branch
git rebase main # replay current branch commits on top of main
Fast-forward merge happens when the target branch has no new commits since the branch point - Git just moves the pointer forward, no merge commit needed.
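The fast-forward condition amounts to an ancestry check: the target's tip must be reachable from the source's tip. A minimal sketch, assuming a toy history stored as a dictionary (`parents`, `is_ancestor`, and `can_fast_forward` are hypothetical names, not Git internals):

```python
# Toy history: each commit ID maps to its list of parent IDs.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],   # main points here
    "D": ["C"],
    "E": ["D"],   # feature points here
}


def is_ancestor(ancestor, descendant):
    """True if `ancestor` is reachable from `descendant` via parent links."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return False


def can_fast_forward(target, source):
    """Merging `source` into `target` is a fast-forward if target is behind it."""
    return is_ancestor(target, source)


branches = {"main": "C", "feature": "E"}
if can_fast_forward(branches["main"], branches["feature"]):
    branches["main"] = branches["feature"]  # move the pointer, no new commit

print(branches["main"])  # E
```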
3-way merge is used when both branches have diverged - Git finds the common ancestor and combines the diffs, producing a merge commit with two parents.
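The three-way rule can be sketched at the level of whole files. This is a simplification: real Git also merges changes within a file and handles cases such as renames. The function name and the sample snapshots are assumptions for illustration.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots ({path: content}) against their common ancestor."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o          # both sides agree
        elif o == b:
            result = t          # only theirs changed this path
        elif t == b:
            result = o          # only ours changed this path
        else:
            conflicts.append(path)  # both changed it differently
            continue
        if result is not None:  # None means the path was deleted
            merged[path] = result
    return merged, conflicts


base = {"auth.py": "v1", "README": "v1", "old.py": "v1"}
ours = {"auth.py": "v2", "README": "v1", "old.py": "v1"}
theirs = {"auth.py": "v1", "README": "v1-theirs"}  # edited README, deleted old.py

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)     # {'README': 'v1-theirs', 'auth.py': 'v2'}
print(conflicts)  # []
```

The common ancestor is what lets the merge tell "unchanged on this side" apart from "changed on this side"; comparing only the two tips cannot make that distinction.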
Rebase vs merge: rebase rewrites commit history for a linear graph, making git log easier to read; merge preserves the true history of when and where work was done. Never rebase commits already pushed to a shared branch.
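Why rebasing counts as rewriting history follows from content addressing: a commit's ID depends on its parent, so replaying the same changes onto a new base yields new commits. The sketch below uses a toy ID that hashes only the parent ID and the message (real commits hash more fields); `commit_id`, `build`, and the base names are hypothetical.

```python
import hashlib


def commit_id(parent, message):
    """Toy commit ID: hash of the parent ID plus the message."""
    data = f"parent {parent}\n\n{message}".encode()
    return hashlib.sha1(data).hexdigest()


def build(base, messages):
    """Create a chain of commits on top of `base`; return their IDs in order."""
    ids, parent = [], base
    for message in messages:
        parent = commit_id(parent, message)
        ids.append(parent)
    return ids


work = ["add login form", "validate email"]

original = build("old-main-tip", work)  # the branch as first written
rebased = build("new-main-tip", work)   # the same changes replayed on a new base

# Same messages, same order, but every ID differs because the base differs.
print(set(original).isdisjoint(rebased))
```

Anyone who already based work on the original IDs now holds commits that no longer appear on the rebased branch, which is the practical reason behind the rule about shared branches.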
Merge Conflicts
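When both branches change the same lines, Git cannot resolve the difference automatically. It pauses the merge and writes both versions into the file between conflict markers: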
<<<<<<< HEAD
return user.email.lower()
=======
return user.email.strip().lower()
>>>>>>> feature/login
Edit the file to the correct version, remove the markers, then:
git add file.py
git commit # complete the merge
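A common mistake is committing a file that still contains markers. The sketch below is a simple pre-commit style check, assuming markers always start at the beginning of a line; the function name and the sample `contact` function are hypothetical. A line of `=======` used for another purpose (for example a heading underline) would be reported as a false positive.

```python
def find_conflict_markers(text):
    """Return (line_number, marker) pairs for leftover conflict markers."""
    markers = ("<<<<<<<", "=======", ">>>>>>>")
    found = []
    for number, line in enumerate(text.splitlines(), start=1):
        for marker in markers:
            if line.startswith(marker):
                found.append((number, marker))
    return found


unresolved = """def contact(user):
<<<<<<< HEAD
    return user.email.lower()
=======
    return user.email.strip().lower()
>>>>>>> feature/login
"""

resolved = """def contact(user):
    return user.email.strip().lower()
"""

print(find_conflict_markers(unresolved))
# [(2, '<<<<<<<'), (4, '======='), (6, '>>>>>>>')]
print(find_conflict_markers(resolved))
# []
```

Git offers a built-in check along the same lines: `git diff --check` warns about conflict markers introduced by a change.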
Stash, Tags, and Log
git stash # save dirty working tree without committing
git stash pop # restore most recent stash
git stash list # see all stashes
git tag v1.0.0 # lightweight tag
git tag -a v1.0.0 -m "Release 1.0.0" # annotated tag (preferred)
git push origin --tags # tags are not pushed by default
git log --oneline --graph --all # visual branch history
git log --author="Megha" --since="2 weeks ago"
The .gitignore File
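List patterns for files Git should not start tracking. The file only affects untracked files; anything already committed stays tracked until it is removed from the index.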
# Python
__pycache__/
*.pyc
.venv/
# Environment
.env
*.env.local
# Build output
dist/
build/
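To see how the patterns above select paths, here is a deliberately simplified matcher. It is an approximation, not Git's algorithm: it ignores negation (`!`), `**`, and anchored patterns, and treats a trailing `/` as "match a directory with this name at any depth". `PATTERNS` and `is_ignored` are names assumed for this example.

```python
from fnmatch import fnmatchcase

PATTERNS = ["__pycache__/", "*.pyc", ".venv/", ".env", "*.env.local", "dist/", "build/"]


def is_ignored(path, patterns=PATTERNS):
    """Simplified check: does any pattern match a component of `path`?"""
    parts = path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: compare against directory components only
            # (everything except the final file name).
            if any(fnmatchcase(part, pattern[:-1]) for part in parts[:-1]):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            return True
    return False


for path in ["src/main.py", "app/__pycache__/views.pyc", ".env",
             "web/.env.local", "build/lib/util.py", "docs/build.md"]:
    print(path, is_ignored(path))
```

In a real repository, `git check-ignore -v <path>` reports which pattern, if any, matches a given path.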
Undo Operations
# Undo the last commit but keep changes staged
git reset --soft HEAD~1
# Undo the last commit and unstage its changes, keeping them in the working tree (default mode)
git reset --mixed HEAD~1
# Undo the last commit and discard the changes entirely
git reset --hard HEAD~1
# Safely undo a commit by creating a new revert commit (safe for shared branches)
git revert HEAD
# Discard changes to a specific file in the working tree
git restore file.py
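The three reset modes differ only in how many of the three zones they rewind. A toy model, assuming each zone holds a single snapshot label (the `reset` function and the `C1`/`C2` labels are hypothetical, and uncommitted edits are not modelled):

```python
def reset(state, target, mode="mixed"):
    """Toy model of `git reset --<mode> <target>` on three snapshots."""
    new = dict(state)
    new["head"] = target          # every mode moves the branch tip
    if mode in ("mixed", "hard"):
        new["index"] = target     # staging area now matches the target
    if mode == "hard":
        new["worktree"] = target  # files on disk match the target too
    return new


# Right after committing C2, all three zones hold the same snapshot.
state = {"head": "C2", "index": "C2", "worktree": "C2"}

for mode in ("soft", "mixed", "hard"):
    print(mode, reset(state, "C1", mode))
# soft {'head': 'C1', 'index': 'C2', 'worktree': 'C2'}
# mixed {'head': 'C1', 'index': 'C1', 'worktree': 'C2'}
# hard {'head': 'C1', 'index': 'C1', 'worktree': 'C1'}
```

Read this way, `--soft` leaves the undone changes staged, `--mixed` leaves them as unstaged edits, and `--hard` leaves nothing behind.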
The reflog is your safety net - it records every position HEAD has been at:
git reflog # see the history of HEAD movements
git reset --hard HEAD@{3} # move back to where HEAD was 3 moves ago (discards uncommitted changes)
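The `HEAD@{n}` notation is easier to read with a model in mind: the reflog behaves like a list of past HEAD positions with the newest first. A minimal sketch, assuming hypothetical commit IDs `c1` to `c3` and helper names `move_head` and `head_at`:

```python
reflog = []  # newest entry first, like `git reflog` output


def move_head(commit, reason):
    """Record a new HEAD position at the front of the log."""
    reflog.insert(0, (commit, reason))


move_head("c1", "commit: initial commit")
move_head("c2", "commit: add export")
move_head("c3", "commit: fix typo")
move_head("c1", "reset: moving to c1")  # c2 and c3 are no longer on the branch


def head_at(n):
    """Commit that HEAD pointed to n moves ago, like HEAD@{n}."""
    return reflog[n][0]


print(head_at(0))  # c1 (where HEAD is now)
print(head_at(1))  # c3 (where HEAD was before the reset)
```

Because the entry for c3 is still in the log, the "lost" commits remain reachable and can be restored by resetting to that entry.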
Common Workflows
Feature branch workflow (most common):
git switch main && git pull
git switch -c feature/new-thing
# ... make commits ...
git push -u origin feature/new-thing
# open a pull request, get review, merge to main
Gitflow adds dedicated develop, release/*, and hotfix/* branches - good for software with versioned releases.
Trunk-based development keeps branches short-lived (hours to a day) and merges frequently to main, relying on feature flags for incomplete work. Favoured by high-velocity teams with strong CI.
Examples
Typical feature branch session:
git switch main && git pull origin main
git switch -c feat/user-export
# ... write code ...
git add -p # stage only relevant chunks
git commit -m "feat: export users to CSV"
git push -u origin feat/user-export
# open PR → review → merge
git switch main && git pull
git branch -d feat/user-export # clean up local branch
Fixing a bad commit with interactive rebase:
git log --oneline
# e4f5a6b fix: remove debug print <-- squash this into the feat commit below
# a1b2c3d feat: add export
# 9d8e7f6 feat: user profile page
git rebase -i HEAD~2
# In editor: change 'pick' to 'squash' on the fix commit
# Write the merged commit message, save, done
Cherry-pick - apply a specific commit to another branch:
git switch main
git cherry-pick a1b2c3d # apply just that commit's changes to main
Understanding Git’s data model - objects, trees, and the reflog - makes it far less mysterious. Branches are cheap pointers; commits are immutable; the reflog means almost nothing is truly lost. Build these mental models and the commands follow naturally.