TL;DR
- Git is a content‑addressed database. Everything (files, folders, commits, tags) is an object addressed by a hash of its content (SHA‑1 by default; SHA‑256 in repositories created with the newer opt‑in object format).
- A commit points to a tree (snapshot of the repo) plus parent commit(s) and metadata (author, message). Trees point to blobs (file contents) and other trees (subfolders).
- Branches/HEAD are just pointers (refs) to commits. Moving a pointer (fast‑forward, reset) changes which history you see without touching any stored object; rebase also moves a pointer, but only after writing new commits.
- The three areas explain every “weird” Git moment: HEAD (last committed snapshot), index (staging area), working tree (your files).
- “Dangerous” commands are predictable when you think in objects: merge creates a commit with 2 parents, rebase copies commits on top of another base, cherry‑pick copies one commit, reset moves pointers and optionally index/working tree.
The mental model (60 seconds)
commit (hash: c3) ──┐
  tree ─────────────┼─▶ blobs/trees (your files at that commit)
  parent: c2        └─▶ metadata (author, message, timestamp)

c1 ──▶ c2 ──▶ c3    (a simple linear history)
              ^
              ref "main" points here (e.g., c3); HEAD points to "main"
- A commit is a Merkle node: its hash depends on its tree hash and parent hash(es). Change any file → new tree and new commit hash.
- Refs (e.g., refs/heads/main) are tiny text files containing a hash. HEAD is usually a symbolic ref pointing at a branch (ref: refs/heads/main).
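The pointer chain can be read straight off disk. The following is a minimal sketch, assuming a throwaway repository created with mktemp, the default file-based ref storage, and a ref that has not yet been packed; the file name and the "Demo" identity are placeholders.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"                # throwaway repo, path is arbitrary
git init -q
git symbolic-ref HEAD refs/heads/main          # force a predictable branch name
git config user.name "Demo" && git config user.email "demo@example.com"
echo "one" > file.txt && git add file.txt && git commit -q -m "c1"

head_target=$(git symbolic-ref HEAD)           # the branch HEAD points at
main_hash=$(git rev-parse main)                # the commit that branch points at
ref_file=$(cat .git/refs/heads/main)           # the same hash, read as plain text

cat .git/HEAD                                  # prints: ref: refs/heads/main
echo "$head_target -> $main_hash"
```

After git gc or git pack-refs the loose file may disappear into .git/packed-refs, so git rev-parse is the reliable way to resolve a ref.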
The objects: blobs, trees, commits (and tags)
| Type | What it stores | You can peek with |
|---|---|---|
| blob | Raw file content (no filename) | git cat-file -p <blob> |
| tree | Directory entries (name, mode, type, hash) | git ls-tree <tree> |
| commit | Pointer to a tree + parent(s) + metadata | git cat-file -p <commit> |
| tag | (Annotated tag) message + signature + target object | git cat-file -p <tag> |
Explore your repo
git rev-parse HEAD # the current commit id
git cat-file -p HEAD # see commit: tree + parent + message
git ls-tree -r --name-only HEAD # list files in the commit snapshot
Objects live under .git/objects/ (loose) or in packfiles (.git/objects/pack/*.pack) after GC/clone to save space.
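Content addressing is easy to check by hand. In this sketch (file names are hypothetical), two files with identical bytes hash to the same blob id regardless of their names, while different bytes yield a different id.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
git init -q

printf 'same bytes\n'  > a.txt
printf 'same bytes\n'  > renamed-copy.txt
printf 'other bytes\n' > b.txt

id_a=$(git hash-object -w a.txt)               # -w also stores the blob in .git/objects
id_copy=$(git hash-object -w renamed-copy.txt)
id_b=$(git hash-object -w b.txt)

echo "a.txt            $id_a"
echo "renamed-copy.txt $id_copy"
echo "b.txt            $id_b"

git cat-file -t "$id_a"                        # prints: blob
git cat-file -p "$id_a"                        # prints: same bytes
```

The blob carries no filename; names only appear in the tree that references it.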
The three areas (a.k.a. why “it didn’t commit!”)
HEAD (last commit)  ↔  INDEX (staged)  ↔  WORKING TREE (files)
   git checkout           git add            edit files
- Working tree: your current files.
- Index/staging area: what will go into the next commit.
- HEAD: what’s in the current commit.
Common operations:
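- git add: copy changes from working tree → index.
- git commit: write a tree object from the index, then a commit pointing to it.
- git restore --staged path: reset the index entry to HEAD (unstage; your edits stay in the file).
- git restore path: restore the file from the index to the working tree, discarding unstaged edits (add --source=HEAD to restore from HEAD instead).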
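The three areas can be watched one step at a time. This sketch assumes Git 2.23 or newer (for git restore) and uses a hypothetical notes.txt; git diff compares working tree to index, and git diff --cached compares index to HEAD.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "v1" > notes.txt && git add notes.txt && git commit -q -m "initial"

echo "v2" > notes.txt                            # edit: only the working tree changes
unstaged_before=$(git diff --name-only)          # working tree vs index
staged_before=$(git diff --cached --name-only)   # index vs HEAD

git add notes.txt                                # copy the change into the index
unstaged_after_add=$(git diff --name-only)
staged_after_add=$(git diff --cached --name-only)

git restore --staged notes.txt                   # unstage; the edit stays in the file
staged_after_restore=$(git diff --cached --name-only)

echo "before add:    unstaged=[$unstaged_before] staged=[$staged_before]"
echo "after add:     unstaged=[$unstaged_after_add] staged=[$staged_after_add]"
echo "after unstage: staged=[$staged_after_restore] file=$(cat notes.txt)"
```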
What a commit really contains
Example (abridged):
$ git cat-file -p HEAD
tree a9f3e2...
parent 7c1b6d...
author You <[email protected]> 1693920000 +0200
committer You <[email protected]> 1693920000 +0200

Add feature X
Now inspect the tree:
$ git ls-tree a9f3e2
100644 blob 3b18e5 README.md
040000 tree 8c2a1a src
Each tree entry lists a mode (regular file, executable, or subtree), an object type, the object's hash, and a name.
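One way to convince yourself that a commit is only "tree + parents + metadata" is to change the metadata alone. In this sketch (throwaway repo, placeholder file and identity), amending the message produces a new commit id while the tree id stays the same.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "hello" > README.md && git add README.md && git commit -q -m "Add feature X"

commit_1=$(git rev-parse HEAD)
tree_1=$(git rev-parse 'HEAD^{tree}')            # the snapshot this commit points at

git commit -q --amend -m "Add feature X (reworded)"   # same files, new message

commit_2=$(git rev-parse HEAD)
tree_2=$(git rev-parse 'HEAD^{tree}')

echo "commit: $commit_1 -> $commit_2"            # differs: metadata changed
echo "tree:   $tree_1 -> $tree_2"                # identical: content did not change
```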
Branches & HEAD (moving pointers)
- A branch is just a ref pointing to a commit. New commits advance the branch pointer.
- Fast‑forward merge: move the pointer forward to the other commit (no new commit).
- Merge commit: create a new commit with two parents when histories diverged.
Before merge:
main:     A ── B
feature:   ╲
            C
After merge commit M:
A ── B ── M
 ╲       ╱
  C ────       (M has parents B and C)
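Both kinds of merge can be reproduced in a scratch repository. The sketch below rebuilds the A/B/C shape from the diagram, then adds a hypothetical topic branch to show a fast-forward; branch and file names are illustrative only.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo" && git config user.email "demo@example.com"

echo "a" > a.txt && git add a.txt && git commit -q -m "A"
git branch feature                               # feature starts at A
echo "b" > b.txt && git add b.txt && git commit -q -m "B"    # main:    A - B
git checkout -q feature
echo "c" > c.txt && git add c.txt && git commit -q -m "C"    # feature: A - C
git checkout -q main

git merge -q --no-edit feature                   # histories diverged: writes merge commit M
parents=$(git rev-list --parents -n 1 HEAD)      # the commit id followed by its parent ids
parent_count=$(( $(echo "$parents" | wc -w) - 1 ))
echo "merge commit has $parent_count parents"

git checkout -q -b topic                         # topic starts at M
echo "d" > d.txt && git add d.txt && git commit -q -m "D"
git checkout -q main
before=$(git rev-parse main)
git merge -q --ff-only topic                     # main is an ancestor of topic: the ref just moves
echo "fast-forward moved main from $before to $(git rev-parse main)"
```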
Detached HEAD:
git checkout <commit> # HEAD now points directly to a commit, not a branch
Create a branch to keep work:
git switch -c experiment
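Whether HEAD is attached can be tested with git symbolic-ref, which fails when HEAD holds a raw hash. A small sketch, assuming Git 2.23 or newer for git switch and a placeholder file name:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo" && git config user.email "demo@example.com"
echo "1" > f.txt && git add f.txt && git commit -q -m "c1"
echo "2" > f.txt && git commit -q -am "c2"

git checkout -q --detach HEAD~1                  # HEAD now stores a commit hash directly
if git symbolic-ref -q HEAD >/dev/null; then state="attached"; else state="detached"; fi
echo "HEAD is $state at $(git rev-parse --short HEAD)"

git switch -q -c experiment                      # name the spot so new commits are not orphaned
echo "HEAD now points at $(git symbolic-ref HEAD)"
```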
Rebase vs merge vs cherry‑pick (copying commits)
- Merge: keeps both histories; one new commit ties them together.
- Rebase: copy commits onto a new base (new hashes). Linear history, new commit ids.
- Cherry‑pick: copy one commit onto the current branch.
Rebase (concept):
feature: C1 ─ C2                main: A ─ B ─ D
    │                                         ▲
    │   git rebase main    (becomes)          │
    ▼                                         │
feature: C1' ─ C2'   (same changes, new base → new hashes)
Because rebase rewrites history, updating a branch you have already pushed usually requires git push --force-with-lease, which refuses to overwrite the remote branch if it has moved since your last fetch.
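"Same changes, new hashes" can be verified locally. This sketch rebases a one-commit feature branch and compares the old and new commit with git patch-id, which fingerprints the diff rather than the commit; the repository layout is a made-up minimal case.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo" && git config user.email "demo@example.com"

echo "a" > a.txt && git add a.txt && git commit -q -m "A"
git checkout -q -b feature
echo "c1" > c1.txt && git add c1.txt && git commit -q -m "C1"
git checkout -q main
echo "d" > d.txt && git add d.txt && git commit -q -m "D"    # main moves on

old_id=$(git rev-parse feature)                  # C1, whose parent is A
git checkout -q feature
git rebase -q main                               # replay C1 on top of D
new_id=$(git rev-parse feature)                  # C1', a different object
new_parent=$(git rev-parse feature~1)

# The diff is unchanged even though the commit id is not.
old_patch=$(git show "$old_id" | git patch-id --stable | cut -d' ' -f1)
new_patch=$(git show "$new_id" | git patch-id --stable | cut -d' ' -f1)

echo "C1  = $old_id"
echo "C1' = $new_id (parent: $new_parent)"
```

The old commit is still in the object database after the rebase; it is simply no longer reachable from the branch.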
git reset demystified (what moves where)
| Command | Moves branch? | Moves index? | Moves working tree? | Use it for |
|---|---|---|---|---|
| git reset --soft X | ✅ to X | ❌ | ❌ | “Undo last commit but keep changes staged” |
| git reset --mixed X (default) | ✅ | ✅ (to X) | ❌ | “Unstage changes; keep edits” |
| git reset --hard X | ✅ | ✅ | ✅ | Discard everything and go to X |
X can be anything that resolves to a commit: a hash, a branch name, or a relative expression such as HEAD~1.
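The table can be checked row by row in a scratch repository. The sketch below (hypothetical file names) applies each mode in turn and records what the index and working tree look like afterwards.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "base"  > base.txt  && git add base.txt  && git commit -q -m "base"
echo "extra" > extra.txt && git add extra.txt && git commit -q -m "extra"
tip=$(git rev-parse HEAD)

git reset -q --soft HEAD~1                       # branch moves back; index and files untouched
soft_staged=$(git diff --cached --name-only)     # extra.txt is still staged

git reset -q --mixed HEAD                        # index reset to HEAD; file stays on disk
mixed_staged=$(git diff --cached --name-only)
mixed_status=$(git status --porcelain)           # extra.txt now shows as untracked

git reset -q --hard "$tip"                       # return to the original tip
git reset -q --hard HEAD~1                       # branch, index and working tree all move
if [ -e extra.txt ]; then hard_file="present"; else hard_file="gone"; fi

echo "soft:  staged=[$soft_staged]"
echo "mixed: staged=[$mixed_staged] status=[$mixed_status]"
echo "hard:  extra.txt is $hard_file"
```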
Merges & conflicts (index stages)
When Git can’t auto‑merge a file, it records the conflict in the index as up to three numbered stages of that path:
- Stage 1: base version (common ancestor)
- Stage 2: --ours (current branch)
- Stage 3: --theirs (incoming branch)
Resolve, then:
git add <file>
git commit # completes the merge
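The stages are visible while a merge is stopped. This sketch forces a conflict on a hypothetical f.txt, reads each stage with the :N:path syntax, and then finishes the merge.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo" && git config user.email "demo@example.com"

echo "base" > f.txt && git add f.txt && git commit -q -m "base"
git checkout -q -b feature
echo "theirs" > f.txt && git commit -q -am "feature edit"
git checkout -q main
echo "ours" > f.txt && git commit -q -am "main edit"

git merge --no-edit feature >/dev/null 2>&1 || true   # expected to stop on the conflict
git ls-files -u                                  # one line per stage: mode, blob id, stage, path
stage1=$(git show :1:f.txt)                      # base
stage2=$(git show :2:f.txt)                      # ours
stage3=$(git show :3:f.txt)                      # theirs
merge_head=$(cat .git/MERGE_HEAD)                # the commit being merged in

echo "resolved" > f.txt
git add f.txt                                    # collapses the stages back to a single entry
git commit -q --no-edit                          # completes the merge
```

MERGE_HEAD exists only while the merge is in progress; it is removed once the merge commit is written.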
Plumbing: build a commit by hand (tiny demo)
echo "hello" > hello.txt
git hash-object -w hello.txt                 # writes a blob, prints its id
git update-index --add hello.txt             # stage it (index)
tree_id=$(git write-tree)                    # write tree from index
parent=$(git rev-parse --verify HEAD 2>/dev/null || true)   # empty in a brand-new repo
commit_id=$(echo "first commit" | git commit-tree "$tree_id" ${parent:+-p $parent})
# commit-tree prints the new commit id; point a branch at it:
git update-ref refs/heads/main "$commit_id"
This is what git add + git commit automate for you.
Where space savings come from
- Git stores snapshots, but uses delta compression in packfiles, so repeated content across versions is efficient.
- git gc repacks loose objects; git count-objects -vH shows size.
- Renames are detected heuristically; identical content → identical blob hash.
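The move from loose objects to a packfile can be observed with git count-objects. A rough sketch, assuming a fresh repository with three small commits; the awk field extraction relies on the "count:" and "in-pack:" lines that git count-objects -v prints.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"

for i in 1 2 3; do
  echo "line $i" >> log.txt
  git add log.txt && git commit -q -m "commit $i"
done

loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')
git gc -q                                        # repack reachable objects into a packfile
loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packed_after=$(git count-objects -v | awk '/^in-pack:/ {print $2}')

echo "loose objects: $loose_before -> $loose_after"
echo "objects in packs: $packed_after"
```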
Safety nets: reflog & friends
- Reflog records where refs/HEAD pointed recently. If you lost a branch by resetting/force‑pushing locally:
git reflog
git checkout -b rescue <old-hash-from-reflog>
- git fsck --lost-found can locate orphaned objects.
- Many commands accept the @{-1} syntax (previous branch).
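A recovery can be rehearsed safely in a scratch repository. This sketch throws away a commit with reset --hard and restores it through the reflog; it uses git branch with HEAD@{1} rather than copying a hash by hand, and the file name is a placeholder.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "1" > f.txt && git add f.txt && git commit -q -m "keep"
echo "2" > f.txt && git commit -q -am "about to be lost"
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~1                       # the branch no longer reaches $lost
git reflog -n 3                                  # newest first; HEAD@{1} is the pre-reset position
git branch rescue 'HEAD@{1}'                     # put a name back on the lost commit

echo "rescued $(git rev-parse --short rescue)"
```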
Quick mapping: commands → data structure moves
| You run… | Under the hood it… |
|---|---|
| git add . | updates index entries (staged snapshot) |
| git commit | writes a tree from index; writes a commit pointing to that tree (+ parent) |
| git merge | computes new tree; writes a merge commit with 2 parents (or, on a fast‑forward, only moves the branch ref) |
| git rebase | copies commits, writing new ones on a new base; moves branch ref |
| git checkout/switch | moves HEAD (and updates index/working tree to match) |
| git tag -a v1 | writes a tag object pointing at a commit + message/signature |
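The last row is easy to confirm: an annotated tag is its own object, whereas a lightweight tag is only a ref. In this sketch the tag names and message are made up for illustration.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo "1" > f.txt && git add f.txt && git commit -q -m "release candidate"

git tag -a v1 -m "first release"                 # annotated: writes a tag object
git tag v1-light                                 # lightweight: just a ref to the commit

type_annotated=$(git cat-file -t v1)             # tag
type_light=$(git cat-file -t v1-light)           # commit
target=$(git rev-parse 'v1^{commit}')            # peel the tag to the commit it names

git cat-file -p v1                               # object, type, tag, tagger, message
echo "v1 is a $type_annotated pointing at $target; v1-light is a $type_light"
```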
Quick checklist (become unflappable)
- [ ] Think snapshots, not diffs: commits point to trees.
- [ ] Remember the three areas: HEAD ↔ index ↔ working tree.
- [ ] Branches are pointers; moving them is cheap and reversible (reflog).
- [ ] Rebase/cherry‑pick copy commits (new ids). Merge adds a tie‑point.
- [ ] Use git cat-file, git ls-tree, and git reflog to debug anything scary.
One‑minute hands‑on plan
- Run git cat-file -p HEAD and git ls-tree -r --name-only HEAD.
- Make a tiny change; inspect git diff (working tree) → git add → git diff --cached (index).
- Commit; run git log --graph --oneline --decorate to see pointers move.
- Try git switch -c demo, add a commit, add another commit on main, then run git rebase main from demo to watch hashes change.
- Create a conflict on purpose; while the merge is stopped, inspect .git/MERGE_HEAD and the index stages (git ls-files -u); then resolve and complete the merge.