The Object in the Machine
An excavation of Git's hidden filesystem
Git stores everything as objects named by their content. This post covers the three object types behind every snapshot (blobs, trees, commits) and how they chain together to form Git’s immutable history.
Create a repository, add a file, and commit it. Then look in .git/objects/. You’ll find a handful of two-character directories holding files with 38-character hex names; each pair spells out a 40-character hash. Those are your objects.
Every object’s name is derived from its content. Run this:
$ echo "hello" | git hash-object --stdin
ce013625030ba8dba906f756967f9e9ca394464a
Run it again and you get the same output. Anyone who hashes the same bytes, on any machine anywhere, gets the same 40-character name. The hash is a fingerprint of the content: change one byte and the name changes, keep the bytes and the name stays the same. This is the idea everything else is built on.
So what is in .git/objects/?
What Lives in .git/objects/
Here is what that directory looks like after a first commit:
.git/
  objects/
    3b/18e512dba79e4c8300dd08aeb37f8e728b8dad
    e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    ab/c123...
These are your objects, sharded into directories named after the first two hex characters of their hash. The remaining 38 characters are the filename. This is a content-addressable filesystem — ask for a hash, get the content back.
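The layout can be imitated in a few lines. The following is a minimal sketch, assuming a default SHA-1 repository and loose (unpacked) objects, which Git stores zlib-compressed with a short type-and-size header in front of the content. The helper names `loose_object_path`, `store`, and `load` are made up for illustration and are not part of Git.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def loose_object_path(objects_dir: Path, sha: str) -> Path:
    """Map a 40-character hash to <objects>/<first 2 chars>/<remaining 38>."""
    return objects_dir / sha[:2] / sha[2:]


def store(objects_dir: Path, kind: str, body: bytes) -> str:
    """Write one loose object and return its hash (hypothetical helper)."""
    raw = f"{kind} {len(body)}".encode("ascii") + b"\x00" + body
    sha = hashlib.sha1(raw).hexdigest()
    path = loose_object_path(objects_dir, sha)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(raw))
    return sha


def load(objects_dir: Path, sha: str) -> bytes:
    """Ask for a hash, get the content back (header stripped)."""
    raw = zlib.decompress(loose_object_path(objects_dir, sha).read_bytes())
    _header, _, body = raw.partition(b"\x00")
    return body


with tempfile.TemporaryDirectory() as tmp:
    objects = Path(tmp) / "objects"
    sha = store(objects, "blob", b"hello world\n")
    stored_at = loose_object_path(objects, sha).relative_to(objects).as_posix()
    round_trip = load(objects, sha)
    print(stored_at)
```

The printed path is the same `3b/18e5...` entry that appears in the listing, because the directory name and filename are simply the hash split after its second character.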
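Git has four kinds of objects, and three of them do the everyday work. A blob stores file content. A tree describes a directory. A commit records a snapshot. The fourth, the annotated tag, is a named pointer to another object and is not covered here.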
Blobs
A blob stores only the bytes of a file: not the name, not the permissions, not the timestamp. Just the content, line endings included.
$ echo "hello world" > README.md
$ git hash-object README.md
3b18e512dba79e4c8300dd08aeb37f8e728b8dad
Now rename it:
$ mv README.md NOTE.md
$ git hash-object NOTE.md
3b18e512dba79e4c8300dd08aeb37f8e728b8dad
Same hash. The content did not change, so the hash could not change. The name is irrelevant. A blob is pure bytes with no filename attached.
Peek at what Git actually stored:
$ git cat-file -p 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
hello world
That’s it. No metadata, no filename, no timestamp. A blob is pure content.
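To see why a rename cannot change the hash, it helps to reproduce the computation. This is a sketch assuming a SHA-1 repository, where the blob name is the SHA-1 of a small header (`blob`, a space, the byte length, a NUL byte) followed by the content. The function name `git_blob_hash` is invented for this example.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Hash bytes the way `git hash-object` does for a blob in a SHA-1 repository."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# The filename never enters the computation, so renaming cannot change the hash.
files = {"README.md": b"hello world\n", "NOTE.md": b"hello world\n"}
hashes = {name: git_blob_hash(data) for name, data in files.items()}

print(hashes["README.md"])
print(hashes["README.md"] == hashes["NOTE.md"])
```

Because of the header, the result differs from a plain SHA-1 of the same bytes, which is why `sha1sum README.md` does not print the name Git uses.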
Trees
Files live in named directories. A tree maps names to hashes:
100644 blob 3b18e51 README.md
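040000 tree a1b2c34 src
Each entry gives a file mode, the type of object it points to, that object’s hash, and the name. Git records only a handful of modes (regular file, executable, symlink, directory, submodule), not full Unix permissions. Taken together, the entries are a directory listing.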
The tree itself becomes an object. Its hash comes from its own content — the sorted list of entries. Rename a file inside a tree and the tree’s hash changes. Change a file’s content (which changes its blob hash) and the tree’s hash changes. The tree is a commitment to a particular arrangement of names and content.
You can inspect any tree:
$ git ls-tree --abbrev=7 HEAD
100644 blob 3b18e51	README.md
040000 tree a1b2c34	src
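The dependence of a tree’s hash on its entries can be sketched directly. The code below assumes the SHA-1 object format, in which each entry is stored as the mode, a space, the name, a NUL byte, and the raw 20-byte hash, with directory modes written as `40000` (the leading zero only appears in printed listings). The helpers `git_object_hash` and `tree_body` are hypothetical names, not Git APIs.

```python
import hashlib


def git_object_hash(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <size>\\0' followed by the object body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize (mode, name, hex_hash) entries in the order a tree stores them."""
    def sort_key(entry):
        mode, name, _ = entry
        # Git sorts directory names as if they ended in "/".
        return name.encode() + (b"/" if mode == "40000" else b"")

    body = b""
    for mode, name, hex_hash in sorted(entries, key=sort_key):
        body += mode.encode() + b" " + name.encode() + b"\x00" + bytes.fromhex(hex_hash)
    return body


readme = git_object_hash("blob", b"hello world\n")
empty_tree = git_object_hash("tree", tree_body([]))

before = git_object_hash("tree", tree_body([("100644", "README.md", readme)]))
renamed = git_object_hash("tree", tree_body([("100644", "NOTE.md", readme)]))
edited = git_object_hash(
    "tree",
    tree_body([("100644", "README.md", git_object_hash("blob", b"hello again\n"))]),
)

print(empty_tree)
print(before != renamed, before != edited)
```

Renaming the entry and editing the file both produce a different tree hash, while the blob for the unchanged content keeps its name.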
Commits
A commit is a small text object that records the state of the project at a point in time.
tree a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0
parent c9d0e1f2a3b4c5d6e7f8a9b0a1b2c3d4e5f6a7b8
author Bernard <adjanour@icloud.com> 1747000000 +0000
committer Bernard <adjanour@icloud.com> 1747000000 +0000

Add documentation for the config parser
The commit points to a tree. It points to zero or more parents. It records the author, the committer, a timestamp, and a message. Hash everything together and you get the commit’s identifier.
Every commit stores references to its parent commits. Follow those backwards and you reach the root — the first commit, which has no parent.
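A small model makes the parent links concrete. This sketch assumes the SHA-1 commit format (header lines, a blank line, then the message) and uses an in-memory dictionary in place of the object store. The identity string, the `commit` helper, and `walk_to_root` are all made up for illustration.

```python
import hashlib


def git_object_hash(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <size>\\0' followed by the object body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree, parents, author, committer, message):
    """Lay out a commit as text: headers, a blank line, then the message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return ("\n".join(lines) + "\n").encode()


EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # hash of a tree with no entries
WHO = "A U Thor <author@example.com> 1747000000 +0000"   # hypothetical identity

store = {}  # commit hash -> list of parent hashes (stand-in for the object store)


def commit(tree, parents, message):
    sha = git_object_hash("commit", commit_body(tree, parents, WHO, WHO, message))
    store[sha] = list(parents)
    return sha


def walk_to_root(sha):
    """Follow first-parent links until a commit with no parent is reached."""
    path = [sha]
    while store[path[-1]]:
        path.append(store[path[-1]][0])
    return path


root = commit(EMPTY_TREE, [], "Initial commit")
second = commit(EMPTY_TREE, [root], "Second commit")
tip = commit(EMPTY_TREE, [second], "Third commit")

history = walk_to_root(tip)
print(len(history), history[-1] == root)
```

A merge commit would simply list more than one `parent` line; the walk here follows only the first.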
The Chain Reaction
Every object’s hash depends on everything inside it. Objects reference each other by hash. So changing anything forces every object after it to get a new hash.
Walk through what happens when you change a file in an old commit:
The content changes, so Git writes a new blob with a new hash. The tree that listed the old blob needs an entry pointing at the new one, which changes the tree’s content and therefore its hash. The same goes for every parent directory up to the root tree. The commit that pointed to the old root tree must now name the new one, so it becomes a new commit with a new hash. The next commit in the chain named the old commit as its parent. To follow the rewritten commit it has to be rewritten with the new parent hash, which changes its own hash too. And so on, all the way to the tip of the branch.
Git does not edit the old objects. It writes new ones alongside them. The old chain still sits in .git/objects/ until garbage collection eventually prunes it, but no branch points to it anymore.
This chain is what lets two repositories verify identical history by comparing a single 20-byte hash. If your latest commit hash matches mine, then every blob, tree, commit, parent pointer, timestamp, and message in the entire chain must match too. Change anything at any point in history and the tip hash changes.
It is also what makes tampering detectable. Rewrite an old commit and every hash after it shifts. The original tip hash no longer resolves to anything in the rewritten chain. Anyone who saved the original hash can compare and see the divergence.
Here is what a chain looks like for a repo with three commits:
$ git log --oneline
a1b2c3d Add documentation
d4e5f6a Implement config parser
b7c8d9e Initial commit
Each commit stores a pointer to its tree and its parent:
a1b2c3d → tree:xyz, parent:d4e5f6a
d4e5f6a → tree:uvw, parent:b7c8d9e
b7c8d9e → tree:rst, parent:(none)
Change something in b7c8d9e’s tree and that tree gets a new hash, so b7c8d9e itself must be rewritten under a new commit hash. d4e5f6a’s parent pointer then has to name the new commit, which gives d4e5f6a a new hash, and a1b2c3d follows for the same reason. The whole chain shifts. This structure has a name: a Merkle DAG. Git’s history is tamper-evident by design.
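The cascade is easy to reproduce in miniature. The sketch below is a simplified model rather than Git’s implementation: each snapshot holds a single file, and the commit body leaves out the author and committer lines. The names `object_hash` and `build_chain` are hypothetical.

```python
import hashlib


def object_hash(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <size>\\0' followed by the object body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def build_chain(snapshots):
    """Return commit hashes, oldest first, for (file_content, message) pairs."""
    hashes, parent = [], None
    for content, message in snapshots:
        blob = object_hash("blob", content)
        # One-entry tree: "<mode> <name>\0" followed by the raw 20-byte blob hash.
        tree = object_hash("tree", b"100644 README.md\x00" + bytes.fromhex(blob))
        body = f"tree {tree}\n"
        if parent:
            body += f"parent {parent}\n"
        # Simplified: author and committer lines are omitted in this sketch.
        body += f"\n{message}\n"
        parent = object_hash("commit", body.encode())
        hashes.append(parent)
    return hashes


snapshots = [
    (b"v1\n", "Initial commit"),
    (b"v2\n", "Implement config parser"),
    (b"v3\n", "Add documentation"),
]
original = build_chain(snapshots)
same_copy = build_chain(list(snapshots))

# Change only the oldest snapshot and rebuild.
tampered = build_chain([(b"v1 (edited)\n", "Initial commit")] + snapshots[1:])

print([a != b for a, b in zip(original, tampered)])
print(original[-1] == same_copy[-1])
```

Only the first snapshot was edited, yet all three commit hashes differ, while an untouched copy agrees on the tip and therefore on everything behind it.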
Plumbing
Git’s user-friendly commands are called porcelain. The low-level commands — the ones that actually talk to the object store — are called plumbing. Once you understand the object model, the plumbing is straightforward:
| Command | What it does |
|---|---|
| git hash-object | Hash a file and optionally store the blob |
| git cat-file -p | Print the contents of any object |
| git ls-tree | List entries in a tree |
| git write-tree | Create a tree object from the staging area |
| git commit-tree | Create a commit object from a tree |
| git update-ref | Move a branch pointer to a different commit |
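Porcelain commands do the same work as sequences of plumbing commands. git add does what hash-object -w and update-index do. git commit does what write-tree, commit-tree, and update-ref do. There is no magic: early versions of Git shipped much of the porcelain as scripts that strung the low-level commands together, and although most of it is now built in, the operations are the same.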
You can prove this. Here is a commit built without git add or git commit:
$ echo "hello" | git hash-object -w --stdin    # write the blob
ce013625030ba8dba906f756967f9e9ca394464a
$ git update-index --add --cacheinfo 100644 \
    ce013625030ba8dba906f756967f9e9ca394464a hello.txt    # stage it under a name
$ tree_hash=$(git write-tree)    # snapshot the index as a tree
$ commit_hash=$(echo "my commit" | git commit-tree "$tree_hash")
$ git update-ref refs/heads/main "$commit_hash"    # point the branch at the commit
$ git log --oneline
a1b2c3d my commit
No git add. No git commit. Yet the commit exists, the tree exists, the blob exists. Git does not care how the objects got there. It only cares that they are there, named by what they contain.
What This Means
Content addressing gives every object a universal name. Two people on different continents who create the same file get the same hash. The same holds for trees and commits: identical entries, identical metadata, identical hash.
Git does not store files; it stores objects. A filename is just an entry in a tree. A file’s content is its true identity. This is why Git detects renames, deduplicates identical blobs, and lets any two clones verify their entire history with a single hash.
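The rename and deduplication claims follow from comparing hashes rather than names. Here is a minimal sketch that models a tree as a plain name-to-hash dictionary and pairs up names whose content is byte-for-byte identical. Git’s real rename detection also scores files that are merely similar, which this example does not attempt, and `blob_hash` and `exact_renames` are invented names.

```python
import hashlib


def blob_hash(content: bytes) -> str:
    """Blob name in a SHA-1 repository: SHA-1 of 'blob <size>\\0' plus content."""
    header = f"blob {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def exact_renames(old_tree: dict, new_tree: dict) -> list:
    """Pair names that disappeared with names that appeared under the same blob hash."""
    removed = {name: sha for name, sha in old_tree.items() if name not in new_tree}
    added = {name: sha for name, sha in new_tree.items() if name not in old_tree}
    by_hash = {sha: name for name, sha in removed.items()}
    return sorted((by_hash[sha], name) for name, sha in added.items() if sha in by_hash)


old = {
    "README.md": blob_hash(b"hello world\n"),
    "LICENSE": blob_hash(b"MIT\n"),
}
new = {
    "NOTE.md": blob_hash(b"hello world\n"),   # README.md renamed
    "LICENSE": blob_hash(b"MIT\n"),
    "COPY.md": blob_hash(b"MIT\n"),           # second file with identical content
}

renames = exact_renames(old, new)
distinct_blobs = len(set(new.values()))
print(renames)
print(distinct_blobs)
```

The new tree lists three names but needs only two blobs, since `LICENSE` and `COPY.md` resolve to the same object.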
When you commit, your files become blobs, your directory tree becomes a tree object, and a small text object — the commit — ties them all together with the tree hash, the parent hash, your name, and your message. A branch pointer moves forward one step. Nothing is ever edited; everything is added.
That’s the object model: three core object types, content-addressed storage, and a chain of commits tying everything together. It is not complicated, but it is the foundation everything else is built on.
Further Reading
- Pro Git Book — The official Git book. Chapters 10 and 7 cover the object model and plumbing commands in detail.
- Git Internals — Git Objects — The relevant chapter from Pro Git, focused entirely on the object store.
- Git Reference Documentation — Every command documented, including all plumbing commands.
- Think Like (a) Git — A practical guide to Git’s mental model, useful for building intuition about the DAG.