Git under the hood (minimal)

2025-10-11   blogpage sketch


Git is a distributed, content-addressable object database optimized for tracking filesystem snapshots. Its objects are immutable; only references change.


Core ideas:

  • Everything is identified by a cryptographic hash (SHA-1 or SHA-256)

  • Data is immutable: new content ⇒ new hash ⇒ new object

  • Each commit points to a complete tree snapshot, not a diff

  • History forms a Merkle DAG of commits
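
The first two ideas can be sketched in a few lines of Python. This is a minimal sketch, not Git's own code: the helper name object_id is made up for illustration, and it assumes the SHA-1 object format, where the hash covers a short header followed by the content.

```python
import hashlib

def object_id(kind: str, content: bytes) -> str:
    """Hypothetical helper: id of an object in a SHA-1 repository."""
    # Git hashes "<type> <size in bytes>\0" followed by the content.
    header = f"{kind} {len(content)}".encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

a = object_id("blob", b"hello\n")
b = object_id("blob", b"hello\n")
c = object_id("blob", b"hello!\n")

print(a)        # 40 hex characters
print(a == b)   # True: same content, same id
print(a == c)   # False: new content means a new id, hence a new object
```

In a SHA-1 repository, printf 'hello\n' | git hash-object --stdin should print the same id as the first line.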


Main object types:

  • blob → raw file content

  • tree → directory (maps filenames to blob/tree hashes)

  • commit → metadata + pointer to a tree + parent commit(s)

  • tag → annotated tag: name, tagger, message + pointer to an object (usually a commit)
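
How the first three object types nest can be sketched by serializing them by hand. This is an illustrative sketch assuming the SHA-1 object format; the function names, the file name hello.txt, and the author identity are made up, and the tree ordering is simplified.

```python
import hashlib

def object_id(kind: str, body: bytes) -> str:
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries):
    """entries: (mode, name, hex id) tuples for one directory."""
    body = b""
    # Simplification: plain sort by name. Git's real ordering treats
    # directory names as if they ended in "/".
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):
        # Each entry is "<mode> <name>\0" plus the raw (binary) object id.
        body += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
    return body

def commit_body(tree_id, parent_ids, who, message):
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]
    lines += [f"author {who}", f"committer {who}", "", message]
    return "\n".join(lines).encode() + b"\n"

# blob <- tree <- commit: each level refers to the one below by id.
blob_id = object_id("blob", b"hello\n")
tree_id = object_id("tree", tree_body([("100644", "hello.txt", blob_id)]))
who = "A U Thor <author@example.com> 0 +0000"   # made-up identity
commit_id = object_id("commit", commit_body(tree_id, [], who, "initial"))

print(blob_id, tree_id, commit_id, sep="\n")
```

A root commit has no parent line; a merge commit has several.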


Storage layout:

  • Loose objects live under .git/objects/<2-char>/<38-char> (SHA-1 hex id split 2 + 38; SHA-256 ids split 2 + 62)

  • Loose objects are zlib-compressed individually

  • Packfiles (under .git/objects/pack/) group and delta-compress objects for efficiency
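
The loose-object layout can be imitated in a temporary directory. This is a rough sketch assuming SHA-1 ids; write_loose and read_loose are hypothetical helpers, and the temporary directory only stands in for a real .git directory. Packfiles are not covered.

```python
import hashlib
import os
import tempfile
import zlib

def write_loose(git_dir, kind, content):
    store = f"{kind} {len(content)}".encode() + b"\0" + content
    oid = hashlib.sha1(store).hexdigest()
    # First 2 hex characters name the directory, the other 38 the file.
    path = os.path.join(git_dir, "objects", oid[:2], oid[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(zlib.compress(store))   # header and content compressed together
    return oid, path

def read_loose(git_dir, oid):
    path = os.path.join(git_dir, "objects", oid[:2], oid[2:])
    with open(path, "rb") as f:
        store = zlib.decompress(f.read())
    header, _, content = store.partition(b"\0")
    kind, size = header.decode().split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content")
    return kind, content

with tempfile.TemporaryDirectory() as git_dir:   # stand-in for a real .git
    oid, path = write_loose(git_dir, "blob", b"hello\n")
    print(os.path.relpath(path, git_dir))
    print(read_loose(git_dir, oid))
```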


Hashes and integrity:

  • Object id = hash(header + content), where header = "<type> <size>\0"

  • Commit includes hash(tree) and hash(parent)

  • Chain of hashes = tamper-evident Merkle DAG
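
Tamper evidence follows from the chaining, which a toy history can show. This is a sketch under simplifying assumptions: SHA-1 ids, one file per snapshot (the name file.txt is made up), and commit bodies without the author and committer lines that real commits carry.

```python
import hashlib

def object_id(kind: str, body: bytes) -> str:
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def snapshot_commit(content: bytes, parent):
    blob = object_id("blob", content)
    tree = object_id("tree", b"100644 file.txt\0" + bytes.fromhex(blob))
    lines = [f"tree {tree}"] + ([f"parent {parent}"] if parent else [])
    # Simplification: no author or committer lines.
    body = "\n".join(lines + ["", "snapshot"]).encode() + b"\n"
    return object_id("commit", body)

def history(versions):
    """Commit ids for a linear history, oldest first."""
    ids, parent = [], None
    for content in versions:
        parent = snapshot_commit(content, parent)
        ids.append(parent)
    return ids

original = history([b"v1\n", b"v2\n", b"v3\n"])
tampered = history([b"v1 (edited)\n", b"v2\n", b"v3\n"])

# Editing the oldest snapshot changes its id and every id built on it.
print([a == b for a, b in zip(original, tampered)])   # [False, False, False]
```

Checking the newest commit id is therefore enough to vouch for everything reachable from it.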


References:

  • .git/refs/heads/<branch> → latest commit hash (refs may also be stored in .git/packed-refs)

  • .git/refs/tags/<tag> → tagged commit (lightweight tag) or tag object (annotated tag)

  • HEAD → current branch ref (symbolic)

  • Detached HEAD → points directly to a commit
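
Resolving HEAD by hand shows the difference between the symbolic and detached cases. This is a minimal sketch: resolve_head is a hypothetical helper, the directory is a throwaway stand-in for .git, the commit id is a placeholder, and refs are assumed to be stored as loose files rather than in .git/packed-refs.

```python
import os
import tempfile

def resolve_head(git_dir):
    """Return (branch ref or None, commit id) for a loose-ref-only layout."""
    with open(os.path.join(git_dir, "HEAD")) as f:
        head = f.read().strip()
    if not head.startswith("ref: "):
        return None, head                   # detached HEAD: a raw commit id
    ref = head[len("ref: "):]
    # Assumption: the ref exists as a loose file under the git directory.
    with open(os.path.join(git_dir, *ref.split("/"))) as f:
        return ref, f.read().strip()

with tempfile.TemporaryDirectory() as git_dir:
    os.makedirs(os.path.join(git_dir, "refs", "heads"))
    fake_id = "0123456789abcdef0123456789abcdef01234567"   # placeholder id
    with open(os.path.join(git_dir, "refs", "heads", "main"), "w") as f:
        f.write(fake_id + "\n")

    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write("ref: refs/heads/main\n")    # symbolic: names a branch
    on_branch = resolve_head(git_dir)

    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write(fake_id + "\n")              # detached: names a commit
    detached = resolve_head(git_dir)

print(on_branch)   # ('refs/heads/main', '0123456789abcdef0123456789abcdef01234567')
print(detached)    # (None, '0123456789abcdef0123456789abcdef01234567')
```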


Index (staging area):

  • .git/index maps paths → blob hashes + metadata

  • Bridge between working directory and next commit

  • Enables comparison across three states: working dir, index, HEAD
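
The role of the index between HEAD and the working directory can be modeled with three path → blob id mappings. This is a conceptual sketch only: the ids are made-up short strings, and the real .git/index is a binary file with extra metadata per entry.

```python
def changed(old, new):
    """Paths whose blob id differs between two path -> blob id mappings."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))

head_tree = {"README": "a1", "main.py": "b1"}
index     = {"README": "a1", "main.py": "b2", "new.txt": "c1"}
worktree  = {"README": "a2", "main.py": "b2", "new.txt": "c1"}

staged   = changed(head_tree, index)   # what the next commit would change
unstaged = changed(index, worktree)    # edits not yet added to the index

print(staged)     # ['main.py', 'new.txt']
print(unstaged)   # ['README']
```

These two comparisons correspond roughly to git diff --cached and git diff.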


Graph model:

  • Commits form a DAG: node = commit, edge = parent

  • Merge = commit with multiple parents

  • Rebase = replay commits onto a new base as new commits; the originals stay unchanged
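
A toy in-memory DAG makes the merge and rebase cases concrete. This is a sketch, not Git's data model: the commits dict, the commit and ancestors helpers, and the 8-character ids derived from repr are all inventions for illustration.

```python
import hashlib

commits = {}   # id -> (parent ids, message); a toy object store

def commit(parents, message):
    parents = tuple(parents)
    # Toy id: depends on parents and message, so a new parent gives a new id.
    cid = hashlib.sha1(repr((parents, message)).encode()).hexdigest()[:8]
    commits[cid] = (parents, message)
    return cid

def ancestors(cid):
    """All commits reachable from cid by following parent edges."""
    seen, stack = set(), [cid]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(commits[c][0])
    return seen

base  = commit((), "base")
main  = commit((base,), "main work")
topic = commit((base,), "topic work")

merged  = commit((main, topic), "merge topic")   # merge: two parents
rebased = commit((main,), "topic work")          # rebase: same change, new parent

print(ancestors(merged) == {merged, main, topic, base})   # True
print(rebased != topic, topic in commits)                  # True True
```

The rebased commit gets a new id because its parent changed; the original topic commit is still in the store, just no longer on the rebased line of history.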


Remotes:

  • Remote = peer repository (not a master)

  • Fetch/push sync missing objects by comparing hashes

  • Transfer is delta-efficient: only missing objects are sent, delta-compressed in a packfile
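
Working out what to send can be approximated as a graph walk that stops at commits the other side already has. This is a simplified sketch: missing_commits and the ids c1..c4 are hypothetical, the real protocol negotiates with want/have lines, and trees and blobs travel along with the commits.

```python
def missing_commits(sender, tips, receiver_has):
    """Commits reachable from `tips` in `sender` that the receiver lacks."""
    missing, seen, stack = [], set(), list(tips)
    while stack:
        cid = stack.pop()
        if cid in seen or cid in receiver_has:
            # Assumption: having a commit implies having its whole history.
            continue
        seen.add(cid)
        missing.append(cid)
        stack.extend(sender[cid])   # keep walking through the parents
    return missing

sender = {"c1": [], "c2": ["c1"], "c3": ["c2"], "c4": ["c3"]}   # id -> parents
receiver_has = {"c1", "c2"}

print(missing_commits(sender, ["c4"], receiver_has))   # ['c4', 'c3']
```

Because ids are content hashes, "do you have c2?" is enough to settle everything behind c2 without comparing any file contents.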


Architectural patterns:

  • Content-addressable storage → object immutability

  • Composite → trees containing blobs/subtrees

  • Merkle DAG → commit integrity and verification

  • Symbolic references → HEAD and branches

  • Staging buffer → index as write cache

  • Eventual consistency → decentralized sync


Mental model:

Git = immutable key-value store + DAG of snapshots + symbolic refs.





Barış Özmen © 2025