beprodready
Build Your Own Database

Stage 4: Compaction

stage 4 of 4 · ~30 min · runs in your browser

Flush enough and you drown in segments: every read scans more files, and deleted data still occupies disk under tombstones. Compaction is the background job that merges segments — keeping only each key's newest value and dropping tombstones once nothing older can resurrect the key. It's the "tree" in LSM-tree, and tuning it is half of operating RocksDB or Cassandra.

Implement store.compact():

  • merge all segments into a single sorted segment (newest value per key wins)
  • drop tombstones entirely — after a full compaction nothing older exists, so the marker is no longer needed
  • store.segments ends as [mergedSegment] (or [] if everything was deleted)
  • reads must behave identically before and after

tests (5)

many segments become one

newest value per key wins the merge

tombstones are garbage-collected

the merged segment is sorted

reads work identically after compaction

createStore.js