Stage 3: SSTables -- Flushing to Disk
stage 3 of 4 · ~30 min · runs in your browser
Objective:Flush the memtable to sorted, immutable on-disk segments and layer reads across them.
A memtable can't grow forever. When it fills, LSM engines flush it to disk as an SSTable -- an immutable, sorted run of key-value pairs. Reads then check the memtable first, then SSTables newest-to-oldest: the most recent write always wins.
Extend your store with:
store.flush()-- moves the current memtable intostore.segmentsas{ entries: [[key, value], ...] }sorted by key, then empties the memtable. Deletes flush too, as[key, null]tombstones (you need them later -- an older segment may still hold the key).get(key)-- memtable first; on miss, scansegmentsfrom newest to oldest; a tombstone (null) means "deleted", stop looking.store.segments-- newest segment last in the array.
Why sorted? Binary search. An SSTable in production is a file on disk; finding a key in O(log n) requires the file to be sorted. You'll use a sorted entries array to model this. In LevelDB/RocksDB, the SSTable also has a sparse index and Bloom filter at the end -- keys pointing to blocks -- but the sort is the primitive.
Why immutable? Two reasons: (1) sequential writes are fast, in-place updates are not; (2) multiple readers can read the same SSTable file concurrently without locks. Immutability is a core correctness and performance property.
Why this matters in production
The memtable can't grow forever -- it's RAM. When it fills, LSM engines flush it to an SSTable (Sorted String Table): a sorted, immutable file on disk. From this point forward, reads must check ALL layers: memtable first (RAM, newest), then segments from newest to oldest. This multi-layer read path is the defining characteristic of LSM engines and the root cause of both their read amplification problem and their write advantage. After Stage 3 your engine has the same fundamental architecture as LevelDB's L0 flush -- you're missing only the index (Bloom filter + sparse index) that LevelDB adds to avoid scanning every SSTable byte by byte.
tests (6)
○ flush produces a sorted immutable segment
○ flush clears the memtable
○ reads fall through to segments after flush
○ the memtable beats older segments
○ newer segments beat older segments
○ tombstones flush and hide older values