Snapshots & epoch fencing

Snapshots

Every read in NamiDB happens against a snapshot — an immutable view of the namespace at a specific manifest version. The snapshot pins:

The exact set of SSTs that were live at that version
The schema at that version
A floor LSN

You get a snapshot from a WriterSession:

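// Capture an immutable view at the current manifest version.
let snap = writer.snapshot();
// Execute the plan against that snapshot.
let rows = execute(&plan, &snap, &Params::new()).await?;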

Or implicitly per Cypher call from the Python client:

result = client.cypher("MATCH (p:Person) RETURN p.name")
# read against an internally-captured snapshot

Multiple snapshots can coexist. A long-running analytical query and a hot write can run concurrently: the query reads from its snapshot while the writer mutates the memtable and produces a new manifest version. When the query finishes, GC can reclaim any SSTs that are no longer referenced.
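The interaction between open snapshots and GC can be sketched with a toy in-memory model. This is an illustration only, not NamiDB's implementation: `Snapshot`, `Namespace`, `release`, and `gc` are hypothetical names, and the SST file names are made up.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """Immutable view: a manifest version plus the SSTs live at that version."""
    version: int
    ssts: frozenset


@dataclass
class Namespace:
    """Toy model of a namespace (hypothetical, for illustration only)."""
    version: int = 0
    live_ssts: frozenset = frozenset()
    open_snapshots: list = field(default_factory=list)
    on_disk: set = field(default_factory=set)

    def snapshot(self) -> Snapshot:
        snap = Snapshot(self.version, self.live_ssts)
        self.open_snapshots.append(snap)
        return snap

    def commit(self, added: set, removed: set) -> None:
        # A commit produces a new version; files are only deleted by gc().
        self.live_ssts = (self.live_ssts - removed) | added
        self.on_disk |= added
        self.version += 1

    def release(self, snap: Snapshot) -> None:
        self.open_snapshots.remove(snap)

    def gc(self) -> set:
        # Keep every SST referenced by the live version or an open snapshot.
        pinned = set(self.live_ssts)
        for snap in self.open_snapshots:
            pinned |= snap.ssts
        reclaimed = self.on_disk - pinned
        self.on_disk -= reclaimed
        return reclaimed


ns = Namespace()
ns.commit(added={"a.sst", "b.sst"}, removed=set())
snap = ns.snapshot()                                    # query pins a.sst, b.sst
ns.commit(added={"c.sst"}, removed={"a.sst", "b.sst"})  # e.g. a compaction
reclaimed_while_pinned = ns.gc()    # empty: the snapshot is still open
ns.release(snap)                    # the query finishes
reclaimed_after_release = ns.gc()   # a.sst and b.sst are now unreferenced
```

In this model the writer never waits for the reader. The only cost of a long-running query is that the SSTs it pins stay on disk until it finishes.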

The manifest

The manifest is a small JSON object at {namespace}/manifest.json that names everything currently live for the namespace. Its layout (simplified):

{
  "version": 42,
  "epoch": 7,
  "lsn_watermark": 18374,
  "schema_id": "...",
  "ssts": {
    "node": { "L0": [...], "L1": [...] },
    "edge": { "L0": [...], "L1": [...] }
  }
}

Every write commit produces a new manifest version. The previous version stays addressable (it’s referenced by in-flight snapshots) until GC removes it.
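As a rough illustration of how a reader might enumerate the live files from a manifest, the sketch below parses a manifest body and flattens the `ssts` map. The SST file names and the `schema_id` value are invented for the example, and it assumes each level holds a plain list of file names, which the simplified layout above does not confirm. `live_ssts` is a hypothetical helper, not part of any NamiDB API.

```python
import json

# Hypothetical manifest body: file names and schema_id are made up.
MANIFEST_TEXT = """
{
  "version": 42,
  "epoch": 7,
  "lsn_watermark": 18374,
  "schema_id": "example-schema",
  "ssts": {
    "node": {"L0": ["node-0007.sst"], "L1": ["node-0001.sst", "node-0002.sst"]},
    "edge": {"L0": [], "L1": ["edge-0001.sst"]}
  }
}
"""


def live_ssts(manifest: dict) -> list:
    """Flatten ssts[kind][level] into sorted (kind, level, name) tuples."""
    out = []
    for kind, levels in manifest["ssts"].items():
        for level, names in levels.items():
            for name in names:
                out.append((kind, level, name))
    return sorted(out)


manifest = json.loads(MANIFEST_TEXT)
files = live_ssts(manifest)
```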

Manifest CAS

Each mutator commits with a compare-and-swap (CAS) loop:

1. Read manifest.json, capture its ETag.
2. Build the new version locally (apply WAL segments, list new SSTs).
3. PUT manifest.json with If-Match: <captured ETag>.
4. If 412 Precondition Failed → re-read, rebuild, retry.
5. If 200 OK → broadcast new version, increment local epoch.

This is the same recipe that works for vector and analytics workloads on object storage: the S3 conditional write takes the place of a consensus tier.
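The retry loop in steps 1 to 4 can be sketched against an in-memory stand-in for the object store. Everything here is an assumption made for illustration: `FakeObjectStore`, `put_if_match`, `commit`, and the `before_put` hook are hypothetical names, the fake derives ETags from a content hash, and the manifest is reduced to a version number and a flat list of SST names. Step 5 (broadcast and local epoch bump) is left out.

```python
import hashlib
import json


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 Precondition Failed response."""


class FakeObjectStore:
    """In-memory stand-in for an object store with conditional PUT."""

    def __init__(self, body: bytes):
        self._body = body

    @staticmethod
    def _etag(body: bytes) -> str:
        return hashlib.sha256(body).hexdigest()

    def get(self):
        return self._body, self._etag(self._body)

    def put_if_match(self, body: bytes, etag: str) -> str:
        # Reject the write if the object changed since `etag` was captured.
        if etag != self._etag(self._body):
            raise PreconditionFailed()
        self._body = body
        return self._etag(body)


def commit(store, new_ssts, max_attempts=5, before_put=None):
    """Read, rebuild, conditional PUT; on 412, re-read and try again."""
    for attempt in range(1, max_attempts + 1):
        raw, etag = store.get()                       # step 1
        manifest = json.loads(raw)
        manifest["version"] += 1                      # step 2 (simplified)
        manifest["ssts"] = manifest["ssts"] + new_ssts
        if before_put is not None:
            before_put(attempt)                       # hook to simulate a race
        try:
            store.put_if_match(json.dumps(manifest).encode(), etag)  # step 3
        except PreconditionFailed:
            continue                                  # step 4
        return manifest, attempt
    raise RuntimeError("gave up after repeated 412 responses")


store = FakeObjectStore(json.dumps({"version": 1, "ssts": []}).encode())


def rival(attempt):
    # On the first attempt only, another writer commits in between.
    if attempt == 1:
        raw, etag = store.get()
        m = json.loads(raw)
        m["version"] += 1
        m["ssts"].append("rival.sst")
        store.put_if_match(json.dumps(m).encode(), etag)


final, attempts = commit(store, ["mine.sst"], before_put=rival)
```

The losing writer does not overwrite the rival's commit. Its second attempt rebuilds on top of the rival's version, so both SSTs end up in the final manifest.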

Epoch fencing

Each manifest carries an epoch counter that increments on every commit. A writer that has been idle while another writer advanced the epoch will fail its next commit attempt and must re-bootstrap from the latest manifest.

This fences out stale writers. For example, a process that lost network connectivity and later reconnected can no longer commit against the old epoch; it must re-read the manifest and re-plan its writes.
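A minimal sketch of the fencing rule, assuming a commit is accepted only when the writer's cached epoch equals the epoch in the current manifest. The source does not spell out the exact comparison, so treat that as an assumption. `ManifestStore`, `Writer`, `bootstrap`, and `StaleEpoch` are hypothetical names.

```python
class StaleEpoch(Exception):
    """Raised when a writer commits against an epoch that has moved on."""


class ManifestStore:
    """Toy manifest holding only the epoch counter."""

    def __init__(self):
        self.epoch = 0

    def commit(self, writer_epoch: int) -> int:
        if writer_epoch != self.epoch:
            raise StaleEpoch(f"writer at {writer_epoch}, manifest at {self.epoch}")
        self.epoch += 1          # the epoch increments on every commit
        return self.epoch


class Writer:
    def __init__(self, store: ManifestStore):
        self.store = store
        self.epoch = store.epoch

    def bootstrap(self) -> None:
        # Re-read the latest manifest; real code would also re-plan its writes.
        self.epoch = self.store.epoch

    def commit(self) -> None:
        self.epoch = self.store.commit(self.epoch)


store = ManifestStore()
a, b = Writer(store), Writer(store)

b.commit()
b.commit()                  # writer a was idle while b advanced the epoch

try:
    a.commit()
    fenced = False
except StaleEpoch:
    fenced = True           # a is fenced out

a.bootstrap()
a.commit()                  # succeeds after re-bootstrapping
```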

What you don’t get (yet)

Cross-namespace transactions. Each namespace is its own CAS unit. There is no two-phase commit across namespaces. Use one namespace if you need transactional consistency across data.
Reader scale beyond one process. Today, namidb-server serialises requests behind a tokio Mutex (single-writer-per-namespace lifted to the request layer). RFC-021 removes the mutex from the read path so a single daemon can fan out reads to every core.