
Embedded (Python)

The Python package puts the entire NamiDB engine in your process. It’s the same engine that ships in the Rust daemon and on Cloud — what changes is where the bytes live and where the engine runs.

When to use embedded mode

  • Notebooks and exploration. Open a namespace, query, throw it away.
  • Scripts and CI fixtures. Deterministic graphs from memory://.
  • Single-process applications. You own the bucket and you don’t want a network hop.

If your app needs a network boundary (multiple processes, a language other than Python or Rust, RBAC, auth), reach for the HTTP server instead.

Open a client

import namidb
client = namidb.Client(uri)

The URI selects the backend. Every backend speaks the same Cypher.

URI                                                 Backend
memory://<ns>                                       In-process, ephemeral
file:///abs/dir?ns=<ns>                             Local filesystem
s3://<bucket>[/<prefix>]?ns=<ns>&region=<region>    AWS S3
s3://<bucket>?ns=<ns>&endpoint=<url>                MinIO, R2, Tigris, LocalStack
gs://<bucket>?ns=<ns>                               Google Cloud Storage
az://<account>/<container>?ns=<ns>                  Azure Blob Storage

Credentials come from the standard cloud env vars (AWS_ACCESS_KEY_ID, GOOGLE_APPLICATION_CREDENTIALS, etc.). NamiDB does not introduce its own credential surface for self-hosted use. See Storage backends for the per-backend reference.
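The URI shapes in the table can be assembled programmatically, which helps when the namespace or endpoint comes from configuration. The following is a minimal sketch using only the standard library. build_uri is a hypothetical helper, not part of the namidb package, and leaving ":" and "/" unescaped in the endpoint value is an assumption about what the URI parser accepts.

```python
from typing import Optional
from urllib.parse import urlencode


def build_uri(scheme: str, location: str, **query: Optional[str]) -> str:
    """Assemble a URI of the form <scheme>://<location>?<key>=<value>&...

    Hypothetical helper for illustration; namidb itself only needs the
    final string.
    """
    # Skip options that were passed as None so callers can forward config as-is.
    pairs = {key: value for key, value in query.items() if value is not None}
    base = f"{scheme}://{location}"
    if not pairs:
        return base
    # safe=":/" keeps endpoint URLs readable (assumption: the parser accepts them raw).
    return f"{base}?{urlencode(pairs, safe=':/')}"


# One call per backend family; only the arguments change.
memory_uri = build_uri("memory", "scratch")
file_uri = build_uri("file", "/tmp/namidb", ns="demo")
s3_uri = build_uri("s3", "my-bucket", ns="prod", region="us-east-1")
minio_uri = build_uri("s3", "my-bucket", ns="dev", endpoint="http://localhost:9000")
```

Any of the resulting strings can be passed to namidb.Client(...) unchanged.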

Run a query

client.cypher(query, params=None) is the entry point. It parses, plans, executes, and returns a result set.

result = client.cypher(
    "MATCH (p:Person) WHERE p.age >= $min RETURN p.name AS name, p.age AS age",
    params={"min": 18},
)

Parameters

Parameters are passed as a dict to params=. They’re bound at execution time and never spliced into the query string.

client.cypher(
    "CREATE (p:Person {name: $name, age: $age})",
    params={"name": "Alice", "age": 30},
)
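To illustrate why binding matters, here is a small sketch of a wrapper that keeps the query text constant and routes every value through params=. add_person and RecordingClient are hypothetical names introduced for this example. RecordingClient is a stand-in that only records calls, so the snippet runs without a NamiDB installation. With a real client you would pass the namidb.Client instance instead.

```python
from typing import Any, Dict, List, Optional, Tuple

CREATE_PERSON = "CREATE (p:Person {name: $name, age: $age})"


def add_person(client: Any, name: str, age: int) -> Any:
    """Create one Person node; values travel in params=, never in the query text."""
    return client.cypher(CREATE_PERSON, params={"name": name, "age": age})


class RecordingClient:
    """Hypothetical stand-in for namidb.Client that records what it was asked to run."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Optional[Dict[str, Any]]]] = []

    def cypher(self, query: str, params: Optional[Dict[str, Any]] = None) -> None:
        self.calls.append((query, params))


# A value containing Cypher syntax stays inert because it is only ever a parameter.
fake = RecordingClient()
add_person(fake, "Alice'}) DETACH DELETE p //", 30)
sent_query, sent_params = fake.calls[0]
```

The query string sent to the client is identical for every caller, so untrusted input cannot change its structure.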

Result formats

The result object exposes the same rows in several shapes:

result = client.cypher("MATCH (p:Person) RETURN p.name AS name, p.age AS age")
result.rows()       # list[dict]: easiest for small results
result.to_pandas()  # pandas.DataFrame
result.to_polars()  # polars.DataFrame
result.to_arrow()   # pyarrow.Table (zero-copy)

For large result sets, prefer to_arrow() or to_polars() over rows() — they skip per-row Python-object overhead.

Async (acypher)

The same package exposes an async surface for use inside an async runtime (FastAPI, aiohttp, Trio, asyncio):

import asyncio
import namidb

async def main():
    client = namidb.Client("s3://my-bucket?ns=prod&region=us-east-1")
    result = await client.acypher(
        "MATCH (p:Person) RETURN count(*) AS n"
    )
    print(result.rows())

asyncio.run(main())

Rule of thumb: use acypher when you’re already inside an event loop, and cypher everywhere else.
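One reason to prefer acypher inside an event loop is that independent queries can overlap. The sketch below fans out several queries with asyncio.gather. It rests on two assumptions not stated above: that acypher accepts params= the same way cypher does, and that one client may have several acypher calls in flight at once. FakeAsyncClient and FakeResult are hypothetical stand-ins so the example runs without a bucket.

```python
import asyncio
from typing import Any, Dict, List

COUNT_QUERY = "MATCH (p:Person) WHERE p.age >= $min RETURN count(*) AS n"


async def count_at_least(client: Any, min_ages: List[int]) -> Dict[int, int]:
    """Run one count query per threshold concurrently and collect the results."""
    # Assumption: acypher mirrors cypher's params= keyword.
    results = await asyncio.gather(
        *(client.acypher(COUNT_QUERY, params={"min": m}) for m in min_ages)
    )
    # gather preserves argument order, so results line up with min_ages.
    return {m: r.rows()[0]["n"] for m, r in zip(min_ages, results)}


class FakeResult:
    """Hypothetical result object exposing only rows()."""

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        self._rows = rows

    def rows(self) -> List[Dict[str, Any]]:
        return self._rows


class FakeAsyncClient:
    """Hypothetical stand-in for namidb.Client that answers the count query in memory."""

    def __init__(self, ages: List[int]) -> None:
        self.ages = ages

    async def acypher(self, query: str, params: Dict[str, Any]) -> FakeResult:
        await asyncio.sleep(0)  # yield to the loop, as real I/O would
        n = sum(1 for age in self.ages if age >= params["min"])
        return FakeResult([{"n": n}])


counts = asyncio.run(count_at_least(FakeAsyncClient([30, 25, 42]), [18, 28, 40]))
```

In a FastAPI or aiohttp handler you would await count_at_least(...) directly instead of calling asyncio.run.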

End-to-end example

import namidb

# Same code works against any backend; only the URI changes.
client = namidb.Client("file:///tmp/namidb?ns=demo")

# Bulk-load with UNWIND.
client.cypher(
    "UNWIND $people AS p CREATE (:Person {name: p.name, age: p.age})",
    params={"people": [
        {"name": "Alice", "age": 30},
        {"name": "Bob", "age": 25},
        {"name": "Carol", "age": 42},
    ]},
)

# Query and project into a pandas DataFrame.
df = client.cypher(
    "MATCH (p:Person) "
    "WHERE p.age >= $min "
    "RETURN p.name AS name, p.age AS age "
    "ORDER BY p.age DESC",
    params={"min": 18},
).to_pandas()
print(df)

Kill the script and run it again — because the URI is file://, the data is still there. Open a different process pointed at the same URI and you’ll see the same graph.

Why this works without a coordinator

NamiDB uses conditional writes on the bucket (or flock plus atomic rename on file://) for compare-and-swap on the manifest. There is no external lock service, no Raft, no etcd. Two processes can race a write against the same namespace and exactly one wins; the loser sees a fenced epoch and either retries or fails. This is what makes the embedded mode safe to run from multiple replicas of your service without coordination outside the bucket.
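The compare-and-swap pattern described above can be modelled in a few lines. This is a toy, in-memory sketch of the general technique, not NamiDB's implementation: ToyManifestStore, FencedEpoch, put_if_epoch, and commit are hypothetical names, and a threading.Lock stands in for the atomicity that a bucket's conditional write (or flock plus atomic rename) would provide.

```python
import threading
from typing import Tuple


class FencedEpoch(Exception):
    """Raised when a writer's expected epoch is stale (someone else committed first)."""


class ToyManifestStore:
    """In-memory stand-in for storage that supports a conditional write."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._epoch = 0
        self._manifest: Tuple[str, ...] = ()

    def read(self) -> Tuple[int, Tuple[str, ...]]:
        with self._lock:
            return self._epoch, self._manifest

    def put_if_epoch(self, expected_epoch: int, manifest: Tuple[str, ...]) -> int:
        """Compare-and-swap: replace the manifest only if the epoch is unchanged."""
        with self._lock:
            if self._epoch != expected_epoch:
                raise FencedEpoch(f"expected {expected_epoch}, found {self._epoch}")
            self._epoch += 1
            self._manifest = manifest
            return self._epoch


def commit(store: ToyManifestStore, entry: str, max_retries: int = 5) -> int:
    """Append an entry, re-reading and retrying whenever the write is fenced."""
    for _ in range(max_retries):
        epoch, manifest = store.read()
        try:
            return store.put_if_epoch(epoch, manifest + (entry,))
        except FencedEpoch:
            continue  # lost the race; reload the manifest and try again
    raise FencedEpoch("gave up after repeated conflicts")


# Two writers read the same epoch; exactly one conditional write succeeds.
store = ToyManifestStore()
epoch_a, manifest_a = store.read()
epoch_b, manifest_b = store.read()
store.put_if_epoch(epoch_a, manifest_a + ("a",))
loser_was_fenced = False
try:
    store.put_if_epoch(epoch_b, manifest_b + ("b",))
except FencedEpoch:
    loser_was_fenced = True
# The loser retries from a fresh read and succeeds.
final_epoch = commit(store, "b")
```

The loser never overwrites the winner's commit; it either retries against the new epoch, as commit does here, or surfaces the failure to the caller.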

What’s next