One line: A local embedded memory system that separates ingestion, storage, retrieval, routing, scheduling, provenance, constraints, and observability into inspectable layers.
- Maturity: experimental scaffold
- Production use: no; this is a low-latency local architecture prototype
earth-database is a small memory substrate for systems that need predictable local reads and writes before they need distributed services. It uses SQLite in WAL mode as the source of truth, FTS5 for exact/provenance-first retrieval, JSONL event logs for observability, and an explicit scheduler table for slow background work.
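As a rough illustration of the WAL setup described above, the sketch below opens a SQLite file with Python's standard `sqlite3` module and switches it to WAL journaling. The function name `open_wal_database`, the file name `memory.db`, and the `synchronous=NORMAL` setting are assumptions made for this example, not names or settings taken from the project.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_wal_database(path: Path) -> sqlite3.Connection:
    """Open (or create) a SQLite file and switch it to WAL journaling."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL, got {mode!r}")
    # NORMAL is a common pairing with WAL; an assumption, not a project setting.
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn


# WAL needs a real file; an in-memory database reports "memory" instead.
with tempfile.TemporaryDirectory() as tmp:
    conn = open_wal_database(Path(tmp) / "memory.db")
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    conn.close()
```

WAL mode is stored in the database file, so later connections to the same file see it without setting the pragma again.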
The design keeps the hot path narrow:
- Validate input and constraints.
- Hash content and capture provenance.
- Write canonical memory rows in SQLite.
- Update local FTS.
- Enqueue derived work for later.
Embeddings, summaries, compaction, vector indexes, and policy learning are slow-path jobs. They can improve retrieval later, but they do not own the canonical record.
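The five hot-path steps can be sketched in a single function. This is a minimal illustration under stated assumptions: the table names (`memories`, `memories_fts`, `jobs`), the `MAX_CONTENT_BYTES` limit, the `ingest` signature, and the `embed` job kind are all hypothetical, and the real schema in `storage.py` may differ. It also assumes the local SQLite build includes FTS5.

```python
import hashlib
import sqlite3
import time

MAX_CONTENT_BYTES = 64 * 1024  # hypothetical constraint, not a project value

# Hypothetical schema: canonical rows, an FTS5 index, and a job queue.
SCHEMA = """
CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    content_hash TEXT NOT NULL UNIQUE,
    source TEXT NOT NULL,
    created_at REAL NOT NULL
);
CREATE VIRTUAL TABLE memories_fts USING fts5(content);
CREATE TABLE jobs (
    id INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,
    memory_id INTEGER NOT NULL,
    status TEXT NOT NULL DEFAULT 'queued'
);
"""


def ingest(conn: sqlite3.Connection, content: str, source: str) -> int:
    # 1. Validate input and constraints.
    data = content.encode("utf-8")
    if not content.strip():
        raise ValueError("content must not be empty")
    if len(data) > MAX_CONTENT_BYTES:
        raise ValueError("content exceeds size limit")
    # 2. Hash content and capture provenance (here: just a source label).
    content_hash = hashlib.sha256(data).hexdigest()
    with conn:  # one transaction: commit on success, roll back on error
        # 3. Write the canonical row.
        cur = conn.execute(
            "INSERT INTO memories (content, content_hash, source, created_at) "
            "VALUES (?, ?, ?, ?)",
            (content, content_hash, source, time.time()),
        )
        memory_id = cur.lastrowid
        # 4. Update the local FTS index.
        conn.execute(
            "INSERT INTO memories_fts (rowid, content) VALUES (?, ?)",
            (memory_id, content),
        )
        # 5. Enqueue derived work; nothing slow runs here.
        conn.execute(
            "INSERT INTO jobs (kind, memory_id) VALUES ('embed', ?)",
            (memory_id,),
        )
    return memory_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
memory_id = ingest(conn, "WAL mode keeps readers unblocked", source="demo")
hits = conn.execute(
    "SELECT rowid FROM memories_fts WHERE memories_fts MATCH ?", ("readers",)
).fetchall()
queued = conn.execute("SELECT kind, status FROM jobs").fetchall()
```

In this sketch the canonical row, the FTS entry, and the job record are committed together, so a failed write leaves none of them behind. Ingesting the same content twice raises `sqlite3.IntegrityError` because of the `UNIQUE` hash column.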
- Not a cloud memory platform.
- Not a finished assistant.
- Not a vector-first RAG stack.
- Not a benchmarked production database.
- Not a replacement for memory-dropbox; this is a lower-latency embedded sibling.
python -m venv .venv
. .venv/bin/activate
pip install -e .
python -m pytest
python -m earth_database demo
The demo creates a local SQLite database and JSONL trace under a temporary directory, ingests one record, retrieves it through FTS, and prints queued background jobs.
flowchart LR
Client[Client_or_CLI] --> Ingestion[Ingestion_Layer]
Ingestion --> Constraints[Constraint_Checks]
Constraints --> Storage[(SQLite_WAL_Core)]
Ingestion --> Observability[JSONL_Event_Log]
Storage --> Retrieval[Retrieval_Layer]
Retrieval --> Router[Read_Only_Router]
Router --> Client
Storage --> Scheduler[Scheduler_Layer]
Scheduler --> Workers[Background_Workers]
Workers --> DerivedIndexes[Derived_Indexes]
Workers --> Observability
DerivedIndexes --> Retrieval
- ingestion.py validates and writes canonical memory without doing slow enrichment.
- storage.py owns SQLite schema, WAL setup, transactions, canonical rows, FTS, provenance, events, and jobs.
- retrieval.py performs exact/provenance-first lookup with optional FTS ranking.
- routing.py selects retrieval strategy from query shape and constraints without mutating storage.
- scheduler.py enqueues, claims, completes, and fails idempotent background jobs.
- provenance.py captures hashes, source lineage, and runtime provenance.
- observability.py writes typed JSONL events.
- constraints.py centralizes explicit limits and allowed operations.
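To make the routing idea concrete, here is a small sketch of a read-only router that picks a retrieval strategy from the shape of the query alone. The `Route` type, the `route_query` function, the strategy names, and the `source:` prefix convention are assumptions invented for this example; the actual rules in `routing.py` are not shown in this document.

```python
import re
from dataclasses import dataclass

# A full SHA-256 hex digest is treated as an exact content lookup.
_SHA256_HEX = re.compile(r"[0-9a-f]{64}")


@dataclass(frozen=True)
class Route:
    strategy: str  # "hash", "provenance", or "fts" (hypothetical names)
    argument: str


def route_query(query: str) -> Route:
    """Choose a strategy from query shape only; never touches storage."""
    q = query.strip()
    if not q:
        raise ValueError("empty query")
    if _SHA256_HEX.fullmatch(q.lower()):
        return Route("hash", q.lower())
    if q.startswith("source:"):
        # Hypothetical convention: "source:<name>" filters by provenance.
        return Route("provenance", q[len("source:"):].strip())
    # Anything else falls through to full-text search.
    return Route("fts", q)


by_hash = route_query("A" * 64)
by_source = route_query("source: cli")
by_text = route_query("wal readers")
```

Because the function is pure, it can be tested without a database, which fits the stated goal of routing without mutating storage.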
The first implementation assumes one local process or a small set of local tools sharing one SQLite file. SQLite runs in WAL mode, retrieval uses indexed tables and FTS5, and background jobs are explicit records that can be processed by a worker later.
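One way to picture background jobs as explicit records is the sketch below, which enqueues, claims, and finishes jobs in a plain SQLite table. The `jobs` schema, the `idempotency_key` column, the status values, and the function names are assumptions for illustration, not the project's scheduler API.

```python
import sqlite3

# Hypothetical jobs table; the unique key makes enqueueing idempotent.
SCHEMA = """
CREATE TABLE jobs (
    id INTEGER PRIMARY KEY,
    idempotency_key TEXT NOT NULL UNIQUE,
    kind TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'queued',
    attempts INTEGER NOT NULL DEFAULT 0
);
"""


def enqueue(conn, kind, key):
    """Insert a job once; return False if the key was already queued."""
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO jobs (idempotency_key, kind) VALUES (?, ?)",
            (key, kind),
        )
    return cur.rowcount == 1


def claim(conn):
    """Claim the oldest queued job, or return None if there is none."""
    with conn:
        row = conn.execute(
            "SELECT id, kind FROM jobs WHERE status = 'queued' "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status guard means only one claimant can win a given job.
        updated = conn.execute(
            "UPDATE jobs SET status = 'running', attempts = attempts + 1 "
            "WHERE id = ? AND status = 'queued'",
            (row[0],),
        )
        return row if updated.rowcount == 1 else None


def finish(conn, job_id, ok):
    """Mark a running job as done or failed."""
    with conn:
        conn.execute(
            "UPDATE jobs SET status = ? WHERE id = ? AND status = 'running'",
            ("done" if ok else "failed", job_id),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = enqueue(conn, "embed", "embed:1")
duplicate = enqueue(conn, "embed", "embed:1")
job = claim(conn)
nothing_left = claim(conn)
finish(conn, job[0], ok=True)
final_status = conn.execute(
    "SELECT status FROM jobs WHERE id = ?", (job[0],)
).fetchone()[0]
```

The guarded `UPDATE` is the important part of the design: if two local tools race for the same row, only the one whose update changes a row treats the job as claimed.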
Latency-sensitive code should not compute embeddings, call LLMs, compact history, or update learned routing weights. Those jobs are scheduled and observable.
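The observability side can be sketched as an append-only JSONL writer: one JSON object per line, cheap enough to call from the hot path. The event type strings, the field names, and the `log_event` and `read_events` helpers are hypothetical; the typed events in `observability.py` may look different.

```python
import json
import tempfile
import time
from pathlib import Path


def log_event(path: Path, event_type: str, **fields) -> dict:
    """Append one event as a single JSON line and return the record."""
    record = {"ts": time.time(), "type": event_type, **fields}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def read_events(path: Path) -> list:
    """Parse the log back into a list of dicts, skipping blank lines."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    trace = Path(tmp) / "trace.jsonl"
    # Hypothetical event types for an ingest followed by a queued job.
    log_event(trace, "ingest.accepted", memory_id=1)
    log_event(trace, "job.enqueued", kind="embed", memory_id=1)
    events = read_events(trace)
```

Each line stands alone, so the trace can be tailed or filtered with ordinary text tools while the process is still running.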
PYTHONPATH=src python3 -m unittest discover -s tests
python -m earth_database --help
If you install dev tooling, the same tests are also pytest-compatible.
MIT, unless this folder is moved into a repository with a different declared license.