🎯
BUILDING
Pinned
-
failure-induced-benchmarks (Public)
This repo studies whether benchmarks can be generated from model failure geometry.
Python
-
earth-database (Public)
A local embedded memory system with SQLite/WAL, FTS5 retrieval, JSONL observability, explicit provenance, scheduler jobs, and now a trust-aware ingestion layer.
Python
-
obversary-eval-harness (Public)
Lightweight Python evaluation harness for testing systems against reproducible benchmark tasks.
Python
-
obversarystudios/TSPbenchmark (Public)
AHA submission | TSP measures whether an assistant helps the user continue the real work, not merely answer the visible question.


