#1 on Terminal-Bench-2 (65.2%)
64.8% More Token Efficient

The Agent for Precision and Speed.

Dirac is an open-source coding agent that cut average API cost per task by 64.8% relative to six other open-source agents in its published evals, while completing all eight benchmark tasks. It supports agents.md and custom skills for tailored workflows.

Or install via CLI

npm install -g dirac-cli
Read the CLI guide →


Hash-Anchored Edit Engine
Apple§ def process_data(items):
- Brave§     total = 0
- Cider§     for item in items:
+ Bison§     return sum(item.price for item in items if item.price > 0)
- Fox§     return total
// Edit applied with 100% AST precision

Built for High-Bandwidth Engineering

Traditional agents get lost in large files. Dirac uses structural understanding to stay accurate and lean.

Hash-Anchored Edits

Stable line hashes target edits with extreme precision, avoiding the "lost in translation" issues of traditional line-number based editing.
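To make the idea concrete, here is a minimal sketch of content-anchored editing. It is an illustration only: the anchor scheme (a truncated SHA-1 of the line text) and the names `line_anchor` and `apply_anchored_edit` are assumptions made for this example, not Dirac's actual implementation.

```python
import hashlib


def line_anchor(text: str) -> str:
    """Short content hash used as a stable label for a line (hypothetical scheme)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:8]


def apply_anchored_edit(lines: list[str], anchor: str, replacement: list[str]) -> list[str]:
    """Replace the single line whose content hash equals `anchor`.

    Raises ValueError when the anchor matches zero or several lines, so a
    stale or ambiguous edit is rejected instead of landing on the wrong line.
    """
    matches = [i for i, line in enumerate(lines) if line_anchor(line) == anchor]
    if len(matches) != 1:
        raise ValueError(f"anchor {anchor} matched {len(matches)} lines")
    i = matches[0]
    return lines[:i] + replacement + lines[i + 1:]


source = [
    "def process_data(items):",
    "    total = 0",
    "    return total",
]
target = line_anchor("    return total")

# An unrelated insertion shifts every line number below it...
shifted = source[:1] + ["    # validate input first"] + source[1:]

# ...but the content-derived anchor still finds the intended line.
edited = apply_anchored_edit(shifted, target, ["    return sum(i.price for i in items)"])
```

A line-number edit computed before the insertion would have overwritten `total = 0`. The anchored edit either hits the intended line or fails loudly.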

AST-Native Precision

Built-in understanding of language syntax allows Dirac to perform complex structural refactoring with 100% accuracy.
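One simple way to picture syntax-aware editing is to gate every edit on a successful parse. The sketch below uses Python's standard `ast` module. The helper names are hypothetical, and how Dirac itself validates edits is not described on this page.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True when the source parses into an AST without a SyntaxError."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def defined_functions(source: str) -> set[str]:
    """Names of all functions defined anywhere in the source."""
    tree = ast.parse(source)
    return {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


before = (
    "def process_data(items):\n"
    "    total = 0\n"
    "    for item in items:\n"
    "        total += item.price\n"
    "    return total\n"
)
good_edit = (
    "def process_data(items):\n"
    "    return sum(item.price for item in items if item.price > 0)\n"
)
# Same edit with the closing parenthesis dropped: it must be rejected.
bad_edit = (
    "def process_data(items):\n"
    "    return sum(item.price for item in items if item.price > 0\n"
)

accept_good = is_valid_python(good_edit) and defined_functions(good_edit) == defined_functions(before)
accept_bad = is_valid_python(bad_edit)
```

Here `accept_good` is true because the edit parses and keeps the same set of function definitions. `accept_bad` is false, so the broken edit would never be written to disk.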

Multi-File Batching

Process and edit dozens of files in a single LLM roundtrip, significantly reducing latency and API costs.
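As a rough sketch of what batching can look like on the applying side, the example below takes one list of edits spanning several files and validates all of them in memory before writing anything. The edit format (relative path, exact old line, new line) and the function `apply_batch` are assumptions for illustration, not Dirac's wire format.

```python
import tempfile
from pathlib import Path


def apply_batch(root: Path, edits: list[tuple[str, str, str]]) -> list[str]:
    """Apply (relative_path, old_line, new_line) edits across many files.

    Phase 1 computes every new file body in memory and raises on any edit
    that does not match exactly one line. Phase 2 writes the results, so a
    bad edit leaves all files untouched.
    """
    pending: dict[str, list[str]] = {}
    for rel_path, old_line, new_line in edits:
        lines = pending.get(rel_path)
        if lines is None:
            lines = (root / rel_path).read_text(encoding="utf-8").splitlines()
        if lines.count(old_line) != 1:
            raise ValueError(f"{rel_path}: expected exactly one line equal to {old_line!r}")
        lines[lines.index(old_line)] = new_line
        pending[rel_path] = lines

    for rel_path, lines in pending.items():
        (root / rel_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return sorted(pending)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.py").write_text("x = 1\ny = 2\n", encoding="utf-8")
    (root / "b.py").write_text("name = 'old'\n", encoding="utf-8")

    # One batch, as if returned by a single model response.
    changed = apply_batch(root, [
        ("a.py", "y = 2", "y = 3"),
        ("b.py", "name = 'old'", "name = 'new'"),
    ])
    a_after = (root / "a.py").read_text(encoding="utf-8")
    b_after = (root / "b.py").read_text(encoding="utf-8")
```

Validating first and writing second means a partly wrong batch never leaves the repository half-edited.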

High-Bandwidth Context

Optimized context curation keeps the agent lean and fast. Dirac only reads what it needs, ensuring the LLM always has the most relevant information without wasting tokens.

  • Smart file skeleton extraction
  • Recursive symbol dependency tracking
  • Token-efficient diff generation
  • Native support for agents.md and custom skills (picks up .ai, .claude, .agents)
  • Autonomous tool use: file I/O, terminal, and headless browser
  • Native tool calling only (No MCP)
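The first bullet, file skeleton extraction, can be sketched with Python's standard `ast` module: keep class and function signatures, drop the bodies, and hand the model the outline instead of the full file. The function `file_skeleton` is a hypothetical helper written for this example. Dirac's own extractor and output format are not documented here.

```python
import ast


def _signature(node: ast.AST, indent: str = "") -> str:
    """Render a function node as a one-line signature without its body."""
    keyword = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
    return f"{indent}{keyword} {node.name}({ast.unparse(node.args)})"


def file_skeleton(source: str) -> list[str]:
    """One line per top-level function, class, and method; bodies are omitted."""
    functions = (ast.FunctionDef, ast.AsyncFunctionDef)
    out: list[str] = []
    for node in ast.parse(source).body:
        if isinstance(node, functions):
            out.append(_signature(node))
        elif isinstance(node, ast.ClassDef):
            out.append(f"class {node.name}")
            out.extend(_signature(item, "    ") for item in node.body if isinstance(item, functions))
    return out


sample = '''
import os

class Cart:
    def __init__(self, items):
        self.items = items

    def total(self, min_price=0):
        return sum(i.price for i in self.items if i.price > min_price)

def process_data(items):
    return Cart(items).total()
'''

skeleton = file_skeleton(sample)
```

The skeleton for `sample` is four short lines, which shows how an outline can stand in for a much larger file until the agent decides which body it actually needs to read.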

📊 Evals

Dirac is benchmarked against leading open-source agents on complex, real-world refactoring tasks. We don't just claim efficiency; we prove it.

Average Cost per Task
Traditional Agents $0.53 avg
Dirac $0.18 avg
64.8% cheaper than competition
Task (Repository) | Files | Cline | Kilo | Ohmypi | Opencode | Pimono | Roo | Dirac
Task1 (transformers) | 8 | 🟢 $0.37 | 🔴 N/A | 🟡 $0.24 | 🟢 $0.20 | 🟢 $0.34 | 🟢 $0.49 | 🟢 $0.13
Task2 (vscode) | 21 | 🟢 $0.67 | 🟡 $0.78 | 🟢 $0.63 | 🟢 $0.40 | 🟢 $0.48 | 🟡 $0.58 | 🟢 $0.23
Task3 (vscode) | 12 | 🟡 $0.42 | 🟢 $0.70 | 🟢 $0.64 | 🟢 $0.32 | 🟢 $0.25 | 🟡 $0.45 | 🟢 $0.16
Task4 (django) | 14 | 🟢 $0.36 | 🟢 $0.42 | 🟡 $0.32 | 🟢 $0.24 | 🟡 $0.24 | 🟢 $0.17 | 🟢 $0.08
Task5 (vscode) | 3 | 🔴 N/A | 🟢 $0.71 | 🟢 $0.43 | 🟢 $0.53 | 🟢 $0.50 | 🟢 $0.36 | 🟢 $0.17
Task6 (transformers) | 25 | 🟢 $0.87 | 🟡 $1.51 | 🟢 $0.94 | 🟢 $0.90 | 🟢 $0.52 | 🟢 $1.44 | 🟢 $0.34
Task7 (vscode) | 13 | 🟡 $0.51 | 🟢 $0.77 | 🟢 $0.74 | 🟢 $0.67 | 🟡 $0.45 | 🟢 $1.05 | 🟢 $0.25
Task8 (transformers) | 3 | 🟢 $0.25 | 🟢 $0.19 | 🟢 $0.17 | 🟢 $0.26 | 🟢 $0.23 | 🟢 $0.29 | 🟢 $0.12
Total Correct | | 5/8 | 5/8 | 6/8 | 8/8 | 6/8 | 6/8 | 8/8
Avg Cost | | $0.49 | $0.73 | $0.51 | $0.44 | $0.38 | $0.60 | $0.18

Legend: 🟢 Success, 🟡 Incomplete, 🔴 Failure
* Benchmarks run on public GitHub repositories using Google's gemini-3-flash-preview. Reproducible by anyone.
Note: A bug was discovered in Cline (issue #10314) that causes slight underreporting of costs ($0.03 instead of $0.05 per million cache-read tokens). This affects both the Dirac and Cline evals. We will update these numbers soon.
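The headline figures can be checked against the per-task costs in the table. The sketch below assumes each agent's average excludes N/A runs and that the "Traditional Agents" figure is the mean of the six per-agent averages. That reading reproduces the published numbers, but it is an inference from the table, not a documented method.

```python
# Per-task costs in USD, copied from the table; None marks an N/A (failed) run.
costs = {
    "Cline":    [0.37, 0.67, 0.42, 0.36, None, 0.87, 0.51, 0.25],
    "Kilo":     [None, 0.78, 0.70, 0.42, 0.71, 1.51, 0.77, 0.19],
    "Ohmypi":   [0.24, 0.63, 0.64, 0.32, 0.43, 0.94, 0.74, 0.17],
    "Opencode": [0.20, 0.40, 0.32, 0.24, 0.53, 0.90, 0.67, 0.26],
    "Pimono":   [0.34, 0.48, 0.25, 0.24, 0.50, 0.52, 0.45, 0.23],
    "Roo":      [0.49, 0.58, 0.45, 0.17, 0.36, 1.44, 1.05, 0.29],
    "Dirac":    [0.13, 0.23, 0.16, 0.08, 0.17, 0.34, 0.25, 0.12],
}


def mean_cost(values):
    """Average over the runs that reported a cost (assumption: N/A is skipped)."""
    reported = [v for v in values if v is not None]
    return sum(reported) / len(reported)


per_agent = {name: mean_cost(values) for name, values in costs.items()}
dirac_avg = per_agent["Dirac"]
others = [avg for name, avg in per_agent.items() if name != "Dirac"]
others_avg = sum(others) / len(others)

# Fraction saved relative to the mean of the other agents' averages.
savings = 1 - dirac_avg / others_avg

print(f"others: ${others_avg:.3f}  dirac: ${dirac_avg:.3f}  savings: {savings:.1%}")
```

Under these assumptions the other agents average about $0.525 per task and Dirac about $0.185, which gives the 64.8% figure. Computing from the rounded values ($0.53 and $0.18) would give roughly 66% instead.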

Detailed Task Descriptions

Task 1 & 5: Complex Refactoring (transformers & vscode)

Refactoring large, central coordinator services (like extensionsWorkbenchService.ts) into smaller, manageable modules. Requires 100% accuracy in dependency management and zero linter errors.

Task 2 & 8: API & Method Signature Updates

Renaming core methods (like value_from_datadict) and refactoring parameter structures across entire codebases. Requires precise call-site identification and replacement.
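Call-site identification is the step where plain text search tends to overmatch, because the method name also appears in strings and comments. A minimal sketch using Python's standard `ast` module is shown below. The helper `find_method_calls` is hypothetical, and the sample code is invented for illustration, not taken from the benchmark repositories.

```python
import ast


def find_method_calls(source: str, method_name: str) -> list[int]:
    """Line numbers of calls of the form `<expr>.method_name(...)`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == method_name
        ):
            hits.append(node.lineno)
    return sorted(hits)


sample = '''\
value = widget.value_from_datadict(data, files, name)
note = "value_from_datadict appears in this string"
# value_from_datadict appears in this comment
other = field.widget.value_from_datadict(data, files, "x")
'''

call_lines = find_method_calls(sample, "value_from_datadict")
text_matches = [
    i for i, line in enumerate(sample.splitlines(), start=1)
    if "value_from_datadict" in line
]
```

The text search flags all four lines, while the AST walk reports only the two real calls on lines 1 and 4.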

Task 3 & 4: Interface & Global Instrumentation

Adding mandatory methods to interfaces and implementing global logging/instrumentation across all definitions of specific commands. Requires exhaustive AST traversal.

Task 6 & 7: Feature Implementation & Telemetry

Implementing complex new logic (like entropy-based stopping criteria) and cross-cutting concerns (latency telemetry) across deep inheritance hierarchies.

Test Methodology

  1. Clean Slate: Every test starts with a git reset --hard && git clean -fd to ensure zero cross-contamination.

  2. Zero Intervention: No manual guiding, nudging, or corrections. Agents are given the prompt and left to execute autonomously.

  3. Standardized Config: All agents used gemini-3-flash-preview with thinking set to 'High'. Failures are only marked after at least 3 unsuccessful attempts.

  4. Local Verification: All linter and syntax checks (e.g., ruff) are run using the local .venv to ensure production-grade output.

  5. Leaderboard Proven: Dirac topped the Terminal-Bench-2 leaderboard with a 65.2% score, outperforming Google's baseline (47.6%) and Junie CLI (64.3%).
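The clean-slate and retry rules can be sketched as a small harness. The git commands are the ones quoted in the Clean Slate step. The function names, the callable-based design, and the simulated attempts are assumptions for illustration, not the actual benchmark scripts.

```python
import subprocess
from typing import Callable


def reset_repo(repo_dir: str) -> None:
    """Restore a clean working tree before an attempt (Clean Slate step)."""
    subprocess.run(["git", "reset", "--hard"], cwd=repo_dir, check=True)
    subprocess.run(["git", "clean", "-fd"], cwd=repo_dir, check=True)


def run_task(
    attempt: Callable[[], bool],
    reset: Callable[[], None],
    max_attempts: int = 3,
) -> tuple[bool, int]:
    """Run an agent attempt up to `max_attempts` times, resetting before each.

    Returns (succeeded, attempts_used). A task counts as a failure only
    after every attempt has failed.
    """
    for n in range(1, max_attempts + 1):
        reset()
        if attempt():
            return True, n
    return False, max_attempts


# Simulated run: the hypothetical agent fails twice, then succeeds.
outcomes = iter([False, False, True])
resets = []
result = run_task(lambda: next(outcomes), lambda: resets.append("reset"))

# Simulated run where every attempt fails.
failed = run_task(lambda: False, lambda: None)
```

In a real run, `reset` would be `lambda: reset_repo(path_to_checkout)` and `attempt` would launch the agent and run the local linter checks. Both are left as callables here so the retry logic can be exercised without a repository.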

// All benchmarks are reproducible.
// Detailed diffs available on request.

Why Dirac?

Traditional Agents

  • Line-number based edits that break as the file changes.
  • Naive string-replace that causes syntax errors.
  • Single-file roundtrips that waste time and tokens.
  • Bloated context that degrades model reasoning.
  • Fragile MCP-based tool implementations.

Dirac (Recommended)

  • Hash-anchored edits that stay stable across changes.
  • AST-native manipulation for 100% syntactic accuracy.
  • Multi-file batching for 2.8x faster execution.
  • Curated context that maximizes model intelligence.
  • Native tool calling (No MCP) for maximum reliability.
  • Supports agents.md and custom skills.

Ready for Precision?

Stop wasting tokens on bloated context and fragile edits. Experience the most efficient coding agent ever built.

Install Dirac on VS Code

Prefer the terminal?

npm install -g dirac-cli
Read the CLI guide →

Open Source. Apache 2.0. Built for developers.