DSPyground

An open-source prompt optimization harness powered by GEPA. Install directly into your existing AI SDK agent repo, import your tools and prompts for 1:1 environment portability, and align agent behavior through iterative sampling and optimization—delivering an optimized prompt as your final artifact. Built for agentic loops.

Key Features

Bootstrap with a Basic Prompt — Start with any simple prompt—no complex setup required. DSPyground will help you evolve it into a production-ready system prompt.
Port Your Agent Environment — Use a simple config file to import your existing AI SDK prompts and tools—seamlessly recreate your agent environment for optimization.
Multi-Dimensional Metrics — Optimize across 5 key dimensions: Tone (communication style), Accuracy (correctness), Efficiency (tool usage), Tool Accuracy (right tools), and Guardrails (safety compliance).

Quick Start

Prerequisites

Node.js 18+
AI Gateway API key (create one in the AI Gateway dashboard)

Installation

# Using npm
npm install -g dspyground

# Or using pnpm
pnpm add -g dspyground

Setup and Start

# Initialize DSPyground in your project
npx dspyground init

# Start the dev server
npx dspyground dev

The app will open at http://localhost:3000.

Note: DSPyground bundles all required dependencies. If you already have ai and zod in your project, it will use your versions to avoid conflicts. Otherwise, it uses its bundled versions.

Configuration

Edit dspyground.config.ts to import your AI SDK tools and customize your setup:

import { tool } from 'ai'
import { z } from 'zod'
// Import your existing tools
import { myCustomTool } from './src/lib/tools'

export default {
  // Add your AI SDK tools
  tools: {
    myCustomTool,
    // or define new ones inline
  },

  // Set your system prompt
  systemPrompt: `You are a helpful assistant...`,

  // Choose your default model
  defaultModel: 'openai/gpt-4o-mini'
}

Environment Setup

Create a .env file in your project root:

AI_GATEWAY_API_KEY=your_api_key_here

This API key will be used by DSPyground to access AI models through AI Gateway. Follow the getting started guide to create your API key.

Note: All data is stored locally in .dspyground/data/ within your project. Add .dspyground/ to your .gitignore (automatically done during init).

How It Works

DSPyground follows a simple 3-step workflow:

1. Install and Port Your Agent

Install DSPyground in your repo and import your existing AI SDK tools and prompts for 1:1 environment portability. Use dspyground.config.ts to configure your agent environment.

2. Chat and Sample Trajectories

Interact with your agent and collect trajectory samples that demonstrate your desired behavior:

Start with a base prompt in .dspyground/data/prompt.md (editable in UI)
Enable Teaching Mode and chat with the AI to create scenarios
Save samples with feedback: Click the + button to save conversation turns as test samples
- Give positive feedback for good responses (these become reference examples)
- Give negative feedback for bad responses (these guide what to avoid)
Organize with Sample Groups: Create groups like "Tone Tests", "Tool Usage", "Safety Tests"

3. Optimize

Run GEPA optimization to generate a refined prompt aligned with your sampled behaviors. Click "Optimize" to start the automated prompt improvement process.

The Modified GEPA Algorithm

Our implementation extends the traditional GEPA (Genetic-Pareto Evolutionary Algorithm) with several key modifications:

Core Improvements:

Reflection-Based Scoring: Uses LLM-as-a-judge to evaluate trajectories across multiple dimensions
Multi-Metric Optimization: Tracks 5 dimensions simultaneously (tone, accuracy, efficiency, tool_accuracy, guardrails)
Dual Feedback Learning: Handles both positive examples (reference quality) and negative examples (patterns to avoid)
Configurable Metrics: Customize evaluation dimensions via data/metrics-prompt.json
Real-Time Streaming: Watch sample generation and evaluation as they happen

How It Works:

Initialization: Evaluates your seed prompt against a random batch of samples
Iteration Loop (for N rollouts):
- Select best prompt from Pareto frontier
- Sample random batch from your collected samples
- Generate trajectories using current prompt
- Evaluate each with reflection model (LLM-as-judge)
- Synthesize feedback and improve prompt
- Test improved prompt on same batch
- Accept if better; update Pareto frontier
Pareto Frontier: Maintains set of non-dominated solutions across all metrics
Best Selection: Returns prompt with highest overall score

Key Differences from Standard GEPA:

Evaluates on full conversational trajectories, not just final responses
Uses structured output (Zod schemas) for consistent metric scoring
Supports tool-calling agents with efficiency and tool accuracy metrics
Streams progress for real-time monitoring

3. Configuration

Optimization Settings (.dspyground/data/preferences.json):

optimizationModel: Model used for generating responses during optimization
reflectionModel: Model used for evaluation/judgment (should be more capable)
batchSize: Number of samples per iteration (default: 2)
numRollouts: Number of optimization iterations (default: 3)
selectedMetrics: Which dimensions to optimize for

Metrics Configuration (.dspyground/data/metrics-prompt.json):

Customize evaluation instructions and dimension descriptions
Adjust weights and criteria for each metric
Define how positive vs negative feedback is interpreted

4. Results & History

Optimized prompt saved to .dspyground/data/prompt.md
Run history stored in .dspyground/data/runs.json with:
- All candidate prompts (accepted and rejected)
- Scores and metrics for each iteration
- Sample IDs used during optimization
- Pareto frontier evolution
View in History tab: See score progression and prompt evolution

Additional Features

Structured Output Mode — Toggle between regular chat and structured output. Edit .dspyground/data/schema.json to define your output structure for data extraction, classification, and more.
Custom Tools — Import your tools in dspyground.config.ts. Works with any AI SDK tool from your existing codebase.
Sample Groups — Organize samples by use case or test category. Switch groups during optimization to test different scenarios.

Architecture

Frontend: Next.js with AI SDK (ai package)

Real-time streaming with useChat and useObject hooks
Server-sent events for optimization progress
shadcn/ui component library

Backend: Next.js API routes

/api/chat - Text and structured chat endpoints
/api/optimize - GEPA optimization with streaming progress
/api/samples, /api/runs - Data persistence
/api/metrics-prompt - Configurable metrics

Optimization Engine: TypeScript implementation

GEPA algorithm in src/app/api/optimize/route.ts
Reflection-based scoring in src/lib/metrics.ts

Local Data Files

All data is stored locally in your project:

.dspyground/data/prompt.md — Current optimized prompt
.dspyground/data/runs.json — Full optimization history with all runs
.dspyground/data/samples.json — Collected samples organized by groups
.dspyground/data/metrics-prompt.json — Configurable evaluation criteria
.dspyground/data/schema.json — JSON schema for structured output mode
.dspyground/data/preferences.json — User preferences and optimization config
dspyground.config.ts — Tools, prompts, and model configuration

Learn More

GEPA:

GEPA Optimizer — Genetic-Pareto optimization algorithm
DSPy Documentation — Prompt optimization framework
GEPA Paper — Academic research

AI SDK:

AI SDK — The AI Toolkit for TypeScript
AI SDK Docs — Streaming, tool calling, and structured output

About

Built by the team that built Langtrace AI and Zest AI.

License

Apache-2.0. See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 53 Commits
cli		cli
data		data
public		public
scripts		scripts
src		src
templates		templates
.env.example		.env.example
.gitignore		.gitignore
.npmignore		.npmignore
LICENSE		LICENSE
README.md		README.md
components.json		components.json
eslint.config.mjs		eslint.config.mjs
next.config.ts		next.config.ts
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
postcss.config.mjs		postcss.config.mjs
tsconfig.cli.json		tsconfig.cli.json
tsconfig.json		tsconfig.json
tsup.config.ts		tsup.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

DSPyground

Key Features

Quick Start

Prerequisites

Installation

Setup and Start

Configuration

Environment Setup

How It Works

1. Install and Port Your Agent

2. Chat and Sample Trajectories

3. Optimize

The Modified GEPA Algorithm

3. Configuration

4. Results & History

Additional Features

Architecture

Local Data Files

Learn More

About

License

About

Uh oh!

Releases

Packages

Languages

License

thorstone137/dspyground

Folders and files

Latest commit

History

Repository files navigation

DSPyground

Key Features

Quick Start

Prerequisites

Installation

Setup and Start

Configuration

Environment Setup

How It Works

1. Install and Port Your Agent

2. Chat and Sample Trajectories

3. Optimize

The Modified GEPA Algorithm

3. Configuration

4. Results & History

Additional Features

Architecture

Local Data Files

Learn More

About

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages