A Ralph Wiggum Loop is an unsupervised agentic coding loop you can run with any frontier model. It was originally designed with Claude Code CLI in mind, and that's what this starter kit uses. (You can adapt this pattern for other agentic coding CLIs; see the footnote.)
Neither a framework nor an agent, a Ralph Wiggum Loop is a bash script and prompt template that uses a specific way of structuring engineering projects so an agent can execute autonomously, one task at a time, with a fresh context window each iteration.
The best way to think about a Ralph Wiggum Loop is that it's like 50 First Dates, if Drew Barrymore had been a software engineer. Every time the loop runs, the engineer wakes up with no memory. It only has a prompt and two markdown files to tell it what to do today. When it finishes its task, it goes to bed, only to wake up with amnesia anew the next morning (loop). (Maybe it should be called "50 First Loops"?)
So for each iteration, Claude Code:
- Reads the spec.md and implementation-plan.md files
- Picks the highest-priority unchecked task
- Implements the task
- Marks it done in implementation-plan.md
- Exits
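The steps above can be sketched as a small bash loop. This is a minimal illustration, not the actual ralph.sh: `AGENT_CMD`, `remaining_tasks`, and `ralph_loop` are hypothetical names, the plan is assumed to use `- [ ]` checkboxes, and tool-permission flags and rate-limit handling are left out.

```bash
#!/usr/bin/env bash
# Minimal sketch of the loop; not the actual ralph.sh.
# Assumptions: the plan uses "- [ ]" checkboxes, and AGENT_CMD is a
# hypothetical variable holding the headless command to run.

PLAN="${PLAN:-implementation-plan.md}"
AGENT_CMD="${AGENT_CMD:-claude -p}"

# Count tasks that are still unchecked in the plan.
remaining_tasks() {
  grep -c '^[[:space:]]*- \[ ]' "$1" || true
}

# Run one fresh agent process per iteration until the plan is done.
ralph_loop() {
  local max_iterations="${1:-50}" i
  for ((i = 1; i <= max_iterations; i++)); do
    if [ "$(remaining_tasks "$PLAN")" -eq 0 ]; then
      echo "All tasks checked off after $((i - 1)) iteration(s)."
      return 0
    fi
    echo "Iteration $i: $(remaining_tasks "$PLAN") task(s) left"
    # Each call is a new process, so the agent starts with no memory.
    $AGENT_CMD "$(cat prompt.md)" || echo "Agent exited non-zero; continuing."
  done
  echo "Stopped after $max_iterations iterations."
  return 1
}
```

Calling `ralph_loop 20` would cap the run at 20 iterations. The cap is an added safety net in this sketch, so a task that never gets checked off cannot loop forever.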
Watch this explainer for the best breakdown of how it works and how it helps "keep your agent smart" by clearing out the context window on every iteration.
- Not token-efficient. Each iteration re-reads the spec from scratch, and every parallel loop you add multiplies usage again.
- Quality vs. attention tradeoff. You're trading some code quality for reduced oversight.
- Big specs cause context rot. Keep spec + plan as brief as possible. If a task needs too much context, break it down.
- One bad test poisons future loops. Red/green TDD mitigates this, but doesn't eliminate it.
- Speccing is hard. If you don't know exactly what you want, try exploration mode first — spend 5 minutes brain-dumping, let Claude write a rough plan, run Ralph overnight, and use the output to inform a proper spec.
This starter combines Test Driven Development (TDD) with Claude Code in headless mode (using the -p flag). On each iteration, Claude Code:
- Reads the spec.md and implementation-plan.md files
- Picks the highest-priority unchecked task
- Writes a failing test first (red)
- Implements until it passes (green)
- Runs the test suite
- Corrects any issues
- Marks the task done in implementation-plan.md
- Exits
- Changes are committed to git by the script, not the agent. This makes it easier to see what the agent implemented and to roll things back, without giving the agent too much control
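Committing from the script rather than from the agent could look roughly like this. It is a sketch under assumptions: `commit_iteration` and its commit message format are illustrative and may not match what ralph.sh actually does.

```bash
# Sketch: commit whatever the agent changed once it has exited.
# Assumption: commit_iteration and its message format are illustrative,
# not the names used in ralph.sh.
commit_iteration() {
  local iteration="$1"
  # Nothing to commit if the working tree is clean.
  if [ -z "$(git status --porcelain)" ]; then
    echo "Iteration $iteration: no changes"
    return 0
  fi
  git add -A
  git commit --quiet -m "ralph: iteration $iteration"
}
```

One commit per iteration keeps `git log` readable as a record of the run, and each iteration can be reverted on its own.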
On the next iteration, Claude Code starts with a fresh context window, no memory of the previous loop, and reads the spec again. Instead of cramming everything into one long session where performance degrades (the "dumb zone" after ~100k tokens), each iteration gets a clean slate with plenty of context for solving one problem at a time. The spec and implementation plan are the source of truth, not previous context.
When Claude can't complete a task:
- First attempt fails → rolls back with `git checkout`, adds a ⚠️ note explaining what went wrong
- Second and third attempts → same: rollback and another note
- After 3 failures → Claude skips the task and moves on
- Cleanup task (second-to-last) → retries stuck tasks one more time, writes `STUCK.md` if they still fail
- README task (last) → documents the project as-built, noting incomplete features
The script also stops if all remaining tasks have 3+ failures, since looping further would waste tokens.
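The "all remaining tasks have 3+ failures" check might be implemented along these lines. This is a sketch only: it assumes each failed attempt adds one indented line containing ⚠ directly under its task, which may not match the real note format, and `all_remaining_stuck` is a hypothetical name.

```bash
# Sketch: decide whether every remaining task has failed 3+ times.
# Assumption: each failed attempt adds one indented line containing ⚠
# directly under its "- [ ]" task. The real note format may differ.
all_remaining_stuck() {
  awk -v max="${2:-3}" '
    function finish_task() {
      if (counting) { total++; if (notes >= max) stuck++ }
    }
    /^- \[[ xX]]/ {                 # a new task line ends the previous one
      finish_task()
      counting = ($0 ~ /^- \[ ]/)   # only unchecked tasks are remaining
      notes = 0
      next
    }
    counting && /⚠/ { notes++ }
    END {
      finish_task()
      exit !(total > 0 && stuck == total)
    }
  ' "$1"
}
```

The function returns success only when at least one unchecked task exists and every unchecked task has reached the failure limit, so a loop could use `if all_remaining_stuck implementation-plan.md; then break; fi`.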
All configuration is via environment variables:
| Variable | Default | Description |
|---|---|---|
| `ON_LIMIT` | `overnight` | Rate limit behaviour: `overnight`, `wait`, or `stop` |
| `WAKE_HOUR` | `10` | Hour (24h format) after which overnight mode stops |
| `ALLOWED_TOOLS` | (Node.js defaults) | Tools Claude is allowed to use (see below) |
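Reading these settings with fallbacks is a standard bash idiom. The sketch below mirrors the defaults from the table above; the `validate_config` helper is an added assumption and is not known to exist in ralph.sh.

```bash
# Sketch: read settings from the environment, falling back to defaults.
# Defaults mirror the table above; validate_config is a hypothetical helper.
ON_LIMIT="${ON_LIMIT:-overnight}"
WAKE_HOUR="${WAKE_HOUR:-10}"

validate_config() {
  case "$ON_LIMIT" in
    overnight|wait|stop) ;;
    *) echo "ON_LIMIT must be overnight, wait, or stop" >&2; return 1 ;;
  esac
  case "$WAKE_HOUR" in
    ''|*[!0-9]*) echo "WAKE_HOUR must be a number" >&2; return 1 ;;
  esac
  [ "$WAKE_HOUR" -le 23 ] || { echo "WAKE_HOUR must be 0-23" >&2; return 1; }
}
```

With this pattern, a one-off override would look like `WAKE_HOUR=8 ON_LIMIT=wait ./ralph.sh`, leaving the defaults untouched for the next run.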
- macOS or Linux
- Git: for auto-commits after each task
- Node.js 22+: `brew install node`
- Claude Code: `npm install -g @anthropic-ai/claude-code`
- A Claude plan: Max ($100-200/month) for daily use, Pro ($20/month) for overnight exploration runs
Clone this repo into your project (or just copy ralph.sh and prompt.md):
```bash
git clone https://github.com/nearestnabors/ralph-wiggum-loop-starter.git
cd ralph-wiggum-loop-starter
```
Planning is everything. If you skimp on this step, you're going to get slop. Errors cascade across every iteration.
So get a hot cup of caffeine and put on some lofi focus beats. You're going to be talking to Claude for a while!
In the same folder as prompt.md and ralph.sh, open Claude Code in your terminal of choice:
```bash
claude
```
Have a conversation where you and Claude ask each other questions until you're both aligned:
```
I want to build [describe your project]. Before we start, I need you to:
1. Ask me clarifying questions about anything that's ambiguous
2. Tell me what assumptions you're making so I can confirm or correct them
3. Once we're aligned, write a spec.md outlining the tech stack and specifics and implementation-plan.md, listing the tasks to be done. Be thorough and as descriptive as possible, like you're giving instructions to a team of junior engineers who can be trusted to execute but not to strategize. Make note of anything we're NOT doing, specifically. Externalize internal assumptions!
Remember to end any task list with:
- [ ] Revisit any stuck/unfinished tasks above. If still stuck, write STUCK.md describing what was tried and why it failed, then mark this task as done so you can move on
- [ ] Write/update README.md with setup instructions, dependencies, configuration, and how to run the project
```
Go through multiple rounds. Get another cup of caffeine. Don't rush this. I like to follow up in a fresh Claude Code conversation with:
```
Review spec.md and implementation-plan.md with a critical eye. Do they still need work? Are they missing anything? List your concerns and confidence level.
```
This will leave you with two new files:
| File | What it does |
|---|---|
| `spec.md` | Describes your project: architecture, tech stack, conventions, directory structure. Claude reads this at the start of every iteration. |
| `implementation-plan.md` | A checklist of tasks using `- [ ]` markdown checkboxes. Claude picks one per iteration, does it, and checks it off. |
After the conversation, Claude produces spec.md and implementation-plan.md. The implementation plan should be a checklist:
```markdown
# Implementation Plan
- [ ] Set up project structure and dependencies
- [ ] Create database schema for users table
- [ ] ...
- [ ] Revisit any stuck/unfinished tasks above. If still stuck, write STUCK.md describing what was tried and why it failed, then mark this task as done
- [ ] Write/update README.md with setup instructions, dependencies, configuration, and how to run the project
```
The last two tasks are important:
- the cleanup task retries anything that got stuck and writes a report
- the README task documents whatever actually got built
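Before starting a run, a quick check that the plan really ends with those two tasks can save a wasted night. The sketch below is illustrative: `check_plan_tail` is a hypothetical helper, and it assumes the cleanup task mentions STUCK.md and the final task mentions README.md, as in the checklist shown earlier.

```bash
# Sketch: check that the plan ends with the cleanup and README tasks.
# Assumption: check_plan_tail is a hypothetical helper; it only looks for
# the file names STUCK.md and README.md in the last two checkbox lines.
check_plan_tail() {
  local plan="${1:-implementation-plan.md}" tasks
  tasks="$(grep '^- \[[ xX]]' "$plan" | tail -n 2)"
  if ! printf '%s\n' "$tasks" | sed -n '1p' | grep -q 'STUCK\.md'; then
    echo "second-to-last task should be the STUCK.md cleanup task" >&2
    return 1
  fi
  if ! printf '%s\n' "$tasks" | sed -n '2p' | grep -q 'README\.md'; then
    echo "last task should be the README.md task" >&2
    return 1
  fi
}
```

Running `check_plan_tail implementation-plan.md && ./ralph.sh` would refuse to start the loop when either task is missing or out of order.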
Read both documents completely. If you don't understand or disagree with something, edit it and/or talk it out with Claude now. Once Ralph starts running, each iteration builds on the previous one, compounding assumptions and bad ideas.
Edit the "Project Context" section in prompt.md to match your project's language, test framework, and conventions. See example/prompt.md for a template.
You can also prompt Claude:
```
Update the Project Context section of prompt.md to match what we agreed to in spec.md.
```
Visit skills.sh and install the skills the agent needs to work according to spec.md.
Reference any available MCP tools in prompt.md so the agent knows they're there.
```bash
git init && git add -A && git commit -m "initial plan"
chmod +x ralph.sh
./ralph.sh
```
Watch the first 2–3 iterations. You're looking for:
- Is it picking tasks in a sensible order?
- Is it following the spec's conventions?
- Are the tests meaningful (not just `expect(true).toBe(true)`)?
- Is it staying focused on one task per iteration?
If it goes off track: Ctrl+C, edit spec.md or implementation-plan.md, and restart.
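Part of the "are the tests meaningful" check can be automated with a crude search for always-true assertions. This is a sketch with assumptions: `find_trivial_tests` is a hypothetical helper, the two patterns are examples rather than a complete list, and test files are assumed to be named `*.test.*` or `*.spec.*`.

```bash
# Sketch: flag test files containing always-true assertions.
# Assumptions: the pattern list is illustrative and far from exhaustive,
# and test files follow the *.test.* / *.spec.* naming convention.
find_trivial_tests() {
  # /dev/null is passed as an extra file so grep always prints file names.
  find "${1:-.}" -type f \( -name '*.test.*' -o -name '*.spec.*' \) \
    ! -path '*/node_modules/*' \
    -exec grep -nF -e 'expect(true).toBe(true)' -e 'expect(1).toBe(1)' /dev/null {} +
}
```

An empty result does not prove the tests are good; it only rules out the laziest failure mode, so reading a few tests by hand is still worthwhile.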
| File | Purpose |
|---|---|
| `ralph.sh` | The loop script |
| `spec.md` | Architecture, conventions, constraints (you write this) |
| `implementation-plan.md` | Task checklist (you write this) |
| `prompt.md` | Instructions for Claude each iteration (customise from example) |
| `ralph-logs/` | Timestamped logs from each iteration (auto-created) |
| `DONE` | Created by Claude when all tasks are complete |
| `STUCK.md` | Report on tasks that couldn't be completed |
Claude's usage limits reset on a rolling 5-hour window from your last session, not at a fixed time. The script detects rate limits and responds based on the ON_LIMIT setting:
| Mode | Behaviour |
|---|---|
| `overnight` (default) | Sleeps until reset if it's before your `WAKE_HOUR`. Stops if the reset would be after, so your morning tokens are preserved. |
| `wait` | Always sleeps until reset, no matter what time. |
| `stop` | Exits immediately. |
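The decision in the table above reduces to a small function. The sketch below works on Unix timestamps to sidestep midnight wrap-around; `limit_action` is a hypothetical name, and it assumes the reset time and the next wake time have already been computed elsewhere (parsing the CLI's rate-limit message is not shown).

```bash
# Sketch: choose what to do when a rate limit is hit.
# Assumptions: limit_action is a hypothetical helper; reset_epoch and
# wake_epoch are Unix timestamps computed by the caller.
limit_action() {
  local mode="$1" reset_epoch="$2" wake_epoch="$3"
  case "$mode" in
    wait) echo sleep ;;
    stop) echo stop ;;
    overnight)
      # Sleep only if the limit resets before the user wakes up.
      if [ "$reset_epoch" -lt "$wake_epoch" ]; then
        echo sleep
      else
        echo stop
      fi
      ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}
```

Keeping the decision separate from the sleeping and parsing makes each mode easy to test without waiting for a real rate limit.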
Update these variables in ralph.sh to reflect your usage:
```bash
WAKE_HOUR=8    # stop if reset is after 8am
ON_LIMIT=wait  # always sleep until reset
```
```bash
# tmux keeps it running if your terminal disconnects
brew install tmux
tmux new -s ralph
./ralph.sh
# Detach: Ctrl+B, then D
# Reattach: tmux attach -t ralph
```
```bash
git log --oneline   # see what Ralph did
npm test            # run the tests yourself
git show <hash>     # review a specific iteration
git revert <hash>   # undo a bad iteration
git bisect start    # find which iteration broke something
```
If STUCK.md exists, read it to learn what couldn't be completed and why.
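For a quick overview before digging into individual commits, a small summary helper can report how far the run got. This is a sketch: `plan_summary` is a hypothetical function, and it relies only on the checkbox format and the DONE and STUCK.md files described in this README.

```bash
# Sketch: summarize where the run ended up.
# Assumption: plan_summary is a hypothetical helper, not part of ralph.sh.
plan_summary() {
  local dir="${1:-.}" plan done_count open_count
  plan="$dir/implementation-plan.md"
  done_count="$(grep -c '^- \[[xX]]' "$plan" || true)"
  open_count="$(grep -c '^- \[ ]' "$plan" || true)"
  echo "done=$done_count open=$open_count"
  [ -f "$dir/DONE" ] && echo "DONE marker present"
  [ -f "$dir/STUCK.md" ] && echo "STUCK.md present: read it first"
  return 0
}
```

Running `plan_summary .` from the project root prints one line of counts plus a note for each marker file it finds.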
There's nothing Claude-specific about this pattern. The core loop is just a bash script calling a CLI tool in headless mode. But this example is tightly coupled to Claude Code because of the `claude -p` command and `--allowedTools` flag syntax.
Other agentic coding CLIs that support a similar headless mode:
- Codex CLI (OpenAI): `codex exec "prompt"` with `--sandbox` flags
- Gemini CLI (Google): has a non-interactive mode
- aider: `aider --message "prompt"` for one-shot runs
- Cursor CLI: more IDE-oriented but has some headless capability
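The headless commands listed above could sit behind one wrapper function so the rest of a loop script stays the same. This is a sketch only: `run_agent` and the `AGENT` variable are hypothetical, and the flags for tool permissions, sandboxing, and model choice are omitted because they differ per CLI. Gemini CLI and Cursor CLI are left out since no exact command is given for them here.

```bash
# Sketch: one wrapper around the headless commands listed above.
# Assumptions: run_agent and AGENT are hypothetical names; per-CLI flags
# for permissions, sandboxing, and model selection are omitted.
run_agent() {
  local prompt="$1"
  case "${AGENT:-claude}" in
    claude) claude -p "$prompt" ;;
    codex)  codex exec "$prompt" ;;
    aider)  aider --message "$prompt" ;;
    *) echo "unsupported agent: $AGENT" >&2; return 1 ;;
  esac
}
```

A loop would then call `run_agent "$(cat prompt.md)"` and select the CLI with something like `AGENT=codex ./ralph.sh`.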
The loop would work with any of these in principle if you swap the `claude -p` line for the equivalent command. But each has different flag syntax for tool permissions, different rate limit messages, and different model selection. Fork this repo and hand it to your CLI of choice with this prompt:
```
Rewrite `ralph.sh` and `example/*` to work with this CLI.
```
Each `claude -p` invocation can target a different model with `--model`. This is optional, and Sonnet is a solid default.
| Model | Flag | Best for |
|---|---|---|
| Haiku 4.5 | `--model claude-haiku-4-5-20251001` | Fast implementation, simple tasks |
| Sonnet 4.5 | `--model claude-sonnet-4-5-20250929` | General-purpose workhorse |
| Opus 4.6 | `--model claude-opus-4-6` | Complex reasoning, review, architecture |
Claude Code also has a built-in opusplan mode (--model opusplan) that uses Opus for planning and Sonnet for execution, a good middle ground.
To use a specific model, either edit the claude -p line in ralph.sh or set ANTHROPIC_MODEL in your environment:
```bash
export ANTHROPIC_MODEL=claude-haiku-4-5-20251001
```
If you make a version that works well for your CLI, or add features that make it work even better, feel free to submit a PR!
- Thanks to Roman for his epic explanation that started me on this journey.
- Thanks to Simon Willison for the red/green TDD and "first run the tests" patterns from his Agentic Engineering Patterns. (Mandatory reading if you get into this way of coding.)
- And thanks to Geoffrey Huntley for coming up with the idea of a Ralph loop in the first place.
MIT license!