caecus-skill

A vision-capable Orchestrator reads your design. GLM 5.2 writes the code. A verify loop closes the gap.

caecus-skill is a collection of Agent Skills that turn a visual design (an image, a video, or an inspiration source) into working frontend code through a closed-loop generate-and-verify cycle.

Beta: v0.1.0-beta.6, installed from mihneaptu/caecus-skill. The current release is tracked in CHANGELOG.md.

Docs site: mihneaptu.github.io/caecus-skill. Guide, docs, and changelog in one place.

What's different

Closed-loop verification. The Orchestrator renders the result and compares it to your original design, then sends GLM design feedback until they match. Open-loop generators skip this check.
Hard role boundary. The Orchestrator sees and judges. GLM 5.2 writes every line of code. Neither crosses into the other's work.
Fidelity to the source. You measure the result against the real design you handed in.
More than images. Video and inspiration sources work too.

How it works

Design in. Hand the Orchestrator an image, video, or inspiration source.
Brief out. The Orchestrator writes a Design_Brief in plain language.
Code out. Paste the brief into GLM 5.2. It returns frontend code.
Render and compare. Run the code. The Orchestrator compares the rendered result to the original.
Iterate. If it does not match, the Orchestrator gives design-only feedback; GLM 5.2 regenerates. The loop runs until it matches or the default limit of five rounds is reached.

The rule: the Orchestrator talks about design. GLM 5.2 writes the code.

No API keys. No MCP. No CLI. You are the bridge between the two models.
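The loop above can be sketched in a few lines of Python. This is an illustration, not code shipped with the skills: `run_loop`, `LoopResult`, `write_code`, and `compare` are hypothetical names. The two callables stand in for the manual steps (pasting text into GLM 5.2, and the Orchestrator rendering and comparing). The sketch assumes GLM 5.2 keeps the brief in its session context, so later rounds send only the feedback.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_ROUNDS = 5  # the default round limit described above


@dataclass
class LoopResult:
    matched: bool  # True if the render matched the original design
    rounds: int    # number of generate/compare rounds actually run
    code: str      # the last code returned by the code author


def run_loop(
    brief: str,
    write_code: Callable[[str], str],
    compare: Callable[[str], Optional[str]],
    max_rounds: int = MAX_ROUNDS,
) -> LoopResult:
    """Run generate -> compare -> feedback until a match or the round limit.

    write_code: hypothetical stand-in for pasting text into GLM 5.2
                and copying the returned code back.
    compare:    hypothetical stand-in for the Orchestrator's render-and-compare
                step; returns None on a match, else design-only feedback text.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")

    message = brief  # round 1 sends the Design_Brief
    code = ""
    for round_number in range(1, max_rounds + 1):
        code = write_code(message)
        feedback = compare(code)
        if feedback is None:
            return LoopResult(matched=True, rounds=round_number, code=code)
        message = feedback  # later rounds send design feedback only
    return LoopResult(matched=False, rounds=max_rounds, code=code)
```

The role boundary shows up in the types: `compare` only ever returns design feedback text, and only `write_code` produces code.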

Skills in the collection

The first four run on the Orchestrator. The last runs on the code-author side (GLM 5.2).

| Skill | Side | What it does |
| --- | --- | --- |
| `image-to-prompt` | Orchestrator | Image / screenshot / mockup → `Design_Brief` |
| `video-to-prompt` | Orchestrator | Video → `Design_Brief` |
| `inspiration-to-prompt` | Orchestrator | Inspiration source → `Design_Brief` (inspiration, not a copy) |
| `generate-and-verify` | Orchestrator | Brief → code → render → screenshot → verify loop |
| `glm-codegen` | Code author | Teaches GLM 5.2 the brief format, fidelity rules, and iteration protocol |

Which skill should I use?

Start here: image-to-prompt + generate-and-verify. That is the whole loop: a still design in, verified code out.
Recreating a screen recording or a motion-heavy UI? Add video-to-prompt.
Building from a reference you admire rather than copying it 1:1? Add inspiration-to-prompt.
On the GLM 5.2 side, always load glm-codegen. It teaches GLM the brief format and keeps its output faithful. It is loaded on the code-author side, not with -s (see the FAQ).
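As a rough sketch of the choices above, the helper below builds the Orchestrator install command from the kinds of source you plan to use. `install_command` and the `"image"` / `"video"` / `"inspiration"` keys are hypothetical names chosen for this example. The skill names, the repository, and the `-s` flag are the ones this README uses. `glm-codegen` is left out on purpose because it is loaded on the GLM 5.2 side.

```python
# Maps a source kind (hypothetical labels) to the Orchestrator skill that reads it.
ORCHESTRATOR_SKILLS = {
    "image": "image-to-prompt",
    "video": "video-to-prompt",
    "inspiration": "inspiration-to-prompt",
}
VERIFY_SKILL = "generate-and-verify"  # always needed to close the loop
REPO = "mihneaptu/caecus-skill"


def install_command(source_kinds):
    """Return the `npx skills add` command for the given source kinds.

    Raises KeyError for a kind that has no matching skill.
    """
    skills = []
    for kind in source_kinds:
        skill = ORCHESTRATOR_SKILLS[kind]
        if skill not in skills:  # ignore duplicates, keep the given order
            skills.append(skill)
    skills.append(VERIFY_SKILL)
    return f"npx skills add {REPO} -s {' '.join(skills)}"
```

For the default still-image path, `install_command(["image"])` produces the same two-skill command shown under "Install the Orchestrator skills".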

Get started

What you need

A vision-capable Orchestrator with browser / screenshot tools. Recommended: GPT 5.5. The Orchestrator role is mostly agentic (drive a browser, render the code, screenshot it, run the feedback loop), and GPT 5.5 leads on computer use and autonomous task execution, which is exactly what that loop requires. Gemini 3.1 Pro is the strongest alternative: it leads on pure multimodal/vision work, so it edges ahead on the visual-compare lens itself. Claude Opus 4.8 also works well and tops the general intelligence rankings. Whatever you pick must be able to view images and to render and screenshot the result itself. The Orchestrator runs inside any agent host that loads Agent Skills (Claude Code, Cursor, Codex, and similar).
Access to GLM 5.2 via chat.z.ai, OpenCode, ZCode, OpenRouter, or a GLM Coding Plan.
Node.js 18+, so npx skills add runs.

Install the Orchestrator skills

npx skills add mihneaptu/caecus-skill -s image-to-prompt video-to-prompt inspiration-to-prompt generate-and-verify

Install only the ones you need:

npx skills add mihneaptu/caecus-skill -s image-to-prompt generate-and-verify

Set up GLM 5.2

glm-codegen is loaded on the GLM 5.2 side, not through the -s flag. Use the bundled system prompt at skills/glm-codegen/references/glm-system-prompt.md. One paste includes the skill plus the fidelity and anti-tell rules.

| Host | How to load `glm-codegen` |
| --- | --- |
| chat.z.ai | Paste `skills/glm-codegen/references/glm-system-prompt.md` as a system prompt once per session. |
| OpenCode / ZCode | Add `glm-system-prompt.md` as a skill in the GLM 5.2 agent config (or load `glm-codegen` and paste the bundle). |
| OpenRouter | Paste the bundle as a system prompt, or load it via your agent host's skill mechanism if available. |

Orchestrator hosts (quick notes)

| Host | Install skills | Tips |
| --- | --- | --- |
| Cursor | `npx skills add mihneaptu/caecus-skill -s image-to-prompt generate-and-verify` | Attach the source image each round; use browser/screenshot tools for renders. |
| Claude Code | Same `npx skills add` command | Save the original design path early; re-open it every compare turn. |
| Codex | Same `npx skills add` command | Fresh chat per design keeps the visual compare honest. |

Full walkthrough: site guide.

Starter prompt (copy into the Orchestrator)

I want to run the caecus pipeline on this design.

1. Use image-to-prompt on the attached image and write a complete Design_Brief.
2. I'll paste that brief to GLM 5.2; give me the brief only when it's ready.
3. When I bring back GLM's code, use generate-and-verify: render it, screenshot it,
   and compare against the original image. Feedback to GLM must be design words only.
4. Repeat until it matches or we hit 5 rounds.

Set up your workspace

Pick a folder for this run (your workspace root). Create three subfolders if they are not there yet:

your-workspace/
├─ design/        : original image + Design_Brief (Orchestrator only)
├─ glm-5.2/       : code GLM writes (you save it here verbatim)
└─ screenshots/   : renders the Orchestrator captures for compare

GLM 5.2 never sees images, only the brief and design-feedback text you paste.
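If you prefer to script the folder setup, a minimal sketch follows. `create_workspace` is a hypothetical helper, not part of the skills. The three folder names are the ones from the layout above.

```python
from pathlib import Path

# Subfolder names taken from the workspace layout above.
SUBFOLDERS = ("design", "glm-5.2", "screenshots")


def create_workspace(root):
    """Create the three run folders under `root` and return their paths.

    Folders that already exist are left untouched.
    """
    root_path = Path(root)
    created = []
    for name in SUBFOLDERS:
        folder = root_path / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to call more than once
        created.append(folder)
    return created
```

Calling `create_workspace("your-workspace")` from the directory where you want the run to live sets up all three folders. Running it again does nothing harmful.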

Run once

Give the Orchestrator your design (save the source in design/, or use one of the bundled reference images below).
Paste the Design_Brief from design/ into GLM 5.2.
Save GLM's code verbatim in glm-5.2/, run it, and give the Orchestrator a screenshot (saved in screenshots/).
The Orchestrator compares the original and the screenshot, then gives you design-only feedback text to paste back to GLM. Repeat until the match is good enough or you reach the five-round limit.

Try the bundled examples. Each ships a reference source image you can attach:

Minimal starter: three-icon-example.md + three-icon-source.jpg
Showcase (editorial landing): awwwards-landing-example.md + awwwards-landing-source.jpg (result: awwwards-landing-result.html)

Tested with

Cursor, Claude Code, Codex (Orchestrator hosts); chat.z.ai, OpenCode, ZCode, OpenRouter (GLM 5.2). Any vision-capable agent host that loads Agent Skills and can view images should work.

Security & trust

Installing a skill writes only Markdown and referenced assets. Nothing executes. No install hooks, no build step, no secrets required. The workflow is a human copying a brief and code between two chat models, so the skills themselves make no network calls and need no API keys.

FAQ

How is this different from "taste" or anti-slop design skills? Those add good taste to a single agent and stop there: open-loop, with no specific source to match. caecus reproduces a particular design you hand it and verifies the render against that original through a screenshot-and-compare loop. It holds the result to a real source you provide. You can still pair it with a taste skill on the GLM side.

Why does install use -s? Why not npx skills add mihneaptu/caecus-skill? The bare command installs every skill in the repo, including glm-codegen, which belongs on the GLM 5.2 side, not the Orchestrator. The -s flag installs only the Orchestrator skills and keeps that side clean. Load glm-codegen separately on GLM.

Does it work with React, Vue, or Svelte? The brief targets design intent, not one framework. Defaults are plain HTML/CSS for simple static designs and React + Tailwind for interactive ones; name any other stack in the brief's target_stack.
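The stack defaults described in that answer can be written as a small rule. This is only a sketch: `resolve_target_stack` and its parameters are hypothetical, and the real brief format is defined by the skills themselves. Only the `target_stack` field name and the two default stacks come from this README.

```python
from typing import Optional


def resolve_target_stack(interactive: bool, requested: Optional[str] = None) -> str:
    """Pick the value for the brief's target_stack field.

    An explicitly requested stack always wins; otherwise fall back to the
    defaults described above.
    """
    if requested and requested.strip():
        return requested.strip()
    return "React + Tailwind" if interactive else "plain HTML/CSS"
```

For example, `resolve_target_stack(interactive=True, requested="Svelte")` keeps the stack you named, while a simple static design with no request falls back to plain HTML/CSS.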

What is a SKILL.md? A portable Markdown instruction file that an agent loads automatically. Install it with npx skills add, copy it into your project, or paste it into a chat.

Does anything run on my machine when I install? No. Installing writes only Markdown and referenced assets: no scripts, no build, no secrets. See Security & trust above.

Feedback

This is a beta. Reports make it better.

Give beta feedback →

Contributing

This is an early beta and a solo project. If you have feedback or want to propose a skill, open an issue. Contribution guidelines will land as the project grows.

License

The collection is licensed under Apache License 2.0. See LICENSE. The license covers the code and content. It does not grant rights to the caecus name or brand.

The caecus name & brand

The caecus name and brand are protected separately from the license. Forks and derivatives are welcome under Apache-2.0, but they must not imply they are the official project.
