Stage 1 of your AI build pipeline · MIT · open source

In 2026, building software is trivial.
Defending it is hard.

Worth Building runs your next idea through a rigorous Socratic pressure-test — the same one a YC partner or skeptical board member would apply — and returns one of four verdicts, with the reasoning behind it, inside the AI coding agent you already use.

claude-code · worth-building
/qualify ai scribe for physical therapy clinics
↳ Socratic intake · research · 10 qualification + 7 defensibility dims
Verdict: Validate first — qualification 62/100, defensibility 54/100.
↳ Designed $500 / 2-week landing-page test. First action this week: 3 named clinic owners by Fri.
Works with Claude Code · Cursor · Codex · Gemini CLI · Windsurf. No hosted service, no signup, your ideas stay on your disk.
Four outcomes, never a shrug

Every idea ends in one decisive call.

No maybes. No "interesting, keep exploring." Each verdict comes with explicit reasoning and one concrete action you can take this week.

Build now

Real wedge, defensible moat, team that can actually win. Go.

Validate first

Promising — but one assumption needs a cheap experiment. The tool designs it.

Park

Interesting — but the conditions aren't there yet. Archive with learnings.

No-go

Structural flaw. Save yourself the quarter.

What you actually get

Not a report. A working decision file.

Every qualified idea lands on your disk as a folder of Markdown — citeable, editable, portable across AI tools.

A one-page decision memo

decision.md with the verdict, explicit reasoning, red flags, and the one concrete action your team can take before Friday.

A designed experiment

For Validate-first verdicts: a riskiest-assumption test with pre-committed pass/fail numbers, typically ~$500 / 2 weeks. Cheaper than a week of building.

A portable AI handoff bundle

Run scripts/export-bundle and get a self-contained folder any agent — Claude, ChatGPT, Cursor, Gemini, Codex — can pick up and execute without re-qualifying.
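
For reference, a bundle might land on disk like this. Only decision.md is a filename the tool documents; the experiment and research filenames below are hypothetical placeholders:

    ai-scribe-pt-clinics/        folder named after the idea
      decision.md                verdict, reasoning, red flags, this week's action
      experiment.md              the designed test, for Validate-first verdicts (hypothetical name)
      research.md                competitors, demand signals, cited sources (hypothetical name)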

How it works

A Socratic wizard that hunts for the fatal flaw.

Defaults to skepticism until the evidence earns praise. Scores team fit as a first-class dimension — same idea scores differently for different teams.

Intake

Socratic wizard, one question at a time. Problem · buyer · wedge · moat.

Research

Only what changes the decision. Competitors, demand signals, sources cited.

Score

10 qualification + 7 defensibility dimensions, each with reasoning.

Council · borderline only

6 personas evaluate independently and peer-rank each other's evaluations blind; a chairman synthesizes (sketched after these steps).

Export

Verdict + tool-agnostic handoff bundle ready for any downstream AI.
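
Roughly how the council step could work: each persona writes an evaluation, every persona ranks all the evaluations without seeing authorship, and the chairman weighs each evaluation by its blind consensus rank before synthesizing. A minimal Python sketch assuming mean-rank aggregation; the actual mechanics are undocumented, and every name here is illustrative:

    from statistics import mean

    def chairman_weights(peer_ranks: dict[str, list[int]]) -> dict[str, float]:
        """peer_ranks maps an anonymized evaluation id to the ranks
        (1 = best) it received from the other personas, blind to
        authorship. Lower mean rank earns a higher synthesis weight."""
        mean_rank = {eid: mean(ranks) for eid, ranks in peer_ranks.items()}
        total = sum(1 / r for r in mean_rank.values())
        return {eid: (1 / r) / total for eid, r in mean_rank.items()}

    # chairman_weights({"eval-a": [1, 2, 1, 2, 1], "eval-b": [2, 1, 2, 1, 2]})
    # → roughly {"eval-a": 0.53, "eval-b": 0.47}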

The rubric, not the vibe

17 dimensions. Two gates. One honest verdict.

Build-now bar: qualification ≥ 70, defensibility ≥ 60, no blockers in team fit, capital, validation cost, platform dependency, or regulatory readiness.
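
As code, the bar reduces to one boolean. The thresholds are the ones stated above; the parameter names and blocker representation are illustrative assumptions, not the tool's schema (a minimal Python sketch):

    def clears_build_now_bar(
        qualification: int,   # 0-100 total across the 10 dimensions below
        defensibility: int,   # 0-100 total across the 7 dimensions below
        blockers: set[str],   # flagged blocker dimensions, e.g. {"team_fit"}
    ) -> bool:
        """Build now only when both score gates pass and nothing is blocked."""
        return qualification >= 70 and defensibility >= 60 and not blockers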

10 qualification dimensions

Is this worth pursuing now? Max 100.

Problem severity & urgency
Buyer clarity, budget, switching
Beachhead & distribution realism
Competitive advantage & moat path
Workflow embedding & retention
Revenue quality & capital efficiency
Founder/team right to win
Timing & market pull
Validation speed & cost
Dependency / regulatory / ops complexity
7 defensibility dimensions

What becomes hard to copy after launch? Max 100.

Proprietary data & context moats
Workflow embedding & switching costs
Trust, compliance & governance
Vertical integration & niche depth
Network effects & community
Iteration & learning velocity
Hardware integration

Bring your own AI coding agent.

Worth Building is just Markdown rules; any agent that reads AGENTS.md can execute them.
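
For flavor, the rules read like the pipeline above rendered as instructions. This excerpt is illustrative, not the actual file:

    ## /qualify
    1. Run the Socratic intake one question at a time: problem, buyer, wedge, moat.
    2. Research only what changes the decision; cite every source.
    3. Score 10 qualification and 7 defensibility dimensions, with reasoning for each.
    4. On borderline scores, convene the council.
    5. Write decision.md with the verdict and one concrete action for this week.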

Claude Code · Cursor · Codex CLI · Gemini CLI · Windsurf · Aider · Cline
30 minutes now · 3 months saved

Stop building the wrong thing.

If you're about to commit three months to an idea, spend thirty minutes running /qualify first.