Methodology

How a hypothesis moves through the engine, from admission to verdict.

Pipeline

Every hypothesis passes through five phases. Each phase has stop conditions and explicit kill criteria.

1. Genesis

An LLM proposer, working from a corpus of seed concepts, drafts a candidate hypothesis. A second LLM critic reviews it for structural defects (banned audience patterns, seller-side incentive misalignment, undefined buyer, vague resolution criteria). Most candidates are rejected at this stage. Survivors are admitted to the board.
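The admission gate above can be sketched as a simple predicate. This is a hypothetical stand-in for the LLM critic: the field names (`banned_audience`, `seller_misaligned`, `buyer`, `resolution_criteria`) are assumptions, not the engine's real schema.

```python
# Hypothetical sketch of the genesis gate: reject any candidate with a
# listed structural defect. Field names are assumptions standing in for
# the LLM critic's actual checks.
REQUIRED_FIELDS = ("buyer", "resolution_criteria")

def admit(candidate: dict) -> bool:
    """Admit only candidates with no structural defects."""
    # Defect flags: banned audience pattern or seller-side misalignment.
    if candidate.get("banned_audience") or candidate.get("seller_misaligned"):
        return False
    # A defined buyer and concrete resolution criteria must be present.
    return all(candidate.get(field) for field in REQUIRED_FIELDS)
```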

2. Argument

Three structured analyses run on the admitted hypothesis: steelman (the strongest case for the product), red team kill (the strongest case against it), and audience simulation (does a representative buyer actually want this?).
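The three analyses can be pictured as three prompts run against the same hypothesis. This is a minimal sketch: the prompt wording and the `run_llm` callable are assumptions, not the engine's real interface.

```python
# Hypothetical sketch of the argument phase: run the three structured
# analyses and collect each result. Prompts are illustrative; run_llm
# is an assumed callable(prompt, hypothesis) -> str, not a real API.
ANALYSES = {
    "steelman": "Make the strongest case FOR this product.",
    "red_team_kill": "Make the strongest case AGAINST this product.",
    "audience_sim": "As a representative buyer, do you want this?",
}

def run_arguments(hypothesis: str, run_llm) -> dict[str, str]:
    """Return one analysis result per structured prompt."""
    return {name: run_llm(prompt, hypothesis) for name, prompt in ANALYSES.items()}
```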

3. Evidence hunt

An agentic web search hunts for three signal types: primary sources (regulator filings, peer-reviewed research, government data), competitor products with documented gaps, and demand proxies (forum discussions, GitHub issues, news coverage). The hypothesis must collect at least two independent signals to advance.
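The advancement gate can be sketched as a count of independent signals. Treating "independent" as "distinct sources" is an assumption, as is the record shape; the engine's real de-duplication logic is not described in the text.

```python
# Hypothetical sketch of the evidence-hunt gate: a hypothesis advances
# once it holds at least 2 independent signals. Independence is modeled
# here as distinct sources (an assumption); the record shape is invented.
SIGNAL_TYPES = {"primary_source", "competitor_gap", "demand_proxy"}

def advances(signals: list[dict]) -> bool:
    """True once >= 2 signals of recognized types come from distinct sources."""
    independent = {s["source"] for s in signals if s["type"] in SIGNAL_TYPES}
    return len(independent) >= 2
```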

4. Filter scoring

Two LLM advocates argue for and against the hypothesis on each of the five filter axes. The debate runs three full times; the median score on each axis is taken across the runs, and the five medians are summed into a composite score. Graduation bar: composite ≥ 9.0 / 15.
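The median-of-three aggregation above can be sketched in a few lines. The axis names and the implied 0–3 per-axis scale (five axes summing to a 15-point composite) are assumptions drawn from the graduation bar.

```python
from statistics import median

# Hypothetical sketch of filter scoring: three full runs each produce a
# score per axis; the per-axis medians are summed into the composite.
# Axis names and the 0-3 scale are assumptions (5 axes x 3 = 15 max).
AXES = ["data_moat", "ten_x_model", "fast_feedback", "solo_feasible", "provider_proof"]

def composite(runs: list[dict[str, float]]) -> float:
    """Sum of the median score on each axis across all runs."""
    return sum(median(run[axis] for run in runs) for axis in AXES)

def graduates(runs: list[dict[str, float]], bar: float = 9.0) -> bool:
    """Graduation bar from the text: composite >= 9.0 out of 15."""
    return composite(runs) >= bar
```

Taking the median per axis (rather than averaging whole runs) keeps one outlier run from dragging a single axis's score.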

5. Verdict

A council of three frontier models reviews the full dossier and reaches one of four verdicts: escalate, graduate, kill, or need more signal. If the council cannot converge within three rounds, the dossier escalates to Commander review.
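The council loop can be sketched as follows. Treating "converge" as a unanimous round is an assumption; the text does not say whether majority votes count.

```python
# Hypothetical sketch of the verdict step: each council model casts one
# vote per round. Unanimity is assumed to mean "converged"; a council
# that has not converged after three rounds escalates to Commander review.
def council_verdict(rounds: list[list[str]]) -> str:
    """rounds: one list of the three models' votes per deliberation round."""
    for votes in rounds[:3]:               # the council gets three rounds
        if len(set(votes)) == 1:           # unanimous: verdict reached
            return votes[0]
    return "escalate"                      # no convergence: Commander review
```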

The five filter axes

| Axis | Question it answers | Why it matters |
| --- | --- | --- |
| Data moat | Does this product accumulate proprietary data? | Without a data moat, a better model resets the playing field every release. |
| 10x model test | Does a better model make this MORE valuable? | If yes, you are on a defensible layer. If no, you are middleware in a price war. |
| Fast feedback loops | Are outputs verifiable against reality in <30 days? | Without fast grading, you cannot tell if you are improving, or simply wrong. |
| Solo founder feasible | Buildable by one person without a team? | Reduces capital, time-to-product, and key-person risk. |
| AI providers can't eat it | Do hyperscalers have a reason NOT to build this? | If a hyperscaler can absorb your wedge in a roadmap update, you do not have one. |

What this engine does NOT do