Forecast QA Gate for Niche Research Publishers
graduated [B] · filter 11.5/15 · spread ±1.0 · signals: 2 independent
What is this?
A pre-publication quality gate for niche research publishers, independent analysts, and paid newsletter operators who publish claims about markets, companies, policy, or technology with outcomes that resolve in days to weeks. Authors submit a draft note plus cited sources; the system extracts concrete claims, pressure-tests them using AE's six-pattern autopsy taxonomy, and forces explicit resolution criteria, time windows, and transmission assumptions before publication. After release, claims are automatically graded against public evidence on the declared schedule, creating a living accuracy ledger for each publication and analyst. This is not generic editing or RAG: the product's value is turning fuzzy analyst prose into auditable, reality-graded claims and showing where failures came from via the taxonomy. Buyers are those who monetize trust directly through subscriptions or premium research and can use a public or client-shared track record as a commercial asset. The loop is fast enough when focused on short-horizon claims, and the output compounds into differentiated credibility rather than one-off copy polish.
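As a minimal sketch of what one auditable claim record and its accuracy ledger could look like, assuming a simple in-memory representation: every name below (`Claim`, `missing_fields`, `due_for_grading`, `accuracy`) and the sample claims are hypothetical illustrations, not the product's actual schema. The six-pattern taxonomy is left out because its patterns are not listed here.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Claim:
    """One extracted claim. All field names are hypothetical."""
    text: str
    resolution_criteria: str = ""      # what public evidence settles the claim
    resolve_by: Optional[date] = None  # end of the declared time window
    transmission_assumption: str = ""  # assumed mechanism from cause to outcome
    outcome: Optional[bool] = None     # None until graded after publication


def missing_fields(claim: Claim) -> List[str]:
    """Return the fields the gate would require before publication."""
    missing = []
    if not claim.resolution_criteria.strip():
        missing.append("resolution_criteria")
    if claim.resolve_by is None:
        missing.append("resolve_by")
    if not claim.transmission_assumption.strip():
        missing.append("transmission_assumption")
    return missing


def due_for_grading(claims: List[Claim], today: date) -> List[Claim]:
    """Ungraded claims whose declared window has closed."""
    return [
        c for c in claims
        if c.outcome is None and c.resolve_by is not None and c.resolve_by <= today
    ]


def accuracy(claims: List[Claim]) -> Optional[float]:
    """Share of graded claims that resolved true; None if nothing is graded."""
    graded = [c for c in claims if c.outcome is not None]
    if not graded:
        return None
    return sum(1 for c in graded if c.outcome) / len(graded)


# Invented example data, for illustration only.
fuzzy = Claim(text="Rates will probably fall soon.")
sharp = Claim(
    text="The central bank cuts its policy rate at the next meeting.",
    resolution_criteria="Official rate decision published by the bank.",
    resolve_by=date(2026, 5, 15),
    transmission_assumption="Softer inflation prints shift the committee vote.",
)

print(missing_fields(fuzzy))  # the gate would block this draft claim
print(missing_fields(sharp))  # nothing missing, so this claim could be published
print(len(due_for_grading([sharp], date(2026, 5, 16))))  # window has closed
sharp.outcome = True          # graded against public evidence
print(accuracy([fuzzy, sharp]))  # only graded claims count toward the ledger
```

Keeping `outcome` as `None` until grading separates "not yet resolved" from "resolved false", so unresolved claims never dilute or inflate an analyst's ledger.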
Why did we consider it?
A forecast QA gate for niche research publishers is a credible wedge because it converts analyst trust into a measurable asset, fits a growing publishing market, and offers a differentiated capability generic AI tools do not: pre-publication falsifiability plus post-publication reality grading.
What breaks?
- Incentive misalignment: Analysts monetize confident narratives and selective memory; an objective 'accuracy ledger' is a commercial liability that exposes them to subscriber churn.
- Market priorities: Publishers are fighting existential threats from AI scraping and paywall fatigue (e.g., Nieman Lab, NZZ), making epistemic rigor a distant non-priority.
- Commander-market mismatch: Overcoming the massive ego-driven objections of pundits requires high-touch, aggressive sales, contradicting the solo, introverted, part-time Commander profile.
What did we learn?
Engine verdict: GATHER_MORE_SIGNAL (WORTH_SKIMMING). Differentiated trust infrastructure, but no real buyer proof and draft-sharing/accountability aversion make this premature.
Filter scores
Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.
| Axis | What it measures |
|---|---|
| data moat | Does this product accumulate proprietary data that compounds? |
| 10x model test | Does a better model make this more valuable, or redundant? |
| fast feedback loops | Can outputs be graded against reality in <30 days? |
| solo founder feasible | Can a solo operator build and run this without a team? |
| AI providers can't eat it | Do hyperscalers have structural reasons NOT to build this? |
Composite median: 11.5 / 15. Graduation threshold: 9.0. IQR across runs: 1.0.
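The arithmetic behind a composite score can be sketched as follows. This is an illustration under stated assumptions, not the engine's actual method: it assumes the composite is the median of per-run totals (rather than the sum of per-axis medians) and that the IQR uses the inclusive quartile method. The axis keys, function names, and the three runs of scores are invented and do not reproduce the 11.5 reported here.

```python
from statistics import median, quantiles
from typing import Dict, List

# Hypothetical snake_case keys for the five axes described in the table.
AXES = (
    "data_moat",
    "ten_x_model_test",
    "fast_feedback_loops",
    "solo_founder_feasible",
    "ai_providers_cant_eat_it",
)
GRADUATION_THRESHOLD = 9.0  # taken from the text
MAX_AXIS_SCORE = 3          # each axis is scored 0-3


def run_total(run: Dict[str, float]) -> float:
    """Sum one run's axis scores after checking the 0-3 range."""
    for axis in AXES:
        if not 0 <= run[axis] <= MAX_AXIS_SCORE:
            raise ValueError(f"{axis} score out of range: {run[axis]}")
    return sum(run[axis] for axis in AXES)


def summarize(runs: List[Dict[str, float]]) -> Dict[str, object]:
    """Median and IQR of per-run totals, plus the graduation check."""
    totals = [run_total(r) for r in runs]
    # Assumption: inclusive quartiles; other methods give a different IQR.
    q1, _, q3 = quantiles(totals, n=4, method="inclusive")
    composite = median(totals)
    return {
        "composite_median": composite,
        "iqr": q3 - q1,
        "graduated": composite >= GRADUATION_THRESHOLD,
    }


# Invented scores for three runs, for illustration only.
ILLUSTRATIVE_RUNS = [
    dict(zip(AXES, (2, 2, 3, 2, 2))),
    dict(zip(AXES, (2, 3, 3, 2, 2))),
    dict(zip(AXES, (3, 3, 3, 2, 2))),
]

print(summarize(ILLUSTRATIVE_RUNS))
```

With only three runs, the choice of quartile method changes the IQR noticeably, which is why the method is pinned explicitly here instead of relying on the library default.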
Evidence
Signal B — Competitor with documented gap
Existing forecasting competitors focus on forecast accuracy comparisons after the fact, not on pre-publication claim extraction, forced resolution criteria, transmission assumptions, and a living per-analyst accuracy ledger for newsletter/research publishing workflows.
Signal D — Demand proxy
There are indirect signs that paid research/newsletter buyers care about credibility and evaluation: a Reddit user reports spending $9,600/year across 23 paid investing newsletters and not knowing which are worth the money, while adjacent communities are actively benchmarking model/forecast performance.
Sources:
- https://www.reddit.com/r/ValueInvesting/comments/1rg7muf/i_spent_9600year_on_substack_newsletters_so_you.json
- https://www.reddit.com/r/LocalLLaMA/comments/1raa7jm/we_benchmarked_9_llm_models_for_stock_direction/
Evaluation history
| When | Stage | Phase |
|---|---|---|
| 2026-04-19 17:43 | deep_council_verdict | graduated |
| 2026-04-19 17:36 | deep_claude_take | graduated |
| 2026-04-19 17:34 | deep_90day_plan | graduated |
| 2026-04-19 17:19 | deep_risk | graduated |
| 2026-04-19 17:10 | deep_distribution | graduated |
| 2026-04-19 16:44 | deep_pricing | graduated |
| 2026-04-19 16:24 | deep_moat | graduated |
| 2026-04-19 16:19 | deep_buyer_sim | graduated |
| 2026-04-19 16:12 | deep_icp | graduated |
| 2026-04-19 16:03 | deep_competitor | graduated |
| 2026-04-19 15:54 | deep_market_reality | graduated |
| 2026-04-19 15:30 | filter_score | scored |
| 2026-04-19 15:20 | filter_score | scored |
| 2026-04-19 15:10 | filter_score | scored |
| 2026-04-19 15:00 | evidence_search | evidence_hunt |
| 2026-04-19 14:50 | evidence_search | argument |
| 2026-04-19 14:40 | audience_simulation | argument |
| 2026-04-19 14:30 | red_team_kill | argument |
| 2026-04-19 14:20 | steelman | argument |
| 2026-04-19 14:10 | genesis | argument |