
Forecast QA Gate for Niche Research Publishers

graduated | [B] | filter 11.5/15 | spread ±1.0 | signals: 2 independent
What is this?
A pre-publication quality gate for niche research publishers, independent analysts, and paid newsletter operators who publish claims about markets, companies, policy, or technology with outcomes that resolve in days to weeks. Authors submit a draft note plus cited sources; the system extracts concrete claims, pressure-tests them using AE's six-pattern autopsy taxonomy, and forces explicit resolution criteria, time windows, and transmission assumptions before publication. After release, claims are automatically graded against public evidence on the declared schedule, creating a living accuracy ledger for each publication and analyst. This is not generic editing or RAG: the product's value is turning fuzzy analyst prose into auditable, reality-graded claims and showing where failures came from via the taxonomy. Buyers are those who monetize trust directly through subscriptions or premium research and can use a public or client-shared track record as a commercial asset. The loop is fast enough when focused on short-horizon claims, and the output compounds into differentiated credibility rather than one-off copy polish.
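The gate described above can be sketched as a small data structure. This is a minimal illustration, not the product's actual schema: the `Claim` fields, the `publishable` check, and the 30-day `max_horizon_days` default are all assumptions chosen to mirror the "resolution criteria, time windows, and transmission assumptions" requirement and the short-horizon focus.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Claim:
    """One extracted claim plus the fields the gate forces before publication.

    All field names are hypothetical; the source does not specify a schema.
    """
    text: str
    resolution_criteria: str        # what public evidence settles the claim
    resolves_by: date               # end of the declared time window
    transmission_assumption: str    # the assumed causal path behind the claim
    outcome: Optional[bool] = None  # None until graded after publication


def publishable(claim: Claim, published_on: date, max_horizon_days: int = 30) -> bool:
    """Pass the gate only if every field is filled and the horizon is short.

    The 30-day default is an assumption, not a documented product setting.
    """
    fields_present = all(
        s.strip()
        for s in (claim.text, claim.resolution_criteria, claim.transmission_assumption)
    )
    horizon = (claim.resolves_by - published_on).days
    return fields_present and 0 < horizon <= max_horizon_days


def accuracy(ledger: list[Claim]) -> Optional[float]:
    """Share of graded claims that resolved true; None if nothing is graded yet."""
    graded = [c for c in ledger if c.outcome is not None]
    if not graded:
        return None
    return sum(c.outcome for c in graded) / len(graded)
```

A claim with an empty `resolution_criteria` or a resolution date months away would fail `publishable`, which is the point of the gate: fuzzy prose gets blocked until it is stated in a gradable form. The ledger is then just the list of published claims, with `accuracy` computed over the ones that have resolved.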
Why did we consider it?
A forecast QA gate for niche research publishers is a credible wedge because it converts analyst trust into a measurable asset, fits a growing publishing market, and offers a differentiated capability generic AI tools do not: pre-publication falsifiability plus post-publication reality grading.
What breaks?
  • Incentive misalignment: Analysts monetize confident narratives and selective memory; an objective 'accuracy ledger' is a commercial liability that exposes them to subscriber churn.
  • Market priorities: Publishers are fighting existential threats from AI scraping and paywall fatigue (e.g., Nieman Lab, NZZ), making epistemic rigor a distant non-priority.
  • Commander-market mismatch: Overcoming the massive ego-driven objections of pundits requires high-touch, aggressive sales, contradicting the solo, introverted, part-time Commander profile.
What did we learn?
Engine verdict: GATHER_MORE_SIGNAL (WORTH_SKIMMING). Differentiated trust infrastructure, but no real buyer proof and draft-sharing/accountability aversion make this premature.

Filter scores

Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.

Axis | What it measures
data moat | Does this product accumulate proprietary data that compounds?
10x model test | Does a better model make this more valuable, or redundant?
fast feedback loops | Can outputs be graded against reality in <30 days?
solo founder feasible | Can a solo operator build and run this without a team?
AI providers can't eat it | Do hyperscalers have structural reasons NOT to build this?
Composite median: 11.5 / 15. Graduation threshold: 9.0. IQR across runs: 1.0.
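The arithmetic behind these figures can be sketched as follows. The page does not state how the composite is aggregated or what the individual runs scored, so this example assumes the composite is the median of per-run totals, and the three runs below are hypothetical values (including a half-point score) chosen only so that they reproduce the reported 11.5 median and 1.0 IQR.

```python
from statistics import median, quantiles

# Axis keys are hypothetical identifiers for the five axes listed above.
AXES = [
    "data_moat",
    "ten_x_model_test",
    "fast_feedback_loops",
    "solo_founder_feasible",
    "ai_providers_cant_eat_it",
]
GRADUATION_THRESHOLD = 9.0  # from the page

# Hypothetical per-run scores (0-3 per axis); NOT the real run data.
runs = [
    dict(zip(AXES, [2, 2, 3, 2, 2])),    # total 11
    dict(zip(AXES, [2, 2.5, 3, 2, 2])),  # total 11.5
    dict(zip(AXES, [3, 2, 3, 2, 2])),    # total 12
]


def summarize(runs):
    """Median and IQR of per-run composite totals (assumed aggregation)."""
    totals = sorted(sum(run[axis] for axis in AXES) for run in runs)
    q1, _, q3 = quantiles(totals, n=4)  # default 'exclusive' method
    composite = median(totals)
    return {
        "composite_median": composite,
        "iqr": q3 - q1,
        "graduated": composite >= GRADUATION_THRESHOLD,
    }


summary = summarize(runs)
```

Under these assumptions the composite clears the 9.0 threshold by 2.5 points, so even the lowest of the three hypothetical runs would graduate on its own.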

Evidence

Signal B — Competitor with documented gap

Existing forecasting competitors focus on forecast accuracy comparisons after the fact, not on pre-publication claim extraction, forced resolution criteria, transmission assumptions, and a living per-analyst accuracy ledger for newsletter/research publishing workflows.

Signal D — Demand proxy

There are indirect signs that paid research/newsletter buyers care about credibility and evaluation: a Reddit user reports spending $9,600/year across 23 paid investing newsletters and not knowing which are worth the money, while adjacent communities are actively benchmarking model/forecast performance.
Sources:
  • https://www.reddit.com/r/ValueInvesting/comments/1rg7muf/i_spent_9600year_on_substack_newsletters_so_you.json
  • https://www.reddit.com/r/LocalLLaMA/comments/1raa7jm/we_benchmarked_9_llm_models_for_stock_direction/

Evaluation history

When | Stage | Phase
2026-04-19 17:43 | deep_council_verdict | graduated
2026-04-19 17:36 | deep_claude_take | graduated
2026-04-19 17:34 | deep_90day_plan | graduated
2026-04-19 17:19 | deep_risk | graduated
2026-04-19 17:10 | deep_distribution | graduated
2026-04-19 16:44 | deep_pricing | graduated
2026-04-19 16:24 | deep_moat | graduated
2026-04-19 16:19 | deep_buyer_sim | graduated
2026-04-19 16:12 | deep_icp | graduated
2026-04-19 16:03 | deep_competitor | graduated
2026-04-19 15:54 | deep_market_reality | graduated
2026-04-19 15:30 | filter_score | scored
2026-04-19 15:20 | filter_score | scored
2026-04-19 15:10 | filter_score | scored
2026-04-19 15:00 | evidence_search | evidence_hunt
2026-04-19 14:50 | evidence_search | argument
2026-04-19 14:40 | audience_simulation | argument
2026-04-19 14:30 | red_team_kill | argument
2026-04-19 14:20 | steelman | argument
2026-04-19 14:10 | genesis | argument