Forecast QA Gate for Niche Research Publishers

Status: graduated [B] | Filter: 11.5/15 | Spread: ±1.0 | Signals: 2 independent

What is this?

A pre-publication quality gate for niche research publishers, independent analysts, and paid newsletter operators who publish claims about markets, companies, policy, or technology whose outcomes resolve in days to weeks. Authors submit a draft note plus its cited sources. The system extracts concrete claims, pressure-tests them against AE's six-pattern autopsy taxonomy, and requires explicit resolution criteria, time windows, and transmission assumptions before publication. After release, each claim is graded automatically against public evidence on its declared schedule, building a living accuracy ledger for each publication and analyst.

This is not generic editing or RAG: the value lies in turning fuzzy analyst prose into auditable, reality-graded claims and using the taxonomy to show where failures came from. The buyers are those who monetize trust directly through subscriptions or premium research and can use a public or client-shared track record as a commercial asset. Restricting the gate to short-horizon claims keeps the feedback loop fast, and the resulting ledger compounds into differentiated credibility rather than one-off copy polish.
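The gate-then-grade workflow described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Claim` fields, the `gate_problems` and `accuracy` helpers, and the 30-day horizon cap are hypothetical and are not taken from any actual implementation. It only shows the idea of blocking a claim until it carries resolution criteria, a time window, and a transmission assumption, and then computing a ledger accuracy from graded outcomes.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Claim:
    """One extracted claim. All field names here are hypothetical."""
    text: str
    resolution_criteria: str = ""      # how an outsider decides true or false
    transmission_assumption: str = ""  # why the cause should produce the effect
    published_on: Optional[date] = None
    resolve_by: Optional[date] = None
    outcome: Optional[bool] = None     # set after grading; None means unresolved


# Assumed cap on the time window, chosen to match "days to weeks".
MAX_HORIZON = timedelta(days=30)


def gate_problems(claim: Claim) -> List[str]:
    """Return the reasons a claim is blocked; an empty list means it passes."""
    problems = []
    if not claim.resolution_criteria.strip():
        problems.append("missing resolution criteria")
    if not claim.transmission_assumption.strip():
        problems.append("missing transmission assumption")
    if claim.published_on is None or claim.resolve_by is None:
        problems.append("missing time window")
    elif not timedelta(0) < claim.resolve_by - claim.published_on <= MAX_HORIZON:
        problems.append("time window outside the short-horizon range")
    return problems


def accuracy(ledger: List[Claim]) -> Optional[float]:
    """Share of graded claims that resolved true; None if nothing is graded."""
    graded = [c for c in ledger if c.outcome is not None]
    if not graded:
        return None
    return sum(c.outcome for c in graded) / len(graded)
```

In this sketch a draft claim with only prose text fails the gate on all three requirements, while unresolved claims are simply excluded from the accuracy figure rather than counted as misses. A real ledger would also need to decide how to treat ambiguous or voided resolutions, which this sketch does not address.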

Why did we consider it?

A forecast QA gate for niche research publishers is a credible wedge because it converts analyst trust into a measurable asset, fits a growing publishing market, and offers a differentiated capability generic AI tools do not: pre-publication falsifiability plus post-publication reality grading.

What breaks?

Incentive misalignment: Analysts monetize confident narratives and selective memory; an objective 'accuracy ledger' is a commercial liability that exposes them to subscriber churn.
Market priorities: Publishers are preoccupied with existential threats from AI scraping and paywall fatigue (e.g., Nieman Lab, NZZ), so epistemic rigor ranks far down their list of priorities.
Commander-market mismatch: Overcoming pundits' strong, ego-driven objections requires high-touch, aggressive sales, which conflicts with the solo, introverted, part-time Commander profile.

What did we learn?

Engine verdict: GATHER_MORE_SIGNAL (WORTH_SKIMMING). The idea is differentiated trust infrastructure, but the lack of real buyer proof, combined with analysts' aversion to sharing drafts and being held accountable, makes it premature.

Filter scores

Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.

Axis	What it measures
data moat	Does this product accumulate proprietary data that compounds?
10x model test	Does a better model make this more valuable, or redundant?
fast feedback loops	Can outputs be graded against reality in <30 days?
solo founder feasible	Can a solo operator build and run this without a team?
AI providers can't eat it	Do hyperscalers have structural reasons NOT to build this?

Composite median: 11.5 / 15. Graduation threshold: 9.0. IQR across runs: 1.0.
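The arithmetic behind a composite median, an IQR across runs, and a threshold check can be sketched as follows. This is a hedged illustration only: the per-run axis scores are hypothetical placeholders (the per-run values are not reported here, so the sketch does not reproduce the 11.5 figure), and it assumes that the composite is the sum of the five axis scores within a run and that the IQR uses the default "exclusive" method of Python's `statistics.quantiles`. Only the 0-3 range, the five axes, and the 9.0 threshold come from the text.

```python
from statistics import median, quantiles

N_AXES = 5                  # five axes, each scored 0-3 (from the text)
GRADUATION_THRESHOLD = 9.0  # from the text

# Hypothetical per-run axis scores, for illustration only; NOT the real runs.
example_runs = [
    [2, 2, 3, 2, 2],
    [2, 3, 3, 2, 2],
    [3, 3, 3, 2, 2],
]


def summarize(runs, threshold=GRADUATION_THRESHOLD):
    """Summarize runs as a composite median, an IQR, and a pass/fail flag."""
    for scores in runs:
        if len(scores) != N_AXES or any(not 0 <= s <= 3 for s in scores):
            raise ValueError("each run needs five axis scores between 0 and 3")
    # Assumption: a run's composite is the plain sum of its axis scores.
    composites = [sum(scores) for scores in runs]
    # Assumption: IQR from statistics.quantiles with its default method.
    q1, _, q3 = quantiles(composites, n=4)
    mid = median(composites)
    return {
        "composite_median": mid,
        "iqr": q3 - q1,
        "graduated": mid >= threshold,
    }


summary = summarize(example_runs)
```

With only three runs the IQR is sensitive to the quantile method, so a spread figure like this is best read as a rough indicator of disagreement between runs rather than a precise statistic.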

Evidence

Signal B — Competitor with documented gap

https://goodjudgment.substack.com/p/testing-polymarkets-most-accurate

Existing forecasting competitors focus on forecast accuracy comparisons after the fact, not on pre-publication claim extraction, forced resolution criteria, transmission assumptions, and a living per-analyst accuracy ledger for newsletter/research publishing workflows.

Signal D — Demand proxy

https://www.reddit.com/r/ValueInvesting/comments/1rg7muf/i_spent_9600year_on_substack_newsletters_so_you.json
https://www.reddit.com/r/LocalLLaMA/comments/1raa7jm/we_benchmarked_9_llm_models_for_stock_direction/

There are indirect signs that paid research/newsletter buyers care about credibility and evaluation: a Reddit user reports spending $9,600/year across 23 paid investing newsletters and not knowing which are worth the money, while adjacent communities are actively benchmarking model/forecast performance.

Evaluation history

When	Stage	Phase
2026-04-19 17:43	deep_council_verdict	graduated
2026-04-19 17:36	deep_claude_take	graduated
2026-04-19 17:34	deep_90day_plan	graduated
2026-04-19 17:19	deep_risk	graduated
2026-04-19 17:10	deep_distribution	graduated
2026-04-19 16:44	deep_pricing	graduated
2026-04-19 16:24	deep_moat	graduated
2026-04-19 16:19	deep_buyer_sim	graduated
2026-04-19 16:12	deep_icp	graduated
2026-04-19 16:03	deep_competitor	graduated
2026-04-19 15:54	deep_market_reality	graduated
2026-04-19 15:30	filter_score	scored
2026-04-19 15:20	filter_score	scored
2026-04-19 15:10	filter_score	scored
2026-04-19 15:00	evidence_search	evidence_hunt
2026-04-19 14:50	evidence_search	argument
2026-04-19 14:40	audience_simulation	argument
2026-04-19 14:30	red_team_kill	argument
2026-04-19 14:20	steelman	argument
2026-04-19 14:10	genesis	argument