External Research Provider Scorecards for Allocators and Family Offices
graduated [S] · filter 10.5/15 · spread ±1.0 · signals: 2 independent
What is this?
AE becomes a vendor-evaluation system for buy-side teams, family offices, and research-consuming allocators that rely on external newsletters, boutique research shops, and independent analysts. Instead of asking publishers to self-impose accountability, the buyer uploads research notes, stated calls, and follow-up updates from providers they already pay. AE extracts explicit and implicit forecasts, normalizes them into structured claim receipts, adversarially tests the reasoning for failure patterns like premise-conclusion severing and concession laundering, and then grades outcomes against public market reality over time. The output is a provider scorecard: hit-rate by horizon, calibration quality, recurring failure modes, narrative drift, and whether updates genuinely acknowledge disconfirmation or merely relabel it. This aligns incentives correctly: the customer is not the guru being exposed, but the party deciding who deserves budget, trust, and continued attention. AE's strongest assets map directly here: objective reality-graded signal, fast feedback loops, portable evidence trails, and a taxonomy built to distinguish genuine edge from persuasive but weakly grounded market commentary.
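The scorecard step described above can be sketched in a few lines. This is a minimal illustration, not AE's actual schema or grading logic: the `ClaimReceipt` fields, the horizon bucket edges, the use of the Brier score as the calibration measure, and the sample receipts are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimReceipt:
    """One normalized forecast from a provider's note (hypothetical schema)."""
    provider: str
    claim: str
    horizon_days: int          # days between the call and its resolution date
    stated_probability: float  # confidence attached to the claim, 0.0 to 1.0
    resolved_true: bool        # outcome after grading against market data


def horizon_bucket(days: int) -> str:
    """Assumed bucket edges; the source does not define horizons."""
    if days <= 30:
        return "<=30d"
    if days <= 90:
        return "31-90d"
    return ">90d"


def build_scorecard(receipts: list[ClaimReceipt]) -> dict:
    """Aggregate resolved claims into hit-rate by horizon and a Brier score."""
    if not receipts:
        raise ValueError("need at least one resolved claim")

    buckets: dict[str, list[ClaimReceipt]] = defaultdict(list)
    for r in receipts:
        buckets[horizon_bucket(r.horizon_days)].append(r)

    # Hit-rate: share of claims in each horizon bucket that resolved true.
    hit_rate = {
        bucket: sum(r.resolved_true for r in rs) / len(rs)
        for bucket, rs in buckets.items()
    }

    # Brier score: mean squared gap between stated probability and outcome.
    # Lower is better; 0.25 is what a constant 50% forecast would score.
    brier = sum(
        (r.stated_probability - float(r.resolved_true)) ** 2 for r in receipts
    ) / len(receipts)

    return {
        "n_claims": len(receipts),
        "hit_rate_by_horizon": hit_rate,
        "brier_score": brier,
    }


# Made-up receipts for illustration only; not real provider data.
EXAMPLE_RECEIPTS = [
    ClaimReceipt("Provider A", "Index X closes higher", 20, 0.8, True),
    ClaimReceipt("Provider A", "Rate cut at next meeting", 25, 0.6, False),
    ClaimReceipt("Provider A", "Sector Y outperforms", 60, 0.7, True),
    ClaimReceipt("Provider A", "Commodity Z above target", 180, 0.9, False),
]

card = build_scorecard(EXAMPLE_RECEIPTS)
print(card)
```

Failure-mode tagging, narrative drift, and update-acknowledgment analysis are left out of the sketch because they depend on the adversarial reasoning tests rather than on simple outcome arithmetic.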
Why did we consider it?
AE has a credible niche as an independent scorecard system for external research providers because it fills a real accountability gap in allocator diligence with objective, portable, outcome-graded evidence.
What breaks?
- Financial research relies on narratives and frameworks, lacking the explicit, falsifiable claims AE needs to grade reality.
- Institutional allocators buy research for idea generation and risk perspective, not raw prediction hit-rates; SPIVA reports already show that most active managers underperform their benchmarks, so poor hit-rates would not be news to them.
- Enterprise sales to family offices and buy-side teams require high-touch, relationship-driven networking, incompatible with a solo, introverted, part-time founder.
What did we learn?
Engine verdict: GATHER_MORE_SIGNAL (WORTH_SKIMMING). Clear wedge and plausible economics, but zero real buyer proof. Sell a paid audit before building software.
Filter scores
Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.
| Axis | What it measures |
|---|---|
| data moat | Does this product accumulate proprietary data that compounds? |
| 10x model test | Does a better model make this more valuable, or redundant? |
| fast feedback loops | Can outputs be graded against reality in <30 days? |
| solo founder feasible | Can a solo operator build and run this without a team? |
| AI providers can't eat it | Do hyperscalers have structural reasons NOT to build this? |
Composite median: 10.5 / 15. Graduation threshold: 9.0. IQR across runs: 1.0.
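The aggregation summarized above can be sketched as follows. The source does not say exactly how the composite is formed, so this example assumes each run's five axis scores are summed into a total out of 15, that the composite is the median of those totals, that the IQR uses linear interpolation between run totals, and that graduation means meeting or exceeding the threshold. The axis keys and the three runs are hypothetical and do not reproduce the figures reported here.

```python
from statistics import median, quantiles

# Hypothetical snake_case keys for the five axes in the table above.
AXES = (
    "data_moat",
    "ten_x_model_test",
    "fast_feedback_loops",
    "solo_founder_feasible",
    "ai_providers_cant_eat_it",
)
GRADUATION_THRESHOLD = 9.0  # stated in the text


def summarize_runs(runs: list[dict[str, float]]) -> dict:
    """Combine independent filter runs into medians, spread, and a verdict."""
    for run in runs:
        for axis in AXES:
            if not 0 <= run[axis] <= 3:
                raise ValueError(f"{axis} must be scored between 0 and 3")

    # Median per axis across runs (what a "median shown" column would hold).
    axis_medians = {axis: median(run[axis] for run in runs) for axis in AXES}

    # One composite total per run, out of 15.
    totals = [sum(run[axis] for axis in AXES) for run in runs]

    # Quartiles with linear interpolation between the observed totals.
    q1, _, q3 = quantiles(totals, n=4, method="inclusive")

    composite = median(totals)
    return {
        "axis_medians": axis_medians,
        "composite_median": composite,
        "iqr": q3 - q1,
        # Assumption: ">=" rather than ">" decides graduation.
        "graduated": composite >= GRADUATION_THRESHOLD,
    }


# Made-up scores for three runs; illustration only.
EXAMPLE_RUNS = [
    dict(zip(AXES, (2, 3, 2, 1, 2))),  # total 10
    dict(zip(AXES, (2, 2, 3, 2, 2))),  # total 11
    dict(zip(AXES, (3, 3, 2, 2, 2))),  # total 12
]

summary = summarize_runs(EXAMPLE_RUNS)
print(summary)
```

With only three runs the IQR is sensitive to the quartile convention; the default `exclusive` method in `statistics.quantiles` would give a wider figure for the same totals.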
Evidence
Signal B — Competitor with documented gap
StarMine provides objective, quantitative rankings of sell-side analysts based on historical forecast accuracy, recommendation returns, and earnings estimates, and buy-side teams use it to identify top performers. It focuses on public sell-side data and does not generate custom scorecards for an allocator's paid providers. Specifically, it lacks buyer-uploaded private research notes from newsletters, boutiques, and independents; adversarial reasoning tests (premise-conclusion severing, concession laundering); calibration quality; narrative drift; and update acknowledgment analysis.
Signal D — Demand proxy
Found: yes. Discussions and articles highlight buy-side teams and allocators valuing sell-side analyst performance tracking via tools like StarMine and TipRanks for accuracy and hit rates, with family offices using scorecards for investments but needing better external research evaluation; peer-reviewed papers confirm academic interest in analyst grading.
Sources:
- https://www.lseg.com/en/data-analytics/starmine-forecaster-awards
- https://lipperalpha.refinitiv.com/2018/05/analyzing-the-analysts-starmine-improves-valuation-models
- https://link.springer.com/article/10.1007/s10693-016-0258-x
- ht…
Evaluation history
| When | Stage | Phase |
|---|---|---|
| 2026-04-20 20:22 | deep_council_verdict | graduated |
| 2026-04-20 20:15 | deep_claude_take | graduated |
| 2026-04-20 20:12 | deep_90day_plan | graduated |
| 2026-04-20 20:01 | deep_risk | graduated |
| 2026-04-20 19:49 | deep_distribution | graduated |
| 2026-04-20 19:38 | deep_pricing | graduated |
| 2026-04-20 19:29 | deep_moat | graduated |
| 2026-04-20 19:21 | deep_buyer_sim | graduated |
| 2026-04-20 19:12 | deep_icp | graduated |
| 2026-04-20 19:02 | deep_competitor | graduated |
| 2026-04-20 18:49 | deep_market_reality | graduated |
| 2026-04-20 18:30 | filter_score | scored |
| 2026-04-20 18:20 | filter_score | scored |
| 2026-04-20 18:10 | filter_score | scored |
| 2026-04-20 18:09 | evidence_search | evidence_hunt |
| 2026-04-20 12:10 | evidence_search | evidence_hunt |
| 2026-04-20 12:00 | evidence_search | evidence_hunt |
| 2026-04-20 11:50 | evidence_search | evidence_hunt |
| 2026-04-20 11:40 | evidence_search | evidence_hunt |
| 2026-04-20 11:30 | evidence_search | evidence_hunt |
| 2026-04-20 11:20 | evidence_search | evidence_hunt |
| 2026-04-20 11:10 | evidence_search | evidence_hunt |
| 2026-04-20 11:00 | evidence_search | evidence_hunt |
| 2026-04-20 10:50 | evidence_search | evidence_hunt |
| 2026-04-20 10:40 | evidence_search | evidence_hunt |
| 2026-04-20 10:30 | evidence_search | evidence_hunt |
| 2026-04-20 10:20 | evidence_search | evidence_hunt |
| 2026-04-20 10:10 | evidence_search | evidence_hunt |
| 2026-04-20 10:00 | evidence_search | evidence_hunt |
| 2026-04-20 09:50 | evidence_search | evidence_hunt |
| 2026-04-20 09:40 | evidence_search | evidence_hunt |
| 2026-04-20 09:30 | evidence_search | evidence_hunt |
| 2026-04-20 09:20 | evidence_search | evidence_hunt |
| 2026-04-20 09:10 | evidence_search | evidence_hunt |
| 2026-04-20 09:00 | evidence_search | evidence_hunt |
| 2026-04-20 08:50 | evidence_search | evidence_hunt |
| 2026-04-20 08:40 | evidence_search | evidence_hunt |
| 2026-04-20 08:30 | evidence_search | evidence_hunt |
| 2026-04-20 08:20 | evidence_search | evidence_hunt |
| 2026-04-20 08:10 | evidence_search | evidence_hunt |
| 2026-04-20 08:00 | evidence_search | evidence_hunt |
| 2026-04-20 07:50 | evidence_search | evidence_hunt |
| 2026-04-20 07:40 | evidence_search | evidence_hunt |
| 2026-04-20 07:30 | evidence_search | evidence_hunt |
| 2026-04-20 07:20 | evidence_search | evidence_hunt |
| 2026-04-20 07:10 | evidence_search | evidence_hunt |
| 2026-04-20 07:00 | evidence_search | evidence_hunt |
| 2026-04-20 06:50 | evidence_search | evidence_hunt |
| 2026-04-20 06:40 | evidence_search | evidence_hunt |
| 2026-04-20 06:30 | evidence_search | evidence_hunt |
| 2026-04-20 06:20 | evidence_search | evidence_hunt |
| 2026-04-20 06:10 | evidence_search | evidence_hunt |
| 2026-04-20 06:00 | evidence_search | evidence_hunt |
| 2026-04-20 05:50 | evidence_search | evidence_hunt |
| 2026-04-20 05:40 | evidence_search | evidence_hunt |
| 2026-04-20 05:30 | evidence_search | evidence_hunt |
| 2026-04-20 05:20 | evidence_search | evidence_hunt |
| 2026-04-20 05:10 | evidence_search | evidence_hunt |
| 2026-04-20 05:00 | evidence_search | evidence_hunt |
| 2026-04-20 04:50 | evidence_search | evidence_hunt |
| 2026-04-20 04:40 | evidence_search | argument |
| 2026-04-20 04:30 | audience_simulation | argument |
| 2026-04-20 04:20 | red_team_kill | argument |
| 2026-04-20 04:10 | steelman | argument |
| 2026-04-20 04:00 | genesis | argument |