# Research Scope Reality Check for Boutique Custom Studies
Status: graduated [A] · filter score 9.0/15 (spread ±1.0) · signals: 2 independent
## What is this?
A post-sale scoping system for boutique B2B research firms that turns vague client asks into explicit, testable delivery commitments before fieldwork begins. Instead of policing sales language, it is used during kickoff and study design to pressure-test what is actually answerable, what evidence will be required, what turnaround is feasible, and which conclusions would be out of bounds without stronger inputs.

AE applies its adversarial grading loop and failure taxonomy to the scope doc, methodology plan, and interim claims, flagging premise-conclusion severing, concession laundering, cosmetic confidence, and temporal blind spots before the team burns margin. The output is a client-safe scope memo with bounded claims, dependency flags, kill criteria, and escalation triggers.

Resolution is much cleaner than proposal conversion: did the required inputs arrive, did the team deliver the scoped outputs on time, were caveats triggered, and did the final conclusions stay within pre-registered evidence bounds? This uses AE as a reality-grading engine for delivery discipline, not as a sales brake.
## Why did we consider it?
The best case is that boutique research firms already sell bespoke studies under uncertainty, and AE can become a premium post-sale scoping checkpoint that measurably reduces margin burn and overclaim risk.
## What breaks?
- Misaligned incentives: B2B research firms profit by selling authoritative narratives ('fables'), not bounded epistemological rigor.
- Post-sale client friction: Presenting a client with a heavily caveated scope memo immediately after a confident sales pitch damages trust and threatens the account.
- Misdiagnosed margin burn: Boutique agencies lose money to operational scope creep (endless client revisions), not epistemic overclaiming.
## What did we learn?
Engine verdict: GATHER_MORE_SIGNAL (WORTH_SKIMMING). Real delivery pain, but too many unproven adoption and authority assumptions to justify software before paid live-study pilots.
## Filter scores
Five axes, each scored 0-3, across three independent runs by different model perspectives. The composite median appears below the table.
| Axis | What it measures |
|---|---|
| data moat | Does this product accumulate proprietary data that compounds? |
| 10x model test | Does a better model make this more valuable, or redundant? |
| fast feedback loops | Can outputs be graded against reality in <30 days? |
| solo founder feasible | Can a solo operator build and run this without a team? |
| AI providers can't eat it | Do hyperscalers have structural reasons NOT to build this? |
Composite median: 9.0 / 15. Graduation threshold: 9.0. IQR across runs: 1.0.
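The aggregation above can be sketched in code. This is an illustrative reconstruction, not the engine's implementation: the axis keys, function names, and per-run scores below are assumptions; only the scheme itself (five axes scored 0-3, three runs, per-axis medians summed into a composite out of 15, IQR as the spread measure) comes from the text.

```python
import statistics

# Hypothetical sketch of the filter-score aggregation: five axes scored
# 0-3 per run, three independent runs, per-axis median, composite /15,
# inter-run IQR as the spread measure.
AXES = ["data_moat", "10x_model_test", "fast_feedback_loops",
        "solo_founder_feasible", "providers_cant_eat_it"]
GRADUATION_THRESHOLD = 9.0

def composite_median(runs):
    """Sum of per-axis medians across independent runs (max 3 per axis)."""
    return sum(statistics.median(run[axis] for run in runs) for axis in AXES)

def run_iqr(runs):
    """Interquartile range of each run's total score: a rough spread measure."""
    totals = sorted(sum(run.values()) for run in runs)
    q1, _, q3 = statistics.quantiles(totals, n=4)
    return q3 - q1

# Three hypothetical runs whose scores happen to reproduce a 9/15 composite.
runs = [
    {"data_moat": 2, "10x_model_test": 2, "fast_feedback_loops": 2,
     "solo_founder_feasible": 2, "providers_cant_eat_it": 1},
    {"data_moat": 2, "10x_model_test": 1, "fast_feedback_loops": 2,
     "solo_founder_feasible": 2, "providers_cant_eat_it": 1},
    {"data_moat": 1, "10x_model_test": 2, "fast_feedback_loops": 2,
     "solo_founder_feasible": 2, "providers_cant_eat_it": 2},
]

score = composite_median(runs)
verdict = "graduated" if score >= GRADUATION_THRESHOLD else "below threshold"
print(score, verdict)  # → 9 graduated
```

Note that with the 9.0 threshold, a composite sitting exactly on the bar graduates, which is consistent with the 9.0/15 score recorded above.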
## Evidence
### Signal A — Primary source
Research Marketplace 2 - Find a Tender
### Signal D — Demand proxy
There are indirect signs that scoping and scope-creep problems are common in project-based client work, including Reddit discussions about scoping advice and repeated client scope expansion, plus checklist-style GitHub repos showing practitioner interest in structured project estimation/scoping.

Sources:
- https://www.reddit.com/r/projectmanagement/comments/9ob81x/need_some_advice_on_scoping_a_project/
- https://www.reddit.com/r/FreelanceProgramming/comments/1r7cmzl/how_do_you_handle_scope_creep_and_late_payments.json
- https://www.reddit.com/r/webdev/comments/1r5m4x8/freelancers…
## Evaluation history
| When | Stage | Phase |
|---|---|---|
| 2026-04-19 14:09 | deep_council_verdict | graduated |
| 2026-04-19 13:58 | deep_claude_take | graduated |
| 2026-04-19 13:56 | deep_90day_plan | graduated |
| 2026-04-19 13:33 | deep_risk | graduated |
| 2026-04-19 13:25 | deep_distribution | graduated |
| 2026-04-19 13:17 | deep_pricing | graduated |
| 2026-04-19 13:08 | deep_moat | graduated |
| 2026-04-19 13:02 | deep_buyer_sim | graduated |
| 2026-04-19 12:56 | deep_icp | graduated |
| 2026-04-19 12:46 | deep_competitor | graduated |
| 2026-04-19 12:38 | deep_market_reality | graduated |
| 2026-04-19 12:20 | filter_score | scored |
| 2026-04-19 12:10 | filter_score | scored |
| 2026-04-19 12:00 | filter_score | scored |
| 2026-04-19 11:50 | evidence_search | argument |
| 2026-04-19 11:40 | audience_simulation | argument |
| 2026-04-19 11:30 | red_team_kill | argument |
| 2026-04-19 11:20 | steelman | argument |
| 2026-04-19 11:10 | genesis | argument |