Reality-Graded Question QA for ACA Tuition Providers

graduated [A] · filter 10.0/15 · spread ±1.5 · signals: 2 independent
What is this?
A private quality-assurance system for independent ACA exam trainers and small tuition firms that use AI to draft mock questions, worked answers, and marking guidance. Instead of trying to infer whether materials improve retention from noisy student outcomes, the product grades each asset against objective, inspectable realities: syllabus mapping, numerical correctness, answer-key consistency, mark-scheme alignment, ambiguity, answer leakage, unsupported explanations, and false confidence. AE fits because it can run adversarial multi-model debate over each draft, enforce structured behavioral contracts for what a valid ACA question or explanation must contain, and then score outputs against deterministic checks plus human-verifiable grading rubrics within hours. The deliverable is not a platform students use; it is a back-office content gate that flags unsafe or low-quality AI-generated materials before release. Buyers pay to reduce reputational risk, tutor review time, and the chance of distributing flawed mocks or explanations that confuse students or mis-teach exam technique.
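The deterministic side of such a content gate could be sketched roughly as follows. This is a minimal illustration, not the product's actual implementation: the `DraftQuestion` fields, the flag names, and the `deterministic_flags` function are all hypothetical, and the leakage check is a crude substring match that a real gate would need to refine.

```python
from dataclasses import dataclass, field


@dataclass
class DraftQuestion:
    """Hypothetical shape of an AI-drafted mock question (not a real schema)."""
    stem: str
    total_marks: int
    mark_scheme: dict[str, int]   # marking point -> marks available
    answer_key: float             # final figure stated in the answer key
    worked_answer: float          # final figure reached in the worked answer
    syllabus_refs: list[str] = field(default_factory=list)


def deterministic_flags(q: DraftQuestion, tolerance: float = 0.005) -> list[str]:
    """Return a list of flags; an empty list means no deterministic check failed."""
    flags = []
    if not q.syllabus_refs:
        flags.append("syllabus_mapping: no syllabus reference attached")
    if sum(q.mark_scheme.values()) != q.total_marks:
        flags.append("mark_scheme_alignment: marks do not sum to the total")
    if abs(q.answer_key - q.worked_answer) > tolerance:
        flags.append("answer_key_consistency: key and worked answer differ")
    # Crude leakage check: does the final figure appear verbatim in the stem?
    if f"{q.answer_key:,.0f}" in q.stem or f"{q.answer_key:.0f}" in q.stem:
        flags.append("answer_leakage: final answer appears in the stem")
    return flags


STEM = (
    "A company buys plant for £90,000 with a residual value of £10,000 and a "
    "useful life of 4 years. Calculate the annual straight-line depreciation charge."
)

# Illustrative drafts only; the syllabus reference is a placeholder string.
clean = DraftQuestion(
    stem=STEM,
    total_marks=2,
    mark_scheme={"depreciable amount": 1, "divide by useful life": 1},
    answer_key=20000.0,
    worked_answer=20000.0,
    syllabus_refs=["placeholder: depreciation of PPE"],
)
flawed = DraftQuestion(
    stem=STEM,
    total_marks=3,                      # scheme below only awards 2 marks
    mark_scheme={"depreciable amount": 1, "divide by useful life": 1},
    answer_key=20000.0,
    worked_answer=22500.0,              # worked answer ignored the residual value
)

print(deterministic_flags(clean))
print(deterministic_flags(flawed))
```

Checks like ambiguity, unsupported explanations, and false confidence are not reducible to rules of this kind; in the description above they fall to the model debate and the human-verifiable rubrics.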
Why did we consider it?
A private, reality-graded QA gate for AI-generated ACA tuition materials is defensible because it solves a concrete reputational-risk problem with auditable checks and fast ROI, in a market increasingly pressured to prove quality and value.
What breaks?
  • Microscopic TAM: The UK ACA market is dominated by BPP/Kaplan; independent trainers lack the volume and budget to support £100K-£300K ARR.
  • Workflow Mismatch: ACA prep relies on official ICAEW past papers, making net-new mock generation too infrequent to justify a recurring QA subscription.
  • ROI Collapse: Flagging a complex accounting error doesn't fix it; tutors still bear the heavy cognitive load of rewriting the integrated scenario.
What did we learn?
Engine verdict: GATHER_MORE_SIGNAL (WORTH_SKIMMING). Credible QA wedge, but prove recurring paid pain in ACA providers before building past a manual audit.

Filter scores

Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.

Axis                        What it measures
data moat                   Does this product accumulate proprietary data that compounds?
10x model test              Does a better model make this more valuable, or redundant?
fast feedback loops         Can outputs be graded against reality in <30 days?
solo founder feasible       Can a solo operator build and run this without a team?
AI providers can't eat it   Do hyperscalers have structural reasons NOT to build this?
Composite median: 10.0 / 15. Graduation threshold: 9.0. IQR across runs: 1.5.
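One way the composite figures could be derived is sketched below. The assumptions are ours: the composite is taken as the median of the per-run totals, the IQR uses the inclusive quartile method, and graduation means meeting or exceeding the threshold. The per-run totals are not listed on this page, so the input is illustrative only, chosen so the arithmetic lands on the figures shown.

```python
from statistics import median, quantiles

GRADUATION_THRESHOLD = 9.0   # stated on the page
MAX_TOTAL = 5 * 3            # five axes, each scored 0-3


def summarize(run_totals: list[float]) -> dict:
    """Summarize per-run totals (each 0..MAX_TOTAL) into composite, IQR, and verdict."""
    if any(not 0 <= t <= MAX_TOTAL for t in run_totals):
        raise ValueError(f"run totals must lie in 0..{MAX_TOTAL}")
    composite = median(run_totals)
    # Inclusive method: quartiles interpolate between the observed values.
    q1, _, q3 = quantiles(run_totals, n=4, method="inclusive")
    return {
        "composite_median": composite,
        "iqr": q3 - q1,
        # Assumption: ">=" rather than ">" decides graduation.
        "graduated": composite >= GRADUATION_THRESHOLD,
    }


# Hypothetical totals for three runs, NOT the actual scores behind this page.
example = summarize([9.0, 10.0, 12.0])
print(example)
```

With only three runs the IQR is sensitive to the quartile convention, so a different method could report a different spread for the same scores.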

Evidence

Signal B — Competitor with documented gap

QuestionForge and similar tools focus on generating exam questions from source materials and rely on users to 'review' and 'refine' outputs, but the hypothesis is for a back-office QA gate that independently audits syllabus mapping, numerical correctness, answer-key consistency, mark-scheme alignment, ambiguity, leakage, unsupported explanations, and false confidence before release.

Signal D — Demand proxy

Summary: Primary institutional documents confirm a detailed ACA syllabus and a public procurement for ACA training, while adjacent market tools and open-source evaluator projects indicate activity around AI-generated assessments and AI output evaluation.
Sources:
  • https://www.icaew.com/-/media/corporate/files/learning-and-development/next-generation-aca/aca-syllabus-handbook.ashx
  • https://www.find-tender.service.gov.uk/Notice/032453-2022/PDF
  • https://questionforge.ai/
  • https://www.cogniguide.app/quizzes/exam-paper-marking-scheme
  • https://github.com/learning-commons-org/evaluators/blo…

Evaluation history

When               Stage                  Phase
2026-04-19 05:09   deep_council_verdict   graduated
2026-04-19 04:57   deep_claude_take       graduated
2026-04-19 04:55   deep_90day_plan        graduated
2026-04-19 04:35   deep_risk              graduated
2026-04-19 04:29   deep_distribution      graduated
2026-04-19 04:20   deep_pricing           graduated
2026-04-19 04:10   deep_moat              graduated
2026-04-19 04:04   deep_buyer_sim         graduated
2026-04-19 03:58   deep_icp               graduated
2026-04-19 03:48   deep_competitor        graduated
2026-04-19 03:37   deep_market_reality    graduated
2026-04-19 03:20   filter_score           scored
2026-04-19 03:10   filter_score           scored
2026-04-19 03:00   filter_score           scored
2026-04-19 02:50   evidence_search        evidence_hunt
2026-04-19 02:40   evidence_search        argument
2026-04-19 02:30   audience_simulation    argument
2026-04-19 02:20   red_team_kill          argument
2026-04-19 02:10   steelman               argument
2026-04-19 02:00   genesis                argument