Reality-Graded Question QA for ACA Tuition Providers
graduated [A] filter 10.0/15 spread ±1.5 signals: 2 independent
What is this?
A private quality-assurance system for independent ACA exam trainers and small tuition firms that use AI to draft mock questions, worked answers, and marking guidance. Instead of trying to infer whether materials improve retention from noisy student outcomes, the product grades each asset against objective, inspectable realities: syllabus mapping, numerical correctness, answer-key consistency, mark-scheme alignment, ambiguity, answer leakage, unsupported explanations, and false confidence. AE fits because it can run adversarial multi-model debate over each draft, enforce structured behavioral contracts for what a valid ACA question or explanation must contain, and then score outputs against deterministic checks plus human-verifiable grading rubrics within hours. The deliverable is not a platform students use; it is a back-office content gate that flags unsafe or low-quality AI-generated materials before release. Buyers pay to reduce reputational risk, tutor review time, and the chance of distributing flawed mocks or explanations that confuse students or mis-teach exam technique.
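As a rough illustration of what the "deterministic checks" part of such a gate might look like, here is a minimal sketch. The `DraftQuestion` structure, its field names, the flag labels, and the tolerance are all hypothetical and chosen only for this example; nothing here is taken from an existing product. It covers only three of the listed checks (syllabus mapping, mark-scheme alignment, answer-key consistency), and only in their simplest mechanical form.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DraftQuestion:
    """Hypothetical container for one AI-drafted mock question."""
    question_id: str
    total_marks: int                # marks stated on the question itself
    mark_scheme: dict[str, int]     # marking point -> marks available
    answer_key: float               # final figure given in the answer key
    worked_answer: float            # final figure reached in the worked solution
    syllabus_refs: list[str] = field(default_factory=list)


def deterministic_flags(q: DraftQuestion, tolerance: float = 0.005) -> list[str]:
    """Return a list of flag names for checks that need no model judgement."""
    flags = []
    # Syllabus mapping: the draft must cite at least one syllabus reference.
    if not q.syllabus_refs:
        flags.append("no_syllabus_mapping")
    # Mark-scheme alignment: marking points must add up to the stated total.
    if sum(q.mark_scheme.values()) != q.total_marks:
        flags.append("mark_scheme_total_mismatch")
    # Answer-key consistency: key and worked solution must agree numerically.
    if abs(q.answer_key - q.worked_answer) > tolerance:
        flags.append("answer_key_inconsistent")
    return flags


# Invented example draft that fails all three checks.
draft = DraftQuestion(
    question_id="demo-1",
    total_marks=10,
    mark_scheme={"calculation": 6, "explanation": 3},
    answer_key=12500.0,
    worked_answer=12050.0,
)
print(deterministic_flags(draft))
```

Checks such as ambiguity, answer leakage, and false confidence would presumably need model debate or human rubrics rather than rules like these, which is why the sketch stops at the mechanical ones.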
Why did we consider it?
A private, reality-graded QA gate for AI-generated ACA tuition materials is defensible because it solves a concrete reputational-risk problem with auditable checks and fast ROI, in a market increasingly pressured to prove quality and value.
What breaks?
- Microscopic TAM: The UK ACA market is dominated by BPP/Kaplan; independent trainers lack the volume and budget to support £100K-£300K ARR.
- Workflow Mismatch: ACA prep relies on official ICAEW past papers, making net-new mock generation too infrequent to justify a recurring QA subscription.
- ROI Collapse: Flagging a complex accounting error doesn't fix it; tutors still bear the heavy cognitive load of rewriting the integrated scenario.
What did we learn?
Engine verdict: GATHER_MORE_SIGNAL (WORTH_SKIMMING). Credible QA wedge, but prove recurring paid pain in ACA providers before building past a manual audit.
Filter scores
Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.
| Axis | What it measures |
|---|---|
| data moat | Does this product accumulate proprietary data that compounds? |
| 10x model test | Does a better model make this more valuable, or redundant? |
| fast feedback loops | Can outputs be graded against reality in <30 days? |
| solo founder feasible | Can a solo operator build and run this without a team? |
| AI providers can't eat it | Do hyperscalers have structural reasons NOT to build this? |
Composite median: 10.0 / 15. Graduation threshold: 9.0. IQR across runs: 1.5.
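The arithmetic behind these figures can be sketched as follows. This assumes each run's composite is the sum of its five axis scores, that the median is taken across the three run composites, and that the IQR uses inclusive linear interpolation; the page does not state any of these. The per-axis scores and the short axis keys are invented for illustration, chosen so that the totals happen to match the reported numbers. They are not the actual run scores.

```python
from statistics import median, quantiles

GRADUATION_THRESHOLD = 9.0  # from the report

# Hypothetical per-axis scores (0-3) for three independent runs.
runs = [
    {"data_moat": 1, "ten_x": 2, "feedback": 3, "solo": 2, "provider_proof": 1},
    {"data_moat": 2, "ten_x": 2, "feedback": 3, "solo": 2, "provider_proof": 1},
    {"data_moat": 2, "ten_x": 3, "feedback": 3, "solo": 2, "provider_proof": 2},
]


def composite(run: dict[str, int]) -> int:
    """Sum the five axis scores for one run (maximum 15)."""
    return sum(run.values())


def iqr(values: list[float]) -> float:
    """Interquartile range using inclusive linear interpolation (an assumption)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


composites = [composite(r) for r in runs]
composite_median = median(composites)
spread = iqr(composites)
graduated = composite_median >= GRADUATION_THRESHOLD

print(composites, composite_median, spread, graduated)
```

With only three runs, the IQR is very sensitive to the interpolation method, so the spread is better read as a rough disagreement indicator than as a precise statistic.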
Evidence
Signal B — Competitor with documented gap
QuestionForge and similar tools focus on generating exam questions from source materials and rely on users to 'review' and 'refine' outputs, but the hypothesis is for a back-office QA gate that independently audits syllabus mapping, numerical correctness, answer-key consistency, mark-scheme alignment, ambiguity, leakage, unsupported explanations, and false confidence before release.
Signal D — Demand proxy
Primary institutional documents confirm a detailed ACA syllabus and a public procurement for ACA training, while adjacent market tools and open-source evaluator projects indicate activity around AI-generated assessments and AI output evaluation.
Sources:
- https://www.icaew.com/-/media/corporate/files/learning-and-development/next-generation-aca/aca-syllabus-handbook.ashx
- https://www.find-tender.service.gov.uk/Notice/032453-2022/PDF
- https://questionforge.ai/
- https://www.cogniguide.app/quizzes/exam-paper-marking-scheme
- https://github.com/learning-commons-org/evaluators/blo…
Evaluation history
| When | Stage | Phase |
|---|---|---|
| 2026-04-19 05:09 | deep_council_verdict | graduated |
| 2026-04-19 04:57 | deep_claude_take | graduated |
| 2026-04-19 04:55 | deep_90day_plan | graduated |
| 2026-04-19 04:35 | deep_risk | graduated |
| 2026-04-19 04:29 | deep_distribution | graduated |
| 2026-04-19 04:20 | deep_pricing | graduated |
| 2026-04-19 04:10 | deep_moat | graduated |
| 2026-04-19 04:04 | deep_buyer_sim | graduated |
| 2026-04-19 03:58 | deep_icp | graduated |
| 2026-04-19 03:48 | deep_competitor | graduated |
| 2026-04-19 03:37 | deep_market_reality | graduated |
| 2026-04-19 03:20 | filter_score | scored |
| 2026-04-19 03:10 | filter_score | scored |
| 2026-04-19 03:00 | filter_score | scored |
| 2026-04-19 02:50 | evidence_search | evidence_hunt |
| 2026-04-19 02:40 | evidence_search | argument |
| 2026-04-19 02:30 | audience_simulation | argument |
| 2026-04-19 02:20 | red_team_kill | argument |
| 2026-04-19 02:10 | steelman | argument |
| 2026-04-19 02:00 | genesis | argument |