Commitment Audit for Boutique Software Agencies
graduated · [B] · filter 10.0/15 · spread ±1.0 · signals: 2 independent
What is this?
A pre-send commitment audit for boutique software agencies that reviews proposals, SOWs, and estimates for unsupported promises before they reach the client. Instead of predicting eventual project profitability, the product identifies immediate, objective failure modes in the artifact itself: conclusions not supported by stated scope, missing exclusions, contradictory assumptions, timeline claims without dependency coverage, and confidence language that outruns the evidence provided. The agency submits the outbound document plus a compact constraint sheet covering team shape, delivery model, known unknowns, excluded work, and acceptable certainty range. AE runs adversarial debate over the commitments, then produces a structured red-team report with required revisions, explicit assumptions, and downgrade/kill recommendations for claims that cannot be defended. The grading loop is based on fast artifact-level outcomes, not months-later delivery results: whether flags were accepted, whether the proposal was revised, whether unsupported claims were removed, and whether client clarification requests matched the flagged gaps. This makes the product a defensible proposal-risk gate, not a project-margin oracle.
Why did we consider it?
A Commitment Audit is compelling because it solves an immediate, high-cost, pre-sale failure mode for boutique agencies with fast, objective artifact-level feedback rather than vague long-horizon project predictions.
What breaks?
- Incentive misalignment: Agencies overpromise to win competitive pitches; a tool demanding realistic constraints acts as a 'sales prevention' bottleneck.
- The 'Shelfware Audit' trap: Real-world evidence shows analytical audits are routinely ignored if they require painful implementation or risk immediate revenue.
- Self-defeating feedback loop: The fast grading mechanism (tracking if flags were accepted) will merely document the agency's refusal to downgrade their sales claims.
What did we learn?
Engine verdict: GATHER_MORE_SIGNAL (WORTH_SKIMMING). Real pain and whitespace, but no proof agencies will pay to remove the ambiguity that helps them win deals.
Filter scores
Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.
| Axis | What it measures |
|---|---|
| data moat | Does this product accumulate proprietary data that compounds? |
| 10x model test | Does a better model make this more valuable, or redundant? |
| fast feedback loops | Can outputs be graded against reality in <30 days? |
| solo founder feasible | Can a solo operator build and run this without a team? |
| AI providers can't eat it | Do hyperscalers have structural reasons NOT to build this? |
Composite median: 10.0 / 15. Graduation threshold: 9.0. IQR across runs: 1.0.
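The scoring arithmetic above (five axes scored 0-3 per run, three runs, per-axis median, composite summed against the 9.0 graduation threshold) can be sketched as follows. The per-axis scores below are invented for illustration; the report only publishes the composite, not the individual axis scores.

```python
from statistics import median

# The five filter axes from the table above.
AXES = [
    "data moat",
    "10x model test",
    "fast feedback loops",
    "solo founder feasible",
    "AI providers can't eat it",
]

GRADUATION_THRESHOLD = 9.0

def composite_score(runs):
    """Compute the composite as the sum of per-axis medians.

    runs: list of dicts mapping axis name -> score in 0..3, one dict
    per independent scoring run.
    Returns (composite, per_axis_medians).
    """
    per_axis = {axis: median(run[axis] for run in runs) for axis in AXES}
    return sum(per_axis.values()), per_axis

# Hypothetical scores from three independent runs (not the real ones).
runs = [
    {"data moat": 2, "10x model test": 2, "fast feedback loops": 3,
     "solo founder feasible": 2, "AI providers can't eat it": 1},
    {"data moat": 2, "10x model test": 3, "fast feedback loops": 3,
     "solo founder feasible": 2, "AI providers can't eat it": 1},
    {"data moat": 3, "10x model test": 2, "fast feedback loops": 3,
     "solo founder feasible": 1, "AI providers can't eat it": 1},
]

composite, per_axis = composite_score(runs)
print(composite, composite >= GRADUATION_THRESHOLD)  # → 10 True
```

With these made-up scores the composite lands at 10, matching the reported 10.0/15 and clearing the 9.0 graduation bar; the "spread" line reports the interquartile range of the composite across runs.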
Evidence
Signal B — Competitor with documented gap
Vern is positioned as AI contract review and clause risk analysis for signed/negotiated contracts, not as a pre-send proposal/SOW/estimate commitment audit for software agencies that checks unsupported delivery promises, scope-evidence mismatches, missing exclusions, contradictory assumptions, or timeline claims without dependency coverage.
Signal D — Demand proxy
Summary: Indirect evidence exists that freelancers/agencies experience scope creep, scattered scope changes, and client-project friction tied to unclear commitments and documentation; there are also public SOW examples/templates indicating ongoing interest in SOW structure.
Sources:
- https://www.reddit.com/r/FreelanceProgramming/comments/1r7cmzl/how_do_you_handle_scope_creep_and_late_payments.json
- https://www.reddit.com/r/webdev/comments/1htqvcs/i_just_had_the_worst_experience_with_a_client_it/
- https://github.com/joelparkerhenderson/statement-of-work/blob/main/README.md
- https://git…
Evaluation history
| When | Stage | Phase |
|---|---|---|
| 2026-04-19 18:32 | deep_council_verdict | graduated |
| 2026-04-19 18:24 | deep_claude_take | graduated |
| 2026-04-19 18:21 | deep_90day_plan | graduated |
| 2026-04-19 18:11 | deep_risk | graduated |
| 2026-04-19 18:02 | deep_distribution | graduated |
| 2026-04-19 17:54 | deep_pricing | graduated |
| 2026-04-19 17:43 | deep_moat | graduated |
| 2026-04-19 17:37 | deep_buyer_sim | graduated |
| 2026-04-19 17:30 | deep_icp | graduated |
| 2026-04-19 17:17 | deep_competitor | graduated |
| 2026-04-19 17:07 | deep_market_reality | graduated |
| 2026-04-19 16:50 | filter_score | scored |
| 2026-04-19 16:40 | filter_score | scored |
| 2026-04-19 16:30 | filter_score | scored |
| 2026-04-19 16:20 | evidence_search | argument |
| 2026-04-19 16:10 | audience_simulation | argument |
| 2026-04-19 16:00 | red_team_kill | argument |
| 2026-04-19 15:50 | steelman | argument |
| 2026-04-19 15:40 | genesis | argument |