Engineering Decision Ledger for Startup CTOs and Engineering Leads
Status: graduated [S] · Filter: 9.5/15 (spread ±1.0) · Signals: 2 independent
What is this?
An internal decision-quality system for founder-led SaaS engineering teams that turns technical proposals into tracked, reality-graded commitments. Instead of focusing only on rare migrations, the product covers recurring decisions: framework upgrades, vendor switches, reliability fixes, build-vs-buy calls, performance work, hiring/process changes, and major feature architecture choices. Teams submit a short structured decision brief with claims, assumptions, expected gains, risks, and kill criteria. AE runs adversarial challenge, forces premise-conclusion separation, and stores the decision as a living record with lifecycle states. Within weeks or months, outcomes are graded against reality: delivery date hit/miss, incident impact, rollback, cost change, latency change, support burden, or adoption. The result is a decision ledger showing which teams, leaders, and argument patterns are reliable, where overconfidence recurs, and which assumptions routinely fail. This is not a chatbot or architecture reviewer; it is a written operating layer for engineering judgment, using AE's prediction grading, autopsy taxonomy, and constraint language to create compounding organizational memory and accountability.
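The structured brief and lifecycle described above can be sketched as a record. This is a minimal illustration only: the field names, lifecycle states, and outcome keys are assumptions for clarity, not AE's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum

class Lifecycle(Enum):
    # Hypothetical lifecycle states; the product's actual states may differ.
    PROPOSED = "proposed"
    CHALLENGED = "challenged"   # adversarial challenge applied
    COMMITTED = "committed"     # decision shipped, awaiting reality
    GRADED = "graded"           # outcomes scored against the claims

@dataclass
class DecisionBrief:
    title: str
    claims: list[str]           # conclusions the proposer asserts
    assumptions: list[str]      # premises, kept separate from conclusions
    expected_gains: list[str]   # e.g. "p95 latency -30%"
    risks: list[str]
    kill_criteria: list[str]    # conditions that trigger rollback/abandon
    state: Lifecycle = Lifecycle.PROPOSED
    outcomes: dict[str, str] = field(default_factory=dict)  # graded vs reality
```

A brief would move through the states as evidence arrives, e.g. a vendor-switch decision graded weeks later on delivery hit/miss and cost change.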
Why did we consider it?
AE is well suited to become an internal engineering decision ledger because it turns recurring technical judgment into structured commitments that can later be graded against reality, creating clear ROI and a viable low-volume, high-ACV business.
What breaks?
- Developer Mutiny: Engineers already hate writing Jira tickets and ADRs; forcing them to write structured briefs for routine decisions introduces fatal friction.
- Attribution Noise: Missed deadlines or latency issues are often caused by shifting business requirements or team turnover, making objective grading of the original technical decision impossible.
- Change Management Mismatch: Implementing a culture-altering governance and accountability tool requires high-touch enterprise sales and onboarding, which a part-time solo founder cannot provide.
What did we learn?
Engine verdict: GATHER_MORE_SIGNAL (WORTH_SKIMMING). Promising pain, but the core grading loop and willingness to be judged are unproven—sell manual pilots before writing product code.
Filter scores
Five axes, each scored 0-3, across three independent runs from different model perspectives; the median composite is reported below the table.
| Axis | What it measures |
|---|---|
| data moat | Does this product accumulate proprietary data that compounds? |
| 10x model test | Does a better model make this more valuable, or redundant? |
| fast feedback loops | Can outputs be graded against reality in <30 days? |
| solo founder feasible | Can a solo operator build and run this without a team? |
| AI providers can't eat it | Do hyperscalers have structural reasons NOT to build this? |
Composite median: 9.5 / 15. Graduation threshold: 9.0. IQR across runs: 1.0.
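The composite statistics above can be reproduced with a short sketch. The per-run composites below are hypothetical values chosen only to yield the reported median (9.5) and IQR (1.0); the real axis scores are not shown in this document.

```python
from statistics import median, quantiles

# Illustrative composites from three independent scoring runs.
# Each composite is the sum of five axis scores in [0, 3], so max = 15.
composites = [9.0, 9.5, 10.0]  # assumed values, not the actual run data

composite_median = median(composites)   # median across the three runs
q1, _, q3 = quantiles(composites, n=4)  # exclusive-method quartiles
iqr = q3 - q1                           # spread across runs

graduated = composite_median >= 9.0     # graduation threshold from above
```

With these assumed composites, `composite_median` is 9.5 and `iqr` is 1.0, matching the reported figures, and the 9.0 threshold is cleared.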
Evidence
Signal B — Competitor with documented gap
Tenet AI is a decision ledger for AI agents, focused on auditability and replaying agent reasoning steps. It does not target human engineering teams tracking recurring technical decisions (framework upgrades, vendor switches) with reality-based outcome grading (delivery hit/miss, cost changes), and it lacks support for startup CTOs' build-vs-buy calls, adversarial challenge, or assumption autopsies.
Signal D — Demand proxy
Forum discussions on Reddit (r/softwarearchitecture) highlight pain with ADR tracking in Confluence lacking code integration and outcome evaluation; HN/Reddit threads describe teams repeating past mistakes because no historical grading exists; an arXiv position paper identifies epistemic gaps in AI-assisted decisions that need decay tracking. Sources:
- https://www.reddit.com/r/softwarearchitecture/comments/1dfo8tz/documenting_architecture_decision_records
- https://arxiv.org/abs/2601.21116
- https://www.reddit.com/r/ExperiencedDevs/comments/1fabmv9/how_do_you_make_complex_technical…
Evaluation history
| When | Stage | Phase |
|---|---|---|
| 2026-04-20 22:21 | deep_council_verdict | graduated |
| 2026-04-20 22:10 | deep_claude_take | graduated |
| 2026-04-20 22:08 | deep_90day_plan | graduated |
| 2026-04-20 21:58 | deep_risk | graduated |
| 2026-04-20 21:49 | deep_distribution | graduated |
| 2026-04-20 21:42 | deep_pricing | graduated |
| 2026-04-20 21:29 | deep_moat | graduated |
| 2026-04-20 21:20 | deep_buyer_sim | graduated |
| 2026-04-20 21:12 | deep_icp | graduated |
| 2026-04-20 21:02 | deep_competitor | graduated |
| 2026-04-20 20:38 | deep_market_reality | graduated |
| 2026-04-20 20:20 | filter_score | scored |
| 2026-04-20 20:10 | filter_score | scored |
| 2026-04-20 20:00 | filter_score | scored |
| 2026-04-20 19:50 | evidence_search | argument |
| 2026-04-20 19:40 | audience_simulation | argument |
| 2026-04-20 18:00 | red_team_kill | argument |
| 2026-04-20 17:54 | steelman | argument |
| 2026-04-20 17:11 | genesis | argument |