Pre-Hire Confidence Calibration for Heads of Talent on Senior IC Misfires
ranked [TRIANGULATED] filter 8.5/15 spread ±1.0 signals: 3 independent
What is this?
A pre-hire confidence-grading service for heads of talent (HoTs) at UK/US product, engineering, and boutique consulting firms of 50-200 people that hire senior ICs into £80-150k judgment-heavy roles. Before the offer is sent, the HoT enters a confidence score and reasoning bullets predicting whether the finalist will survive 90 days without a PIP, formal decision reversal, or exit. AE's adversarial multi-model debate generates a structured pre-offer challenge pack that the firm runs through its existing interview loop; the HoT records what came back. At 90 days, AE grades the pre-hire confidence against three objective HRIS-recorded events the HoT already pulls (PIP status, exit status, and formally documented decision reversal), never against subjective manager opinions. AE compounds these grades into a per-firm calibration profile: which pre-hire signals (challenge-pack performance, reasoning-bullet patterns) predict which objective 90-day outcomes for THIS firm. Hiring managers enter nothing; HoTs report on themselves using records that exist independently of AE.
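The 90-day grading step described above can be sketched as a Brier score over graded hires. This is a minimal illustration, not AE's actual scoring rule: the source does not say what scale the confidence score uses or which metric grades it, so the 0-1 probability scale, the `HireRecord` type, and the choice of Brier score are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class HireRecord:
    """One graded hire (hypothetical structure, not from the source)."""
    confidence: float        # assumed: HoT's pre-offer probability (0-1) of surviving 90 days
    pip: bool                # HRIS shows a PIP within 90 days
    exited: bool             # HRIS shows an exit within 90 days
    decision_reversed: bool  # HRIS shows a formally documented decision reversal

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    @property
    def survived(self) -> bool:
        # "Survived" means none of the three objective events occurred.
        return not (self.pip or self.exited or self.decision_reversed)


def brier_score(records: list[HireRecord]) -> float:
    """Mean squared gap between stated confidence and the 0/1 outcome."""
    if not records:
        raise ValueError("need at least one graded hire")
    return sum((r.confidence - float(r.survived)) ** 2 for r in records) / len(records)
```

Lower is better under this metric: a HoT who always enters 0.5 scores 0.25 regardless of outcomes, so a per-firm profile would only carry signal if it beats that baseline.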
Why did we consider it?
AE's reality-graded prediction engine maps 1:1 onto a publicly acknowledged confidence gap in senior IC hiring, with grading against objective HRIS records, a single-buyer workflow, and pricing math that fits the solo founder's £100-300K ARR target.
What breaks?
- 11th-hour candidate friction: Adding a challenge pack at the finalist stage will cause £80-150k candidates to abandon the pipeline.
- Misaligned HoT incentives: Talent leaders are KPI'd on time-to-fill and acceptance rates, not 90-day retention (which is blamed on Hiring Managers).
- Sparse data volume: 50-200 person firms do not hire enough senior ICs annually to generate a statistically significant calibration profile.
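The sparse-data objection can be made concrete with a confidence interval on a firm's 90-day survival rate. The sketch below uses the standard Wilson score interval; the hire counts are hypothetical and chosen only to contrast a small firm's annual volume with a sample ten times larger. They are not figures from the source.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Hypothetical volumes: (senior IC hires graded, hires that survived 90 days)
for hires, survived in [(6, 5), (60, 50)]:
    low, high = wilson_interval(survived, hires)
    print(f"{hires} hires: survival rate plausibly between {low:.2f} and {high:.2f}")
```

At the smaller volume the interval covers about half of the 0-1 range, which is the core of the objection: a per-firm profile built on a handful of hires per year cannot separate a well-calibrated HoT from a lucky one.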
What did we learn?
Still in evaluation (phase: ranked). No verdict yet.
Filter scores
Five axes, each scored 0-3. Three independent runs by different model perspectives. Median shown.
| Axis | What it measures |
|---|---|
| data moat | Does this product accumulate proprietary data that compounds? |
| 10x model test | Does a better model make this more valuable, or redundant? |
| fast feedback loops | Can outputs be graded against reality in <30 days? |
| solo founder feasible | Can a solo operator build and run this without a team? |
| AI providers can't eat it | Do hyperscalers have structural reasons NOT to build this? |
Composite median: 8.5 / 15. Graduation threshold: 9.0. IQR across runs: 1.0.
Evidence
Signal A — Primary source
We find that verbalized confidences emitted as output tokens are typically better-calibrated than the model's conditional probabilities.
Signal B — Competitor with documented gap
HackerEarth offers data-driven recruiting analytics (collecting and applying quantitative insights from talent data) but focuses on pre-hire assessment and screening metrics. It has no pre-offer confidence grading by the HoT, no adversarial challenge-pack generation, and no closed-loop calibration against 90-day HRIS-recorded outcomes (PIP, exit, decision reversal).
Signal D — Demand proxy
LinkedIn and HN discussions surface the exact pain points: recruiters confuse confidence with competence in senior hires, calibration failures in recruitment are recognized as systemic leadership problems, and managers hire defensively rather than for capability, all indicating latent demand for structured pre-hire confidence accountability.
Sources:
- https://www.linkedin.com/posts/tadthornton_the-infamous-false-calibration-activity-7452051459182944256-PkXI
- https://www.linkedin.com/posts/patrick-wicker_the-biggest-hiring-mistake-leaders-dont-activity-742635188576…
Evaluation history
| When | Stage | Phase |
|---|---|---|
| 2026-05-10 00:12 | filter_score | scored |
| 2026-05-10 00:06 | filter_score | scored |
| 2026-05-09 23:54 | filter_score | scored |
| 2026-05-09 23:49 | evidence_search | evidence_hunt |
| 2026-05-09 23:42 | evidence_search | evidence_hunt |
| 2026-05-09 23:36 | evidence_search | evidence_hunt |
| 2026-05-09 23:25 | evidence_search | argument |
| 2026-05-09 23:18 | audience_simulation | argument |
| 2026-05-09 23:12 | red_team_kill | argument |
| 2026-05-09 23:06 | steelman | argument |
| 2026-05-09 23:02 | genesis | argument |