Add judge-proposer agreement-rate sentinel tool

filter rejected TOOL reversible: simple 5h proposed 17 Jun 2026

What is the proposed change?

Add a new telemetry module, agreement_sentinel.js. On each council_verdict, it logs a row {run_id, proposer_lean, judge_verdict, agreement_bool}. The sentinel computes a rolling 50-run agreement rate per (proposer_model, judge_model) pair. When agreement on KEEP verdicts exceeds 0.85 over 50 runs, it writes a flag to the outbox section of brain/INDEX.md and emits a warning to Commander. This implements the warranted-disagreement check from the S160 cross-vendor principle. Wiring is a single call to sentinel.record() at the end of council_verdict.js.
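A minimal sketch of how the sentinel could look, under stated assumptions. The proposal does not define createSentinel, onFlag, or the "DROP" label, so they are hypothetical. "Agreement on KEEP verdicts" is read here as the share of KEEP-judged rows in the window where the proposer also leaned KEEP. The sketch also assumes that proposer_lean and judge_verdict share one label vocabulary, and that a flag is raised only once the window holds a full 50 runs. Writing to brain/INDEX.md and warning Commander are left to the onFlag callback.

```javascript
// Hypothetical sketch of agreement_sentinel.js. Only the row fields, the
// window of 50, and the 0.85 threshold come from the proposal text.
const WINDOW = 50;
const THRESHOLD = 0.85;

function createSentinel({ windowSize = WINDOW, threshold = THRESHOLD, onFlag = () => {} } = {}) {
  // "proposer_model|judge_model" -> most recent rows for that pair, oldest first
  const windows = new Map();

  // Share of KEEP-judged rows where the proposer agreed; null if no KEEP rows yet.
  function keepAgreementRate(rows) {
    const keeps = rows.filter((r) => r.judge_verdict === 'KEEP');
    if (keeps.length === 0) return null;
    return keeps.filter((r) => r.agreement_bool).length / keeps.length;
  }

  function record({ run_id, proposer_model, judge_model, proposer_lean, judge_verdict }) {
    const key = `${proposer_model}|${judge_model}`;
    const row = {
      run_id,
      proposer_lean,
      judge_verdict,
      // Assumption: agreement means the two labels are identical.
      agreement_bool: proposer_lean === judge_verdict,
    };
    const rows = windows.get(key) ?? [];
    rows.push(row);
    if (rows.length > windowSize) rows.shift(); // keep only the rolling window
    windows.set(key, rows);

    const currentRate = keepAgreementRate(rows);
    // Assumption: flag only on a full window. This fires on every record
    // while the rate stays above the threshold; deduplicate in onFlag if needed.
    if (rows.length === windowSize && currentRate !== null && currentRate > threshold) {
      onFlag({ pair: key, rate: currentRate, runs: rows.length });
    }
    return row;
  }

  function rate(proposer_model, judge_model) {
    return keepAgreementRate(windows.get(`${proposer_model}|${judge_model}`) ?? []);
  }

  return { record, rate };
}

// Usage: the single call the proposal places at the end of council_verdict.js.
const sentinel = createSentinel({ onFlag: (f) => console.warn('agreement flag', f) });
sentinel.record({
  run_id: 'r1',
  proposer_model: 'Sonnet 4.6',
  judge_model: 'gpt-5.5-codex',
  proposer_lean: 'KEEP',
  judge_verdict: 'DROP', // hypothetical non-KEEP label
});
```

Keeping the flag side effect behind a callback means the rate logic can be replayed against historical rows without touching brain/INDEX.md.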

Target files

hypothesis_engine/telemetry/agreement_sentinel.js
hypothesis_engine/moves/council_verdict.js

Expected effect

On the current run history (the most recent 50 verdicts), the sentinel will report a baseline agreement rate. If sycophancy is present (proposer = Sonnet 4.6, judge = gpt-5.5-codex), expect baseline <0.75; a rate climbing toward 0.85 across future runs would surface a calibration drift that Commander would otherwise miss.

Falsifier — what would prove this wrong?

Replay the last 100 verdicts and compute the agreement rate. If the rate is already >0.85 with no observed sycophancy problem, then the threshold is wrong or the signal is not discriminating, and the tool should be retuned or removed.
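The replay check could be sketched as follows. replayAgreement is a hypothetical helper, and it assumes the logged rows are available as an array of objects carrying agreement_bool, ordered oldest first. It reports only the rate; whether a sycophancy problem was observed remains a human judgment.

```javascript
// Hypothetical replay helper for the falsifier. The 100-verdict window and
// the 0.85 threshold come from the proposal; the function name and the input
// shape are assumptions.
function replayAgreement(rows, { lastN = 100, threshold = 0.85 } = {}) {
  const recent = rows.slice(-lastN); // most recent lastN verdicts
  const agreed = recent.filter((r) => r.agreement_bool).length;
  const rate = recent.length === 0 ? null : agreed / recent.length;
  return {
    runs: recent.length,
    rate,
    // true means the historical baseline already sits above the threshold
    exceedsThreshold: rate !== null && rate > threshold,
  };
}

// Usage with synthetic rows (not real run history): 90 agreements, 10 disagreements.
const synthetic = [
  ...Array.from({ length: 90 }, () => ({ agreement_bool: true })),
  ...Array.from({ length: 10 }, () => ({ agreement_bool: false })),
];
console.log(replayAgreement(synthetic));
```

If exceedsThreshold is true on real history and no sycophancy problem has been observed, the falsifier above applies.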

Evidence that triggered the proposal

D — brain/S160_CROSS_VENDOR_JUDGING.md — warranted-disagreement principle
T — LongJudgeBench finding: LLM judges unstable on long-form, sycophancy drift over time

Proposer self-score

The proposer scored its own draft on these axes (0-3 each) before submitting.

Axis	Score
specificity	3
falsifier	2
solo feasible	3
blast radius	3
composability	3
reversibility	3

Disposition

Rejected by filter_score. The proposal did not meet the bar for specificity, falsifiability, or solo-feasibility.

Evaluation history

When	Move
2026-06-17 04:05	meta_filter_score
2026-06-17 04:03	meta_genesis