Per-axis judge-spread veto gate in filter_score graduation

filter rejected · GATE · reversible: simple · 4h · proposed 16 Jun 2026
What is the proposed change?
After the three-run loop completes and composite_rank_score is computed (around lines 198-211), but before phase advances to 'ranked' (lines 245-247), compute the per-axis mean spread across the 3 runs from the runSpreads already stored in moves.output_json. For each of the 5 axes, mean_spread = mean(|hi - lo|) over the runs. If any axis has mean_spread >= 2.0, do not advance to 'ranked'; instead set phase='disputed' and write a moves row with move_type='spread_veto' naming the offending axis. Add a small reconciliation hook (re-runFilterScore with a third judge as tie-break) that Commander invokes manually; the auto-loop never calls it. composite_rank_score is still computed for visibility, but graduation is blocked.
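A minimal sketch of the gate described above, offered as an illustration and not as the actual contents of filter_score.js. The shape of runSpreads (one object per run, mapping axis name to { hi, lo }) and the function names meanSpreadByAxis and spreadVeto are assumptions; only the 2.0 threshold, the phase names and the mean(|hi - lo|) rule come from the proposal.

```javascript
// Threshold taken from the proposal; everything else here is illustrative.
const SPREAD_VETO_THRESHOLD = 2.0;

// ASSUMED input shape: [{ axisName: { hi, lo }, ... }, ...], one entry per run.
// Assumes every axis appears in every run, so dividing by the run count is valid.
function meanSpreadByAxis(runSpreads) {
  const totals = {};
  for (const run of runSpreads) {
    for (const [axis, { hi, lo }] of Object.entries(run)) {
      totals[axis] = (totals[axis] || 0) + Math.abs(hi - lo);
    }
  }
  const means = {};
  for (const [axis, total] of Object.entries(totals)) {
    means[axis] = total / runSpreads.length;
  }
  return means;
}

// Returns the phase the hypothesis should move to, plus the axes that tripped the veto.
function spreadVeto(runSpreads, threshold = SPREAD_VETO_THRESHOLD) {
  const means = meanSpreadByAxis(runSpreads);
  const offendingAxes = Object.keys(means).filter((axis) => means[axis] >= threshold);
  return {
    phase: offendingAxes.length > 0 ? 'disputed' : 'ranked',
    offendingAxes,
    means,
  };
}

// Example using the 'fast_feedback_loops' figures quoted under "Expected effect".
const result = spreadVeto([
  { fast_feedback_loops: { hi: 3, lo: 0 } },
  { fast_feedback_loops: { hi: 3, lo: 0 } },
  { fast_feedback_loops: { hi: 2, lo: 0 } },
]);
console.log(result.phase, result.offendingAxes);
```

Returning the offending axes alongside the phase keeps the caller free to decide how the 'spread_veto' moves row is written; the sketch deliberately does not touch storage.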
Target files
hypothesis_engine/moves/filter_score.js
Expected effect
Hypotheses where the HIGH and LOW judges disagree systematically on one filter will land in 'disputed' rather than 'ranked'. For example, 'fast_feedback_loops' with hi=3 lo=0, hi=3 lo=0, hi=2 lo=0 gives spreads of 3, 3 and 2, so mean_spread = 8/3 ≈ 2.67. Back-of-envelope: ~15-25% of currently-ranked hypotheses have ≥1 axis with mean spread ≥2 (visible in moves.output_json). That cohort will be redirected to 'disputed'.
Falsifier — what would prove this wrong?
Replay the last 60 days of filter_score moves against the proposed gate. If <5% or >40% of historically-ranked hypotheses are re-classified as 'disputed', the 2.0 threshold is miscalibrated: the gate is either a no-op or it stalls the pipeline. Either outcome invalidates the chosen threshold, not the mechanism itself; the proposal would then need recalibration.
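The falsifier check could be sketched as follows. This is a hedged illustration: the function name evaluateThreshold, the verdict labels and the input (one number per historically-ranked hypothesis, giving its largest per-axis mean spread over the replay window) are assumptions. The 5% and 40% bounds and the 2.0 threshold are the ones stated above.

```javascript
// ASSUMED input: maxMeanSpreads[i] is the largest per-axis mean spread of the
// i-th historically-ranked hypothesis in the replay window.
function evaluateThreshold(maxMeanSpreads, threshold = 2.0) {
  const total = maxMeanSpreads.length;
  if (total === 0) {
    return { fraction: null, verdict: 'no_data' };
  }
  // A hypothesis would be re-classified as 'disputed' if any axis meets the threshold.
  const disputed = maxMeanSpreads.filter((spread) => spread >= threshold).length;
  const fraction = disputed / total;

  let verdict = 'calibrated';
  if (fraction < 0.05) {
    verdict = 'no_op'; // the gate almost never fires
  } else if (fraction > 0.40) {
    verdict = 'too_aggressive'; // the gate blocks too much of the pipeline
  }
  return { fraction, verdict };
}

// Example: 2 of 10 replayed hypotheses meet the threshold.
console.log(evaluateThreshold([0.3, 1, 2.67, 1.33, 0, 2, 0.67, 1, 1.67, 0.33]));
```

Passing the threshold as a parameter makes it straightforward to sweep candidate values if the 2.0 setting falls outside the 5-40% band.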
Evidence that triggered the proposal
  • E — hypothesis_engine/moves/filter_score.js:162-179 (runSpreads computed and stored per run, but never gates graduation)
  • E — hypothesis_engine/moves/filter_score.js:6 ('absolute score floor enforced in graduation logic (elsewhere); this module just scores' — spread floor is not enforced anywhere)

Proposer self-score

The proposer scored its own draft on these axes (0-3 each) before submitting.

Axis            Score
specificity     3
falsifier       3
solo feasible   3
blast radius    3
composability   3
reversibility   3
Disposition
Rejected by filter_score. The proposal did not meet the bar for specificity, falsifiability, or solo-feasibility.

Evaluation history

When              Move
2026-06-16 04:05  meta_filter_score
2026-06-16 04:04  meta_genesis