Add confidence-weighted aggregation in council_verdict (RLCR-style)

filter rejected | PROMPT | reversible: simple | 3h | proposed 21 Jun 2026

What is the proposed change?

Modify the council_verdict prompt so that each judge emits a verdict (KEEP or KILL), a numeric confidence (0-100), and a one-sentence rationale. The existing JS aggregator replaces majority vote with a confidence-weighted sum: each KEEP contributes +confidence, each KILL contributes -confidence, and the hypothesis is kept if the total is greater than 0. Per-judge confidence is persisted in the verdict_details JSON column, which already exists.
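A minimal sketch of the aggregation rule described above, not the actual contents of council_verdict.js. The function name aggregateVerdicts, the judge object shape ({ judge, verdict, confidence, rationale }), and the layout of the verdictDetails string are assumptions made for illustration. The proposal does not say what happens when the sum is exactly 0; the sketch follows the literal "> 0" rule, so a zero sum is not kept.

```javascript
// Hypothetical judge output: { judge, verdict: "KEEP" | "KILL", confidence: 0-100, rationale }
function aggregateVerdicts(judgements) {
  let score = 0;
  for (const j of judgements) {
    if (j.verdict !== "KEEP" && j.verdict !== "KILL") {
      throw new Error(`unknown verdict: ${j.verdict}`);
    }
    if (!Number.isFinite(j.confidence) || j.confidence < 0 || j.confidence > 100) {
      throw new Error(`confidence out of range: ${j.confidence}`);
    }
    // KEEP adds the judge's confidence, KILL subtracts it.
    score += (j.verdict === "KEEP" ? 1 : -1) * j.confidence;
  }
  return {
    kept: score > 0, // literal "> 0" rule: a zero sum is not kept
    score,
    // Assumed layout for the verdict_details JSON column.
    verdictDetails: JSON.stringify({ score, judges: judgements }),
  };
}

// Two lukewarm KEEPs against one confident KILL: majority vote would keep,
// the weighted sum (40 + 35 - 90 = -15) kills.
const exampleJudgements = [
  { judge: "a", verdict: "KEEP", confidence: 40, rationale: "Plausible but thin evidence." },
  { judge: "b", verdict: "KEEP", confidence: 35, rationale: "No obvious flaw." },
  { judge: "c", verdict: "KILL", confidence: 90, rationale: "Contradicts an existing trace." },
];
const outcome = aggregateVerdicts(exampleJudgements);
console.log(outcome.kept, outcome.score);
```

Validating the verdict and confidence before summing keeps a malformed judge response from silently shifting the total.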

Target files

hypothesis_engine/moves/council_verdict.js

Expected effect

On the last 30 council verdicts, at least 4 flip, either KEEP→KILL (a high-confidence dissenter outweighs a low-confidence majority) or KILL→KEEP. The commander override rate drops by ≥25% on the flipped subset.

Falsifier — what would prove this wrong?

Replay the last 30 verdicts with the new aggregation. If there are 0 flips, or the commander still overrides at the same rate on the flipped subset, the weighting does not capture signal.
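The replay check could be sketched roughly as follows. The record shape ({ id, judgements, commanderOverrode }) and the function names are hypothetical, and the sample data is invented purely to exercise the code. The sketch also assumes that a commander override means the commander reversed the council outcome, so on a flipped verdict the new outcome would be overridden exactly when the old one was not.

```javascript
// Proposed rule: signed confidence sum, kept only if strictly positive.
function weightedKept(judgements) {
  const score = judgements.reduce(
    (sum, j) => sum + (j.verdict === "KEEP" ? 1 : -1) * j.confidence,
    0
  );
  return score > 0;
}

// Baseline: simple majority of KEEP votes (a tie counts as not kept; an assumption).
function majorityKept(judgements) {
  const keeps = judgements.filter((j) => j.verdict === "KEEP").length;
  return keeps > judgements.length - keeps;
}

function replay(records) {
  // A flip is any record where the two rules disagree.
  const flipped = records.filter(
    (r) => majorityKept(r.judgements) !== weightedKept(r.judgements)
  );
  const oldOverrides = flipped.filter((r) => r.commanderOverrode).length;
  // Assumption: on a flipped record the new verdict matches the commander
  // exactly when the old verdict was overridden.
  const newOverrides = flipped.length - oldOverrides;
  const rate = (n) => (flipped.length === 0 ? 0 : n / flipped.length);
  const oldRate = rate(oldOverrides);
  const newRate = rate(newOverrides);
  // Falsifier from the proposal: no flips, or no drop in override rate.
  const falsified = flipped.length === 0 || newRate >= oldRate;
  return { flips: flipped.length, oldRate, newRate, falsified };
}

// Invented sample data, for illustration only.
const sampleRecords = [
  { id: 1, commanderOverrode: true, judgements: [
    { verdict: "KEEP", confidence: 40 }, { verdict: "KEEP", confidence: 35 },
    { verdict: "KILL", confidence: 90 } ] },
  { id: 2, commanderOverrode: false, judgements: [
    { verdict: "KEEP", confidence: 90 }, { verdict: "KEEP", confidence: 80 },
    { verdict: "KILL", confidence: 20 } ] },
  { id: 3, commanderOverrode: true, judgements: [
    { verdict: "KILL", confidence: 30 }, { verdict: "KILL", confidence: 20 },
    { verdict: "KEEP", confidence: 95 } ] },
];
const replayResult = replay(sampleRecords);
console.log(replayResult);
```

A real replay would also need the ≥4 flips and ≥25% thresholds from the expected effect. They are left out here because the proposal does not say whether the 25% drop is relative or absolute.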

Evidence that triggered the proposal

T — daily digest: RLCR confidence-calibrated LLM judging
E — engine traces: 4 commander overrides on council-passed hyps

Proposer self-score

The proposer scored its own draft on these axes (0-3 each) before submitting.

Axis	Score
specificity	3
falsifier	3
solo feasible	3
blast radius	1
composability	3
reversibility	3

Disposition

Rejected by filter_score. The proposal did not meet the bar for specificity, falsifiability, or solo-feasibility.

Evaluation history

When	Move
2026-06-21 04:05	meta_filter_score
2026-06-21 04:03	meta_genesis