Add argument-trace summarizer harness around judge call

filter rejected | HARNESS | reversible: simple | 7h | proposed 17 Jun 2026

What is the proposed change?

Wrap the argument output in a summarizer harness before it flows into council_verdict. The harness compresses the argument trace into a structured summary {claims:[], strongest_objection, residual_uncertainty} and passes that summary to the judge. The judge prompt receives the summary, not the raw trace. The raw trace is persisted to disk for Commander review but is never sent to the judge. A --raw_trace_judge flag disables the harness so the two modes can be compared A/B.
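
A minimal sketch of the routing described above, assuming the summarizer and the disk writer are injected as plain functions. Every name here (prepareJudgeInput, summarize, persist, rawTraceJudge) is hypothetical and does not come from the existing argument.js or council_verdict.js code. Persisting the raw trace in both modes is also an assumption.

```javascript
// Hypothetical sketch; none of these names exist in hypothesis_engine.
const REQUIRED_FIELDS = ['claims', 'strongest_objection', 'residual_uncertainty'];

// Reject summaries that do not match the shape named in the proposal.
function validateSummary(summary) {
  for (const key of REQUIRED_FIELDS) {
    if (!(key in summary)) {
      throw new Error(`summary missing field: ${key}`);
    }
  }
  if (!Array.isArray(summary.claims)) {
    throw new Error('summary.claims must be an array');
  }
  return summary;
}

// Decide what the judge sees.
// options.summarize: (rawTrace) => summary object (assumed to wrap an LLM call).
// options.persist:   (rawTrace) => void (assumed to write to disk for Commander review).
// options.rawTraceJudge: mirrors the --raw_trace_judge flag; true bypasses the harness.
function prepareJudgeInput(rawTrace, options) {
  const { rawTraceJudge = false, summarize, persist } = options;

  // Assumption: the raw trace is stored in both modes so the A/B arms stay comparable.
  persist(rawTrace);

  if (rawTraceJudge) {
    return { mode: 'raw', payload: rawTrace };
  }
  return { mode: 'summary', payload: validateSummary(summarize(rawTrace)) };
}
```

Injecting summarize and persist keeps the routing decision testable without a model call or filesystem access, and the flag check stays in one place so the harness can be switched off cleanly.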

Target files

hypothesis_engine/moves/argument.js
hypothesis_engine/moves/council_verdict.js

Expected effect

On a 30-run A/B (15 with the summarizer, 15 raw), judge verdicts in summarizer mode show lower variance: the standard deviation of the composite KEEP/KILL score across two judge re-runs of the same candidate drops by more than 20%, as predicted by the reasoning-trace fluency-trap finding. Token cost of the judge call drops by roughly 40%.
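
One way the variance criterion could be computed, as a sketch only. It assumes the composite KEEP/KILL verdict is available as a number per judge run and that per-candidate standard deviations are averaged within each arm; the proposal does not specify either detail. The function names are hypothetical.

```javascript
// Sample standard deviation (n - 1 denominator); 0 when fewer than two values.
function stdev(values) {
  const n = values.length;
  if (n < 2) return 0;
  const mean = values.reduce((acc, x) => acc + x, 0) / n;
  const sumSquares = values.reduce((acc, x) => acc + (x - mean) ** 2, 0);
  return Math.sqrt(sumSquares / (n - 1));
}

// rerunsPerCandidate: array of arrays, one inner array of composite scores
// per candidate (two judge re-runs each in the proposed A/B).
function meanRerunStdev(rerunsPerCandidate) {
  const perCandidate = rerunsPerCandidate.map(stdev);
  return perCandidate.reduce((acc, x) => acc + x, 0) / perCandidate.length;
}

// Fractional drop in re-run stdev going from raw mode to summarizer mode.
// A value above 0.20 would meet the expected-effect threshold.
function relativeStdevDrop(rawReruns, summaryReruns) {
  const raw = meanRerunStdev(rawReruns);
  const summary = meanRerunStdev(summaryReruns);
  if (raw === 0) return 0; // no raw-mode variance to reduce
  return (raw - summary) / raw;
}
```

With only two re-runs per candidate, each per-candidate estimate is noisy, so the averaged figure is best read as a rough indicator.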

Falsifier — what would prove this wrong?

If summarizer-mode judge verdicts diverge from raw-mode verdicts on >25% of candidates AND raw-mode correlates better with Commander overrides on those candidates, the summary is dropping load-bearing detail and the harness should be disabled.
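
The falsifier can be sketched as a check over per-candidate records. This assumes each candidate has a verdict from both modes plus a Commander verdict, and it reads "correlates better" as a simple agreement count on the divergent candidates; both are interpretations rather than details stated in the proposal. The record fields and function name are hypothetical.

```javascript
// records: [{ summaryVerdict, rawVerdict, commanderVerdict }, ...]
// Verdicts are assumed to be comparable values such as 'KEEP' or 'KILL'.
function checkFalsifier(records, divergenceThreshold = 0.25) {
  if (records.length === 0) {
    return { divergenceRate: 0, rawAgreement: 0, summaryAgreement: 0, falsified: false };
  }

  // Candidates where the two judge modes disagree.
  const divergent = records.filter((r) => r.summaryVerdict !== r.rawVerdict);
  const divergenceRate = divergent.length / records.length;

  // On divergent candidates, count which mode matches the Commander.
  const countAgreement = (field) =>
    divergent.filter((r) => r[field] === r.commanderVerdict).length;
  const rawAgreement = countAgreement('rawVerdict');
  const summaryAgreement = countAgreement('summaryVerdict');

  // Both conditions must hold: divergence above threshold AND raw mode tracks the Commander better.
  const falsified = divergenceRate > divergenceThreshold && rawAgreement > summaryAgreement;
  return { divergenceRate, rawAgreement, summaryAgreement, falsified };
}
```

Returning the intermediate counts alongside the boolean makes it possible to see which of the two conditions failed when the check does not trigger.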

Evidence that triggered the proposal

T — Reasoning-trace fluency trap: summary outperforms full trace for judging
E — move_cost_rollup: argument+judge dominate per-run cost

Proposer self-score

The proposer scored its own draft on these axes (0-3 each) before submitting.

Axis	Score
specificity	3
falsifier	3
solo feasible	2
blast radius	2
composability	3
reversibility	3

Disposition

Rejected by filter_score. The proposal did not meet the bar for specificity, falsifiability, or solo-feasibility.

Evaluation history

When	Move
2026-06-17 04:06	meta_filter_score
2026-06-17 04:03	meta_genesis