Add pre-council self-attack skill to prune weak hypotheses before council budget spend

filter rejected SKILL reversible: simple 6h proposed 7 Jun 2026

What is the proposed change?

Add a `self_attack` move that runs after argument and before council_verdict. The move asks the same author model that produced the hypothesis to (1) write the single strongest fatal objection to its own hypothesis, (2) write its best defense against that objection, and (3) output a defensibility_score from 0 to 3. Hypotheses with defensibility_score ≤ 1 skip council_verdict: they are routed to revise (one round only) or killed with reason `self_attack_collapse`. For calibration, persist {hypothesis_id, objection, defense, defensibility_score, council_outcome_if_run}.
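A minimal sketch of the routing rule described above, not the actual contents of self_attack.js. The function names `routeAfterSelfAttack` and `buildCalibrationRecord`, the `revisionsUsed` counter, and the reading of "revise (one round) or kill" as "revise once, then kill on a second collapse" are assumptions made for illustration; only the field names, the 0–3 score range, the ≤ 1 threshold, and the `self_attack_collapse` reason come from the proposal.

```javascript
// Hypothetical sketch: helper names and the revisionsUsed counter are assumptions.

const COLLAPSE_THRESHOLD = 1; // scores <= 1 skip council_verdict, per the proposal

// Decide the next move for a hypothesis after self_attack has scored it.
// `result` carries the model output; `revisionsUsed` counts prior revise rounds.
function routeAfterSelfAttack(result, revisionsUsed) {
  const score = result.defensibility_score;
  if (!Number.isInteger(score) || score < 0 || score > 3) {
    throw new RangeError(`defensibility_score must be an integer 0-3, got ${score}`);
  }
  if (score > COLLAPSE_THRESHOLD) {
    return { next: "council_verdict" };
  }
  // Assumption: the single allowed revise round is tried before killing.
  if (revisionsUsed < 1) {
    return { next: "revise" };
  }
  return { next: "kill", reason: "self_attack_collapse" };
}

// Build the record persisted for calibration. council_outcome_if_run stays
// null when council_verdict was skipped.
function buildCalibrationRecord(result, councilOutcome = null) {
  return {
    hypothesis_id: result.hypothesis_id,
    objection: result.objection,
    defense: result.defense,
    defensibility_score: result.defensibility_score,
    council_outcome_if_run: councilOutcome,
  };
}
```

Keeping the routing decision a pure function of the score and the revision count would make the move easy to remove again, which fits the proposal's reversibility claim.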

Target files

hypothesis_engine/moves/self_attack.js
hypothesis_engine/pipeline.js

Expected effect

Of the roughly 2 fatal_objection_both_confirm kills per week, about 60% (roughly 1 per week) are caught at self_attack instead, freeing council budget. Council budget per surviving hypothesis rises by more than 15%. The self_attack defensibility_score correlates positively (r > 0.4) with council survival.

Falsifier — what would prove this wrong?

After 20 hypotheses have passed through self_attack, the skill should be removed if either (a) the number of fatal_objection_both_confirm kills at council does not drop, or (b) defensibility_score correlates with council outcome at r < 0.4. Either result means the skill is not learning the same fatal-objection signal that council uses.
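The falsifier check could be computed from the persisted calibration records roughly as follows. This is a sketch under stated assumptions: the function names, the `"survived"` outcome label, the baseline and observed kill counts passed in as plain numbers, and the choice to treat an undefined correlation as a failure are all hypothetical. Only records where council_verdict actually ran can contribute to the correlation.

```javascript
// Hypothetical sketch: function names and the "survived" label are assumptions.

// Pearson correlation coefficient; returns NaN when it is undefined
// (fewer than two points, or zero variance in either series).
function pearson(xs, ys) {
  const n = xs.length;
  if (n !== ys.length || n < 2) return NaN;
  const mean = (a) => a.reduce((sum, v) => sum + v, 0) / n;
  const mx = mean(xs);
  const my = mean(ys);
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - mx;
    const dy = ys[i] - my;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  if (sxx === 0 || syy === 0) return NaN;
  return sxy / Math.sqrt(sxx * syy);
}

// Evaluate both falsifier conditions from the proposal.
// baselineKills / observedKills: fatal_objection_both_confirm counts at council
// before and after the skill was enabled, over comparable windows (assumed inputs).
function checkFalsifier(records, baselineKills, observedKills) {
  const ran = records.filter((rec) => rec.council_outcome_if_run !== null);
  const scores = ran.map((rec) => rec.defensibility_score);
  const survived = ran.map((rec) => (rec.council_outcome_if_run === "survived" ? 1 : 0));
  const r = pearson(scores, survived);
  const killsDidNotDrop = observedKills >= baselineKills; // condition (a)
  // Condition (b); assumption: an undefined r counts as a weak correlation.
  const weakCorrelation = Number.isNaN(r) || r < 0.4;
  return { r, killsDidNotDrop, weakCorrelation, remove: killsDidNotDrop || weakCorrelation };
}
```

Because council survival is coded as 0 or 1, this Pearson coefficient is the point-biserial correlation between the score and the outcome.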

Evidence that triggered the proposal

E — kill_reason_distribution_7d: fatal_objection_both_confirm = 2
D — council_verdict architecture: dual-judge fatal-objection confirm
D — S178-S179 handoff contracts (atomic move wrappers)

Proposer self-score

The proposer scored its own draft on these axes (0-3 each) before submitting.

Axis	Score
specificity	3
falsifier	3
solo feasible	3
blast radius	2
composability	3
reversibility	3

Disposition

Rejected by filter_score. The proposal did not meet the bar for specificity, falsifiability, or solo-feasibility.

Evaluation history

When	Move
2026-06-12 04:23	meta_filter_score
2026-06-07 04:04	meta_genesis