Add v2_a11 evidence-density axis to filter_score

filter rejected · AXIS · reversible: simple · 4h · proposed 7 Jun 2026

What is the proposed change?

Add axis `a11_evidence_density` to the v2 composite, scored 0–3 by counting citations to external (non-AE-authored) corpus items:

0 = no such citations
1 = exactly one citation
2 = ≥2 citations, all from the same corpus source (e.g. all arxiv, or all SerpAPI)
3 = ≥2 citations spanning at least two distinct corpus sources (e.g. one arxiv + one digest external signal)

Evaluate the axis in filter_score.js alongside the existing a1..a10 and include it in the composite mean. Existing axes are not reweighted.
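The rubric can be sketched as a small pure function. This is a minimal illustration only: the function name `scoreA11EvidenceDensity` and the citation shape (`source`, `aeAuthored`) are assumptions made for the example, not the actual data model in filter_score.js.

```javascript
// Hypothetical citation shape: { source: "arxiv", aeAuthored: false }.
// Only citations to non-AE-authored corpus items count toward the score.
function scoreA11EvidenceDensity(citations) {
  const external = citations.filter((c) => !c.aeAuthored);

  if (external.length === 0) return 0; // no external grounding
  if (external.length === 1) return 1; // a single citation

  // Two or more citations: check whether they span distinct corpus sources.
  const distinctSources = new Set(external.map((c) => c.source));
  return distinctSources.size >= 2 ? 3 : 2;
}

// Example inputs (made-up data).
const sameSource = [
  { source: "arxiv", aeAuthored: false },
  { source: "arxiv", aeAuthored: false },
];
const crossSource = [
  { source: "arxiv", aeAuthored: false },
  { source: "digest", aeAuthored: false },
];
const scoreSame = scoreA11EvidenceDensity(sameSource);
const scoreCross = scoreA11EvidenceDensity(crossSource);
```

Keeping the axis a pure function of the citation list would make it easy to unit-test separately from the composite.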

Target files

hypothesis_engine/moves/filter_score.js

Expected effect

Hypotheses authored from a single internal hunch, with no external grounding, score 0 on a11 instead of 3 and lose ~2 points of composite. Evidence-grounded proposals aligned with MANIFESTO-v4 widen their composite lead over speculative ones by 1.5–2.5 points.
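The size of the effect depends on how the composite is scaled, which is not stated here. As a rough check, the sketch below assumes the composite is a plain unweighted mean of eleven axes, each scored 0–3. The a1..a10 scores are made-up placeholders and `compositeMean` is a hypothetical helper.

```javascript
// Unweighted mean of axis scores (assumed composite definition).
function compositeMean(axisScores) {
  const total = axisScores.reduce((sum, s) => sum + s, 0);
  return total / axisScores.length;
}

// Placeholder scores for the ten existing axes (illustrative only).
const a1toA10 = Array(10).fill(2);

const grounded = compositeMean([...a1toA10, 3]);   // a11 = 3
const ungrounded = compositeMean([...a1toA10, 0]); // a11 = 0
const gap = grounded - ungrounded;                 // equals 3 / 11
```

Under this assumption the swing from a11 alone is 3/11, about 0.27 points, so the ~2-point figure presumably refers to a differently scaled composite.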

Falsifier — what would prove this wrong?

Sample 30 high-a11 (≥2) and 30 low-a11 (≤1) graduated hypotheses. Have Commander blind-rate each on the single question 'Is this real?' (0–3). If the two rating distributions are statistically indistinguishable (two-sample KS test, p > 0.2), a11 does not track evidentiary realness and is misdesigned.
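The falsifier check could be run with a small two-sample Kolmogorov-Smirnov routine. The sketch below is an assumption-laden illustration: `ksTwoSample` and `a11IsFalsified` are hypothetical names, and the p-value uses the standard asymptotic series with a small-sample correction, which is only approximate for heavily tied 0–3 ratings.

```javascript
// Two-sample KS test: returns the statistic d and an asymptotic p-value.
function ksTwoSample(a, b) {
  // Evaluate both empirical CDFs at every distinct observed value.
  const values = [...new Set([...a, ...b])].sort((x, y) => x - y);
  let d = 0;
  for (const v of values) {
    const fa = a.filter((x) => x <= v).length / a.length;
    const fb = b.filter((x) => x <= v).length / b.length;
    d = Math.max(d, Math.abs(fa - fb));
  }

  // Effective sample size and scaled statistic.
  const ne = (a.length * b.length) / (a.length + b.length);
  const lambda = (Math.sqrt(ne) + 0.12 + 0.11 / Math.sqrt(ne)) * d;

  // For very small lambda the series converges poorly and p is ~1.
  if (lambda < 0.2) return { d, p: 1 };

  // Q(lambda) = 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 lambda^2)
  let p = 0;
  for (let j = 1; j <= 100; j++) {
    const sign = j % 2 === 1 ? 1 : -1;
    p += 2 * sign * Math.exp(-2 * j * j * lambda * lambda);
  }
  return { d, p: Math.min(1, Math.max(0, p)) };
}

// a11 is considered falsified when the rating distributions overlap.
function a11IsFalsified(highRatings, lowRatings, threshold = 0.2) {
  return ksTwoSample(highRatings, lowRatings).p > threshold;
}
```

With 30 ratings per group, `a11IsFalsified(high, low)` returns `true` when the blind ratings fail to separate the high-a11 and low-a11 samples at the stated threshold.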

Evidence that triggered the proposal

D — MANIFESTO_v4 (Honest/Deflated)
D — forecaster v1 evidence-emergence grader spec
T — digest external signals corpus (NBJ/Willison analyses cited but inconsistently)

Proposer self-score

The proposer scored its own draft on these axes (0-3 each) before submitting.

Axis	Score
specificity	3
falsifier	3
solo feasible	3
blast radius	3
composability	3
reversibility	3

Disposition

Rejected by filter_score. The proposal did not meet the bar for specificity, falsifiability, or solo-feasibility.

Evaluation history

When	Move
2026-06-12 04:24	meta_filter_score
2026-06-07 04:04	meta_genesis