
Add Gemini as concurrent second judge to meta_filter_score (require both KEEP)

council rejected · HARNESS · reversible: simple · 3h · proposed 23 Jun 2026
What is the proposed change?
In scoreProposal (currently a single-LLM call to Codex gpt-5.5), also call llm.callGemini with the same FILTER_SYSTEM and userPrompt, running the two calls in parallel via Promise.all. Parse both responses. Verdict mapping: KEEP if and only if both judges return KEEP; otherwise DROP with reason 'disagreement: codex=<v> gemini=<v> codex_reason=<...> gemini_reason=<...>'. Persist both responses in moveFields.output as {codex, gemini, verdict, disagreement}. Preserve the existing transition wrapper and dry-run path. Gate the change behind a META_DUAL_JUDGE_DISABLE env flag so it can be rolled back. This matches the dual-LLM dialectic already used in hypothesis_engine/moves/filter_score.js (high=Opus, low=Gemini).
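The verdict mapping described above can be sketched roughly as follows. This is a minimal illustration, not the code in meta_engine/moves/filter_score.js: the judge functions are injected as plain async callbacks standing in for the real Codex and llm.callGemini calls, and the names combineVerdicts, scoreProposalDual, the { verdict, reason } response shape, and the '1' value for the env flag are all assumptions made for the example.

```javascript
// Hypothetical sketch of the dual-judge mapping. Each judge is assumed to
// resolve to an already-parsed object of the form { verdict, reason }.

// Combine two parsed judge responses: KEEP only when both judges say KEEP.
function combineVerdicts(codex, gemini) {
  const bothKeep = codex.verdict === 'KEEP' && gemini.verdict === 'KEEP';
  const verdict = bothKeep ? 'KEEP' : 'DROP';
  const disagreement = codex.verdict !== gemini.verdict;
  const reason = bothKeep
    ? null
    : `disagreement: codex=${codex.verdict} gemini=${gemini.verdict} ` +
      `codex_reason=${codex.reason} gemini_reason=${gemini.reason}`;
  // `output` mirrors the shape proposed for moveFields.output.
  return { verdict, reason, output: { codex, gemini, verdict, disagreement } };
}

// Run both judges concurrently unless the rollback flag is set.
// Assumption: the flag counts as set when its value is the string '1'.
async function scoreProposalDual(judges, system, userPrompt, env = process.env) {
  if (env.META_DUAL_JUDGE_DISABLE === '1') {
    // Rollback path: fall back to the single-judge behavior.
    const codex = await judges.codex(system, userPrompt);
    return { verdict: codex.verdict, reason: codex.reason, output: { codex } };
  }
  // Promise.all rejects if either call rejects; the proposal does not say
  // how a failed judge call should be handled, so this sketch lets it throw.
  const [codex, gemini] = await Promise.all([
    judges.codex(system, userPrompt),
    judges.gemini(system, userPrompt),
  ]);
  return combineVerdicts(codex, gemini);
}
```

Keeping combineVerdicts as a pure function separate from the network calls means the mapping can be unit-tested without either vendor, and the dry-run path could reuse it unchanged.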
Target files
meta_engine/moves/filter_score.js
Expected effect
KEEP rate drops 15-40% vs. the single-judge baseline. Council-stage and Commander-stage rejection rates on meta proposals that survived the filter drop measurably, because surviving proposals have cleared judges from two independent vendor families.
Falsifier — what would prove this wrong?
After 30 days, compare the Commander-reject rate on dual-judged meta proposals to that of the prior single-judge cohort. If the Commander-reject rate is unchanged or the KEEP rate has not dropped, the asymmetry was not the constraint and the harness should be removed.
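One way the 30-day comparison could be computed is sketched below. The cohort shape ({ scored, kept, commanderRejected }) and the function names are hypothetical, the Commander-reject rate is taken over proposals that survived the filter, and "unchanged" is read as "not strictly lower" because the proposal does not define a threshold.

```javascript
// Hypothetical sketch of the falsifier check. A cohort is assumed to be
// { scored, kept, commanderRejected }, counted over the comparison window.

// Derive the two rates the falsifier compares.
function rates(cohort) {
  return {
    keepRate: cohort.kept / cohort.scored,
    // Assumption: rate is measured over proposals that survived the filter.
    commanderRejectRate: cohort.commanderRejected / cohort.kept,
  };
}

// Returns true when the falsifier fires, i.e. the harness should be removed.
// An empty cohort yields NaN rates, which compare as "not dropped".
function falsifierTriggered(baseline, dual) {
  const b = rates(baseline);
  const d = rates(dual);
  const keepRateDropped = d.keepRate < b.keepRate;
  const rejectRateDropped = d.commanderRejectRate < b.commanderRejectRate;
  return !(keepRateDropped && rejectRateDropped);
}
```

With made-up counts, a baseline of { scored: 100, kept: 60, commanderRejected: 30 } against a dual-judge cohort of { scored: 100, kept: 40, commanderRejected: 10 } would not trigger the falsifier, since both rates fall. In practice a minimum cohort size or a significance test would probably be needed before acting on the result.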
Evidence that triggered the proposal
  • E — meta_engine/moves/filter_score.js:104-121 single-LLM scoreProposal
  • D — hypothesis_engine/moves/filter_score.js:121-124 established dual-vendor dialectic pattern
  • D — AE cross-vendor hygiene principle (Anthropic proposes, OpenAI judges)

Proposer self-score

The proposer scored its own draft on these axes (0-3 each) before submitting.

Axis            Score
specificity     3
falsifier       2
solo feasible   3
blast radius    3
composability   3
reversibility   3
Disposition
Rejected at the council verdict. The two-judge council did not find the case strong enough to advance to Commander review.

Evaluation history

When               Move
2026-06-23 04:13   meta_council_verdict
2026-06-23 04:08   meta_argument
2026-06-23 04:05   meta_filter_score
2026-06-23 04:04   meta_genesis