Review Beats Planning: Dual-Model Interaction Patterns for Code Synthesis
arXiv:2603.03406v1 Announce Type: new Abstract: How should two language models interact to produce better code than either can alone? The conventional approach — a reasoning model plans, a code specialist implements — seems natural but fails: on HumanEval+, plan-then-code degrades performance by 2.4 percentage points versus the code specialist alone. We show that reversing the interaction changes everything. When the code specialist generates freely and the reasoning model reviews instead of plans, the same two models on the […]
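The two interaction patterns contrasted in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the `Model` callable interface, the prompt wording, the `"OK"` approval convention, and the `max_rounds` parameter are all assumptions introduced here, since the abstract does not specify how the review step is carried out.

```python
from typing import Callable

# Hypothetical interface: a model is any callable mapping a prompt to text.
Model = Callable[[str], str]


def plan_then_code(reasoner: Model, coder: Model, task: str) -> str:
    """Conventional pattern: the reasoning model plans, the code specialist implements."""
    plan = reasoner(f"Write a step-by-step plan for this task:\n{task}")
    return coder(f"Task:\n{task}\n\nFollow this plan:\n{plan}")


def code_then_review(
    coder: Model, reasoner: Model, task: str, max_rounds: int = 1
) -> str:
    """Reversed pattern: the code specialist generates freely, the reasoning model reviews."""
    code = coder(f"Task:\n{task}")
    for _ in range(max_rounds):
        feedback = reasoner(f"Task:\n{task}\n\nReview this code:\n{code}")
        # Assumed convention: the reviewer answers "OK" when it finds no issues.
        if feedback.strip() == "OK":
            break
        code = coder(
            f"Task:\n{task}\n\nCode:\n{code}\n\nRevise using this review:\n{feedback}"
        )
    return code
```

In the first function the reasoning model's output constrains the code specialist before any code exists; in the second, the code specialist's first draft is unconstrained and the reasoning model only comments on a concrete artifact.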