Faithfulness-Aware Decoding via Constrained Optimization for Multi-Document Summarization: Framework, Diagnosis, and Empirical Analysis

Multi-document summarization (MDS) under strict context budgets is acutely vulnerable to hallucination, cross-document contradiction, entity drift, and redundant paraphrasing. Existing models address these issues only implicitly through training objectives, leaving decoding as an ad hoc pipeline layered on maximum-likelihood generation. Arguing that faithfulness is fundamentally a constraint satisfaction problem rather than a fluency optimization problem, we introduce FADCO (Faithfulness-Aware Decoding via Constrained Optimization), a model-agnostic inference-time framework that encodes evidence grounding, non-contradiction, and redundancy control as explicit constraints within a Lagrangian-relaxed objective. To support this framework, we make three contributions. First, we formally diagnose beam collapse, a failure mode in which standard beam search degrades constrained selection to greedy decoding, and resolve it with a mixed candidate pool that increases verifier support diversity. Second, we resolve log-probability scale dominance through rank-based multi-objective aggregation. Third, we introduce a bounded local repair operator with provable termination and edit-minimality guarantees under a strict retry budget of R=3. We evaluate on MultiNews using MiniCheck-FlanT5, QA-F1, and NLI entailment. In preliminary stratified validation, bounded self-healing improves MiniCheck scores by 65.7 percent and NLI entailment by 59.0 percent, while reducing contradictions by 5.8 percent. These gains cost only 0.69 percent in ROUGE-1, indicating a favorable faithfulness-fluency trade-off for inference-time decoding interventions.
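To make the scale-dominance point concrete, here is a minimal sketch of rank-based aggregation over a small candidate pool. It is an illustration under stated assumptions, not the FADCO implementation: the function names (`ranks`, `rank_aggregate`, `select`), the equal weights, the objective names, and the toy scores are all hypothetical. The idea it shows is general: raw log-probabilities span tens of units while verifier scores lie in [0, 1], so a plain weighted sum is driven by likelihood alone, whereas converting each objective to within-pool ranks puts them on a common scale.

```python
from typing import Dict, List


def ranks(values: List[float]) -> List[float]:
    """Return average ranks (1 = best, higher value is better); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next candidate has the same value (a tie).
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def rank_aggregate(scores: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Weighted sum of per-objective ranks for each candidate (lower is better)."""
    n = len(next(iter(scores.values())))
    total = [0.0] * n
    for name, vals in scores.items():
        per_objective = ranks(vals)
        for i in range(n):
            total[i] += weights[name] * per_objective[i]
    return total


def select(scores: Dict[str, List[float]], weights: Dict[str, float]) -> int:
    """Index of the candidate with the best (lowest) aggregated rank."""
    agg = rank_aggregate(scores, weights)
    return min(range(len(agg)), key=agg.__getitem__)


# Toy pool of three candidates (values are made up for illustration).
scores = {
    "log_prob": [-12.0, -95.0, -40.0],       # sequence log-likelihood
    "grounding": [0.2, 0.9, 0.8],            # hypothetical evidence-grounding score
    "non_contradiction": [0.5, 0.9, 0.95],   # hypothetical NLI-style score
}
weights = {"log_prob": 1.0, "grounding": 1.0, "non_contradiction": 1.0}

# A raw sum is dominated by log_prob and picks the poorly grounded candidate 0.
raw_sum = [sum(scores[k][i] for k in scores) for i in range(3)]
raw_choice = max(range(3), key=raw_sum.__getitem__)

# Rank aggregation weighs all three objectives comparably and picks candidate 2.
rank_choice = select(scores, weights)
```

In this toy pool the raw sum selects the most fluent but least grounded candidate, while rank aggregation selects a candidate that is reasonable on every objective. Ranks discard score magnitudes, which is the intended effect here but also means small and large gaps between candidates are treated alike.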
