Language-Driven Image Restoration and Semantic-Aware Quality Assessment: A Survey
Image restoration aims to recover a high-quality image from its degraded counterpart by mitigating distortions introduced during acquisition, transmission, or environmental interaction. Despite the remarkable progress of deep learning–based restoration models, most conventional approaches remain tightly coupled to predefined degradation assumptions and pixel-level supervision, limiting their ability to handle complex, diverse scenarios or user-specified restoration targets. Recent advances in multimodal large language models (MLLMs) and vision–language models (VLMs) have introduced a new paradigm in which restoration systems can incorporate semantic reasoning, language-driven interaction, and cross-modal knowledge. In these frameworks, language models extend restoration beyond purely visual reconstruction by enabling degradation interpretation, perceptual alignment, and high-level control. In this survey, we present a systematic review of language-integrated restoration frameworks, organizing existing studies through an interaction-centric taxonomy that captures the distinct modes of interaction between language models and restoration networks. We investigate how semantic priors, textual guidance, perceptual supervision, and decision-centric mechanisms reshape restoration behavior, and analyze the implications of these developments for model design and training strategies. In parallel, we review emerging language-driven image quality assessment approaches that complement traditional evaluation metrics. Finally, we identify unresolved challenges and outline promising research directions toward more robust, efficient, and trustworthy restoration techniques.