Self-Evolving Agent Engineering for Healthcare: Methodologies and Applications

Clinical large language model (LLM) agents are increasingly engineered as systems that combine a language model backbone with memory, tools, orchestration loops, and feedback mechanisms. In this setting, the key engineering question is no longer only what the backbone model can answer, but how the surrounding harness stores experience, retrieves context, orchestrates tools, and converts feedback into reusable knowledge. Existing reviews of LLM agents in healthcare primarily emphasise prompting strategies, task capabilities, and benchmark performance, leaving these harness-level mechanisms insufficiently synthesised. This review addresses that gap by organising the emerging literature under the term Self-Evolving Agent Engineering (SEAE), defined here as a harness-level design paradigm centred on three recurring mechanisms: persistent cross-session memory, autonomous skill or experience synthesis, and closed-loop feedback-driven improvement. We review 148 references and 23 representative clinical systems across six task categories, using radiology as the main translational focus. Rather than treating these systems as isolated applications, we map how persistent memory, skill synthesis, tool orchestration, and feedback-driven improvement are implemented across current healthcare agents, and examine the technical, clinical, and regulatory challenges that arise when clinical agents are designed to evolve across sessions.
