A Comprehensive Survey of the LLM-Based Agent: The Contextual Cognition Perspective

Large language model (LLM)-based agents have given rise to phenomenal applications (e.g., OpenClaw, Claude Code), transitioning from fixed text processing to complex task execution. However, most existing works conceptualize LLM-based agents by decomposing the whole system into modules such as planning, action, reflection, and memory, and thus lack a unified perspective to explain the emergence of agentic intelligence. In this survey, we present a novel perspective by framing agentic intelligence through the lens of contextual cognition. We propose that an advanced agent fundamentally relies on a unified framework comprising four core processes: contextual encoding, perception, interaction, and reasoning. Within this framework, we reveal that the emergence of agentic intelligence stems not merely from the organization of diverse modules, but from how the agent manages and, especially, interacts with contextuality, where contextuality is defined as the dynamic integration of external observations and the LLM's internal states. Furthermore, we systematically review current methods for constructing agents from the contextual cognition perspective, encompassing agent runtime orchestration and foundation LLM training. We also revisit corresponding benchmarks and applications, such as deep research, coding, GUI, and scientific agents. Finally, we discuss critical open challenges and outline future research trends, providing a roadmap for overcoming current cognitive bottlenecks and fostering contextualized agentic systems. We hope this perspective serves as an alternative framework for analyzing agent construction through contextual cognition and guides the future development of LLM-based agents.