From Terrain to Space: A Survey on Multi-Domain Data Lifecycle for Urban Embodied Agents
Urban Embodied Agents (UrbanEAs) are emerging to interact with complex, large-scale city environments, generating vast, heterogeneous data streams. While embodied agent research has focused on controlled indoor environments, these settings lack the complexity of the physical world. Urban environments, in contrast, present distinct challenges, including environmental variability, limited observability, and interaction complexity, that hinder the effectiveness of conventional agents. Establishing a comprehensive data lifecycle that fuses multi-domain data from the terrain, aerial, and space domains is therefore essential for turning raw urban streams into actionable embodied capabilities. Unlike existing surveys, which follow a model-centric paradigm for urban computing, we propose and systematically review a comprehensive data lifecycle for UrbanEAs from a multi-domain data perspective. First, we propose a unified framework comprising the four key stages of this lifecycle: Data Perception, Data Management, Data Fusion, and Task Application. Next, we establish a taxonomy for each stage. Finally, we discuss the social impact of the UrbanEA data lifecycle and outline open research problems. Our survey provides a rigorous roadmap for designing the robust, high-performance data frameworks that UrbanEAs require.