Beyond Closed-Pool Video Retrieval: A Benchmark and Agent Framework for Real-World Video Search and Moment Localization
arXiv:2602.10159v1 Announce Type: new Abstract: Traditional video retrieval benchmarks focus on matching precise descriptions to closed video pools, failing to reflect real-world searches characterized by fuzzy, multi-dimensional memories on the open web. We present RVMS-Bench, a comprehensive system for evaluating real-world video memory search. It consists of 1,440 samples spanning 20 diverse categories and four duration groups, sourced from real-world open-web videos. RVMS-Bench utilizes a hierarchical description framework encompassing Global Impression, Key Moment, Temporal Context, and Auditory Memory […]