Temporal Adversarial Attacks on Time Series and Reinforcement Learning Systems: A Systematic Survey, Taxonomy, and Benchmarking Roadmap
Deep learning systems that process temporal and sequential data are increasingly deployed in safety-critical applications, including healthcare monitoring, autonomous navigation, and algorithmic trading. These systems, however, exhibit severe vulnerabilities to adversarial attacks: carefully crafted perturbations that cause systematic misclassification while remaining imperceptible. This paper presents a comprehensive systematic survey of adversarial attacks on time series classification, human activity recognition (HAR), and reinforcement learning (RL) systems, reviewing 127 papers published between 2019 and 2025 following PRISMA guidelines with documented inter-rater reliability (kappa = 0.83).

We establish a unified four-dimensional taxonomy that distinguishes attack characteristics by target modality (wearable IMU sensors, WiFi/radar sensing, skeleton-based recognition, medical and financial time series, and RL agents), perturbation strategy, temporal scope, and physical realizability. Our quantitative synthesis reveals severe baseline vulnerabilities (FGSM attacks degrade HAR accuracy from 95.1% to 3.4% under white-box conditions) and shows that cross-sensor transferability varies dramatically, from 0% to 80%, depending on body placement and modality. Critically, we identify a substantial gap between digital attack success rates (85–98%) and physically validated attacks: hardware-in-the-loop validation demonstrates 70–97% success only for the WiFi and radar modalities, while physical attacks on wearable IMUs remain entirely unvalidated.

We provide a systematic analysis of defense mechanisms, including adversarial training, detection-based approaches, certified defenses, and ensemble methods, and propose the Temporal AutoAttack (T-AutoAttack) framework for standardized adaptive attack evaluation. Our analysis reveals that current defenses suffer 6–23% performance degradation under adaptive attacks; certified methods show the smallest gap but incur 15–30% clean-accuracy costs. We further identify emerging vulnerabilities in transformer-based HAR architectures and LLM-based time series forecasters that require urgent attention.

The survey culminates in a prioritized research roadmap identifying eight critical gaps, each with specific datasets, evaluation pipelines, and implementation timelines. We provide actionable deployment recommendations for practitioners across wearable HAR, WiFi/radar sensing, RL systems, and emerging LLM-based temporal applications. This work offers the first unified treatment bridging time series and reinforcement learning adversarial research, establishing foundations for robust temporal AI systems suitable for real-world deployment in safety-critical domains.
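To make the white-box threat model referenced above concrete, the sketch below implements one-step FGSM against a toy 1D-CNN activity classifier. This is a minimal illustration of the standard attack, not code from any surveyed paper; the model architecture, tensor shapes, and epsilon value are illustrative placeholders.

```python
# Minimal FGSM sketch on a toy time-series classifier (PyTorch).
# All names and hyperparameters here are hypothetical placeholders.
import torch
import torch.nn as nn


class TinyTSClassifier(nn.Module):
    """Toy 1D-CNN over (batch, channels, timesteps) IMU-like windows."""

    def __init__(self, channels: int = 6, n_classes: int = 6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(channels, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(32, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def fgsm_attack(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                eps: float = 0.05) -> torch.Tensor:
    """One-step FGSM: x_adv = x + eps * sign(grad_x CE(f(x), y))."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # Perturb each sample in the direction that increases the loss.
    return (x + eps * x.grad.sign()).detach()


if __name__ == "__main__":
    torch.manual_seed(0)
    model = TinyTSClassifier()
    x = torch.randn(8, 6, 128)        # 8 windows, 6 IMU channels, 128 steps
    y = torch.randint(0, 6, (8,))     # placeholder activity labels
    x_adv = fgsm_attack(model, x, y)
    print("max |perturbation|:", (x_adv - x).abs().max().item())
```

Even this single gradient step typically suffices to collapse an undefended classifier's accuracy, which is the kind of baseline vulnerability the survey's white-box FGSM numbers quantify.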