Robust Exploratory Stopping under Ambiguity in Reinforcement Learning
arXiv:2510.10260v2 Announce Type: replace-cross
Abstract: We propose and analyze a continuous-time robust reinforcement learning framework for optimal stopping under ambiguity. In this framework, an agent chooses a robust exploratory stopping time motivated by two objectives: robust decision-making under ambiguity and learning about the unknown environment. Here, ambiguity refers to considering multiple probability measures dominated by a reference measure, reflecting the agent’s awareness that the reference measure, which represents her learned belief about the environment, may be erroneous. Using the […]