Learning Contraction Metrics for Provably Stable Model-Based Reinforcement Learning

Model-based reinforcement learning (MBRL) offers improved sample efficiency but is prone to instability from model errors and compounding uncertainties. We present the Contraction Dynamics Model (CDM), a framework that learns state-dependent Riemannian contraction metrics jointly with the system dynamics and control policy to ensure stability during both training and deployment. The method parameterizes positive definite metrics with a softplus-Cholesky decomposition and optimizes them via virtual displacements to minimize trajectory divergence energy. An adaptive stability regularizer incorporates the learned metric into the policy objective, guiding exploration toward contracting regions of the state space. Theoretically, we establish exponential trajectory convergence in expectation, derive robustness bounds against model errors, and characterize sample complexity. Empirically, on continuous control benchmarks (Pendulum, CartPole, HalfCheetah), contraction-guided learning improves stability, sample efficiency (a 38.9% reduction in environment steps), and resilience to model errors (78% performance retention versus 52% for baselines at 10% noise) relative to MBRL baselines (PETS, MBPO) and safe RL methods. Ablation studies confirm the design choices, showing that learned metrics yield 10-40% performance gains at roughly 20% computational overhead. This work demonstrates that learning contraction metrics enables practical, scalable embedding of nonlinear stability guarantees in deep reinforcement learning.
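As a rough illustration of the softplus-Cholesky parameterization mentioned above, the sketch below shows one way a state-dependent positive definite metric M(x) = L(x)L(x)^T could be realized in PyTorch, where L(x) is lower triangular with a softplus-positive diagonal. The module name `MetricNet`, the network architecture, and the `eps` jitter are illustrative assumptions, not details taken from the paper; in a full pipeline, M(x) would additionally be combined with the learned dynamics Jacobian to penalize violations of the standard contraction condition along imagined rollouts.

```python
# Hedged sketch (not the paper's implementation): a softplus-Cholesky
# parameterization of a state-dependent Riemannian metric M(x).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MetricNet(nn.Module):
    def __init__(self, state_dim: int, hidden_dim: int = 64, eps: float = 1e-4):
        super().__init__()
        self.n = state_dim
        self.eps = eps
        # number of free entries in a lower-triangular n x n matrix (incl. diagonal)
        self.n_tril = state_dim * (state_dim + 1) // 2
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, self.n_tril),
        )
        # cache the (row, col) indices of the lower-triangular part
        self.register_buffer("tril_idx", torch.tril_indices(state_dim, state_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """Return M(x): a batch of symmetric positive definite matrices."""
        batch = x.shape[0]
        raw = self.net(x)                                          # (batch, n_tril)
        L = torch.zeros(batch, self.n, self.n, device=x.device, dtype=x.dtype)
        L[:, self.tril_idx[0], self.tril_idx[1]] = raw             # fill lower triangle
        diag = torch.arange(self.n, device=x.device)
        # softplus keeps the diagonal strictly positive, so L L^T is PSD;
        # the small eps * I shift makes M(x) strictly positive definite
        L[:, diag, diag] = F.softplus(L[:, diag, diag])
        M = L @ L.transpose(-1, -2) + self.eps * torch.eye(
            self.n, device=x.device, dtype=x.dtype
        )
        return M


# Minimal usage check on random states (dimensions are arbitrary):
metric = MetricNet(state_dim=3)
x = torch.randn(8, 3)
M = metric(x)                                       # (8, 3, 3)
assert torch.all(torch.linalg.eigvalsh(M) > 0)      # all eigenvalues positive
```

The softplus on the diagonal is one common way to keep the Cholesky factor well defined under unconstrained optimization; an exponential transform would serve the same purpose but can be less numerically stable for large pre-activations.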
