Random Matrix Theory of Early-Stopped Gradient Flow: A Transient BBP Scenario
arXiv:2604.18450v1 Announce Type: new
Abstract: Empirical studies of trained models often report a transient regime in which signal is detectable within a finite window of gradient-descent time before overfitting dominates. We provide an analytically tractable random-matrix model that reproduces this phenomenon for gradient flow in a linear teacher–student setting. In this framework, learning occurs when an isolated eigenvalue separates from a noisy bulk; this eigenvalue eventually disappears in the overfitting regime. The key ingredient is anisotropy in the input […]
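For readers unfamiliar with the "isolated eigenvalue separates from a noisy bulk" picture, the following is a minimal sketch of the classical, static BBP transition in a rank-one spiked Wigner matrix. It is not the paper's model, which concerns gradient flow in a teacher–student setting with anisotropic inputs. The spiked Wigner ensemble, the function names, and the parameters `n`, `theta`, and `seed` are all assumptions chosen for illustration. With the normalization used here the bulk fills [-2, 2], and the standard result is that the top eigenvalue detaches to theta + 1/theta once the spike strength theta exceeds 1.

```python
import numpy as np


def top_eigenvalue_spiked_wigner(n, theta, seed=0):
    """Largest eigenvalue of W + theta * v v^T (hypothetical helper).

    W is a GOE-like matrix scaled so its bulk spectrum fills [-2, 2];
    v is a random unit vector playing the role of the signal direction.
    """
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n))
    # Symmetrize; off-diagonal entries have variance 1/n.
    w = (g + g.T) / np.sqrt(2 * n)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    m = w + theta * np.outer(v, v)
    # eigvalsh returns eigenvalues in ascending order.
    return np.linalg.eigvalsh(m)[-1]


def bbp_prediction(theta):
    """Large-n limit of the top eigenvalue for the spiked Wigner model."""
    # Below the threshold theta = 1 the top eigenvalue sticks to the bulk edge.
    return theta + 1.0 / theta if theta > 1.0 else 2.0


if __name__ == "__main__":
    for theta in (0.5, 1.5, 3.0):
        observed = top_eigenvalue_spiked_wigner(1000, theta)
        print(f"theta={theta}: observed={observed:.3f}, "
              f"predicted={bbp_prediction(theta):.3f}")
```

In this static toy the spike strength is a fixed parameter. The transient scenario described in the abstract would correspond, loosely, to an effective signal strength that crosses the threshold and later falls back as training time advances, but that mapping is an interpretation, not something the sketch demonstrates.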