Quantitative convergence of trained single layer neural networks to Gaussian processes

arXiv:2509.24544v3 Announce Type: replace
Abstract: In this paper, we study the quantitative convergence of shallow neural networks trained via gradient descent to their associated Gaussian processes in the infinite-width limit.
While previous work has established qualitative convergence under broad settings, precise, finite-width estimates remain limited, particularly during training.
We provide explicit upper bounds on the quadratic Wasserstein distance between the network output and its Gaussian approximation at any training time $t \ge 0$, demonstrating polynomial decay with network width.
Our results quantify how architectural parameters, such as width and input dimension, influence convergence, and how training dynamics affect the approximation error.
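
To make the quantity in the abstract concrete, the following is a minimal numerical sketch, not the paper's method. It estimates the quadratic Wasserstein distance between the scalar output of a randomly initialized shallow network and a centered Gaussian with the same variance, for several widths. Everything specific here is an assumption for illustration: the tanh activation, standard Gaussian weights, the $1/\sqrt{\text{width}}$ output scaling, the input `x`, and the helper names `sample_outputs`, `limit_std`, and `w2_to_gaussian`. It covers initialization only ($t = 0$), involves no training, and yields a Monte Carlo estimate rather than a bound.

```python
import numpy as np
from statistics import NormalDist


def sample_outputs(width, x, n_samples, rng):
    """Draw n_samples independent networks and return their outputs at x.

    Assumed model: f(x) = width**-0.5 * sum_i a_i * tanh(w_i . x),
    with all a_i and w_i standard Gaussian.
    """
    w = rng.standard_normal((n_samples, width, x.shape[0]))
    a = rng.standard_normal((n_samples, width))
    hidden = np.tanh(w @ x)  # shape (n_samples, width)
    return (a * hidden).sum(axis=1) / np.sqrt(width)


def limit_std(x, deg=80):
    """Standard deviation of the Gaussian limit, sqrt(E[tanh(|x| Z)^2]).

    Uses Gauss-Hermite quadrature for the weight exp(-z^2 / 2),
    whose weights sum to sqrt(2 * pi).
    """
    nodes, weights = np.polynomial.hermite_e.hermegauss(deg)
    values = np.tanh(np.linalg.norm(x) * nodes) ** 2
    return float(np.sqrt(np.sum(weights * values) / np.sqrt(2 * np.pi)))


def w2_to_gaussian(samples, sigma):
    """Approximate the 1D quadratic Wasserstein distance to N(0, sigma^2).

    In one dimension the optimal coupling matches quantiles, so sorted
    samples are compared with Gaussian quantiles at midpoint probabilities.
    """
    n = samples.shape[0]
    target = NormalDist(0.0, sigma)
    probs = (np.arange(n) + 0.5) / n
    quantiles = np.array([target.inv_cdf(p) for p in probs])
    return float(np.sqrt(np.mean((np.sort(samples) - quantiles) ** 2)))


rng = np.random.default_rng(0)
x = np.ones(3) / np.sqrt(3.0)  # hypothetical unit-norm input in dimension 3
sigma = limit_std(x)
distances = {
    width: w2_to_gaussian(sample_outputs(width, x, 10_000, rng), sigma)
    for width in (1, 4, 16, 64)
}
for width, dist in distances.items():
    print(f"width={width:3d}  estimated W2={dist:.4f}")
```

With a finite number of samples, the estimate at large widths is dominated by Monte Carlo noise, so this sketch can only show the qualitative trend (smaller distance at larger width), not the decay rate established in the paper.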
