Generalization Properties of Score-matching Diffusion Models for Intrinsically Low-dimensional Data
arXiv:2603.03700v1
Abstract: Despite the remarkable empirical success of score-based diffusion models, their statistical guarantees remain underdeveloped. Existing analyses often provide pessimistic convergence rates that do not reflect the intrinsic low-dimensional structure common in real data, such as that arising in natural images. In this work, we study the statistical convergence of score-based diffusion models for learning an unknown distribution $\mu$ from finitely many samples. Under mild regularity conditions on the forward diffusion process and the data distribution, we derive finite-sample error bounds on the learned generative distribution, measured in the Wasserstein-$p$ distance. Unlike prior results, our guarantees hold for all $p \ge 1$ and require only a finite-moment assumption on $\mu$, without compact-support, manifold, or smooth-density conditions. Specifically, given $n$ i.i.d. samples from $\mu$ with finite $q$-th moment and appropriately chosen network architectures, hyperparameters, and discretization schemes, we show that the expected Wasserstein-$p$ error between the learned distribution $\hat{\mu}$ and $\mu$ scales as $\mathbb{E}\,\mathbb{W}_p(\hat{\mu},\mu) = \widetilde{O}\!\left(n^{-1/d^\ast_{p,q}(\mu)}\right)$, where $d^\ast_{p,q}(\mu)$ is the $(p,q)$-Wasserstein dimension of $\mu$. Our results demonstrate that diffusion models naturally adapt to the intrinsic geometry of data and mitigate the curse of dimensionality, since the convergence rate depends on $d^\ast_{p,q}(\mu)$ rather than the ambient dimension. Moreover, our theory conceptually bridges the analysis of diffusion models with that of GANs and the sharp minimax rates established in optimal transport. The proposed $(p,q)$-Wasserstein dimension also extends classical Wasserstein dimension notions to distributions with unbounded support, which may be of independent theoretical interest.
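To make the claimed escape from the curse of dimensionality concrete, the short sketch below evaluates the abstract's rate $n^{-1/d}$ for a hypothetical intrinsic dimension against a hypothetical ambient dimension; the numerical values of $d^\ast_{p,q}(\mu)$ and the ambient dimension are illustrative assumptions, not figures taken from the paper.

```python
# Illustrative only: compare the error scaling n^(-1/d) predicted by the
# intrinsic-dimension bound with the same rate driven by the ambient
# dimension. Both dimension values are hypothetical.

d_intrinsic = 10    # assumed (p, q)-Wasserstein dimension d*_{p,q}(mu)
d_ambient = 1000    # assumed ambient dimension of the data space

for n in (10**4, 10**6, 10**8):
    rate_intrinsic = n ** (-1.0 / d_intrinsic)  # n^(-1/d*), adaptive rate
    rate_ambient = n ** (-1.0 / d_ambient)      # n^(-1/D), pessimistic rate
    print(f"n = {n:>9}: intrinsic ~ {rate_intrinsic:.3e}, "
          f"ambient ~ {rate_ambient:.3e}")
```

Under these assumed dimensions, the intrinsic-dimension rate improves by orders of magnitude as $n$ grows, while the ambient-dimension rate remains nearly flat, which is the qualitative gap the abstract's bound is meant to capture.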