Manifold Generalization Provably Precedes Memorization in Diffusion Models

arXiv:2603.23792v1 Announce Type: cross
Abstract: Diffusion models often generate novel samples even when the learned score is only \emph{coarse}, a phenomenon not accounted for by the standard view of diffusion training as density estimation. In this paper, we show that, under the \emph{manifold hypothesis}, this behavior can instead be explained by coarse scores capturing the \emph{geometry} of the data while discarding the fine-scale distributional structure of the population measure~$\mu_{\scriptscriptstyle\mathrm{data}}$. Concretely, whereas estimating the full data distribution $\mu_{\scriptscriptstyle\mathrm{data}}$, supported on a $k$-dimensional manifold, is known to require the classical minimax rate $\tilde{\mathcal{O}}(N^{-1/k})$, we prove that diffusion models trained with coarse scores can exploit the \emph{regularity of the manifold support} and attain a near-parametric rate toward a \emph{different} target distribution. This target distribution has a density uniformly comparable to that of~$\mu_{\scriptscriptstyle\mathrm{data}}$ throughout any $\tilde{\mathcal{O}}\bigl(N^{-\beta/(4k)}\bigr)$-neighborhood of the manifold, where $\beta$ denotes the regularity of the manifold. Our guarantees therefore depend only on the smoothness of the underlying support, and they are especially favorable when the data density itself is irregular, for instance non-differentiable. In particular, when the manifold is sufficiently smooth, we obtain that \emph{generalization}, formalized as the ability to generate novel, high-fidelity samples, occurs at a statistical rate strictly faster than that required to estimate the full population distribution~$\mu_{\scriptscriptstyle\mathrm{data}}$.
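
To put the abstract's rate comparison in one place, the display below is a minimal sketch of the claimed separation. It assumes the conventional reading of "near-parametric" as $\tilde{\mathcal{O}}(N^{-1/2})$ up to logarithmic factors; the abstract does not state the exact exponent. Here $N$ is the sample size, $k$ the intrinsic dimension of the support, and $\beta$ its regularity, all as above.

% Sketch of the rate separation stated in the abstract.
% The exponent -1/2 is an assumed reading of "near-parametric",
% not a figure given in the source.
\[
  \underbrace{\tilde{\mathcal{O}}\bigl(N^{-1/k}\bigr)}_{\text{minimax rate for estimating } \mu_{\scriptscriptstyle\mathrm{data}}}
  \quad\text{vs.}\quad
  \underbrace{\tilde{\mathcal{O}}\bigl(N^{-1/2}\bigr)}_{\text{assumed near-parametric rate toward the target}},
  \qquad
  p_{\mathrm{target}} \asymp p_{\mathrm{data}}
  \;\text{ on any }\;
  \tilde{\mathcal{O}}\bigl(N^{-\beta/(4k)}\bigr)\text{-neighborhood of the manifold.}
\]

Under this reading, the separation is immediate: for any intrinsic dimension $k > 2$ we have $N^{-1/2} = o(N^{-1/k})$, so generating high-fidelity samples near the manifold converges strictly faster than estimating the full population distribution, with the gap widening as $k$ grows.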
