Memorization to Generalization: Emergence of Diffusion Models from Associative Memory
arXiv:2505.21777v3 Announce Type: replace-cross
Abstract: Dense Associative Memories (DenseAMs) are generalizations of Hopfield networks with superior information storage capacity; they store training data points (memories) at local minima of their energy landscape. When the amount of training data exceeds the critical memory storage capacity of these models, new local minima, distinct from the training data, emerge. In associative memory these emergent local minima are called $\textit{spurious states}$, and they hinder memory retrieval. In this work, we examine diffusion models (DMs) through the DenseAM lens, viewing their generative process as an attempt at memory retrieval. In the small-data regime, DMs create a distinct attractor for each training sample, akin to DenseAMs below the critical memory storage capacity. As the training data size increases, they transition from memorization to generalization. We identify a critical intermediate phase, predicted by DenseAM theory: the spurious states. In generative modeling, these states are no longer negative artifacts but rather the first signs of generative capabilities. We characterize the basins of attraction, energy landscape curvature, and computational properties of these previously overlooked states. Their existence is demonstrated across a wide range of architectures and datasets.
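
To make the retrieval-as-generation picture concrete, here is a minimal NumPy sketch of DenseAM memory retrieval by energy descent. It assumes the common log-sum-exp (modern Hopfield) form of the DenseAM energy, which is one standard choice and not necessarily the paper's exact formulation; the inverse temperature beta, step size, and random patterns are illustrative only.

```python
import numpy as np

def dense_am_energy(x, patterns, beta=4.0):
    """Log-sum-exp DenseAM energy (modern Hopfield form; an assumed, common choice):
    E(x) = -(1/beta) * logsumexp(beta * patterns @ x) + 0.5 * ||x||^2.
    Stored patterns sit near local minima while the model is below capacity."""
    scores = beta * patterns @ x
    lse = np.log(np.sum(np.exp(scores - scores.max()))) + scores.max()
    return -lse / beta + 0.5 * x @ x

def retrieve(x0, patterns, beta=4.0, steps=100, lr=0.1):
    """Memory retrieval as gradient descent on the energy from a noisy query.
    The fixed point is a softmax-weighted combination of stored patterns;
    below capacity it collapses onto a single memory (memorization)."""
    x = x0.copy()
    for _ in range(steps):
        scores = beta * patterns @ x
        w = np.exp(scores - scores.max())
        w /= w.sum()                      # softmax attention over memories
        grad = -(patterns.T @ w) + x      # dE/dx for the energy above
        x -= lr * grad
    return x

rng = np.random.default_rng(0)
patterns = rng.standard_normal((5, 32)) / np.sqrt(32)  # few stored "training" memories
query = patterns[0] + 0.3 * rng.standard_normal(32)    # corrupted version of memory 0
out = retrieve(query, patterns)
cos = out @ patterns[0] / (np.linalg.norm(out) * np.linalg.norm(patterns[0]))
print("overlap with memory 0:", cos)  # near 1: retrieval succeeds below capacity
```

In this few-pattern regime the descent converges onto the stored memory, mirroring the abstract's memorization phase; the paper's claim is that as the number of stored patterns grows past capacity, new minima (spurious states) appear that are not training points, which is the behavior it reinterprets as the onset of generalization in diffusion models.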