ELROND: Exploring and decomposing intrinsic capabilities of diffusion models
arXiv:2602.10216v1 Announce Type: new Abstract: A single text prompt passed to a diffusion model often yields a wide range of visual outputs determined solely by the stochastic sampling process, leaving users with no direct control over which specific semantic variations appear in the image. While existing unsupervised methods attempt to analyze these variations via output features, they ignore the underlying generative process. In this work, we propose a framework to disentangle these semantic directions directly within the input embedding space. […]
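The abstract's core idea — finding semantic variation directions in the input embedding space rather than in output features — can be illustrated with a toy sketch. This is a hypothetical illustration only, not the paper's algorithm: it stands in a prompt embedding with a random vector, models generation variability as perturbed copies, extracts a dominant direction via PCA (SVD), and steers the base embedding along it. All names (`base`, `samples`, `alpha`) and the PCA step are assumptions for illustration.

```python
import numpy as np

# Hypothetical illustration (NOT the paper's method): extract a dominant
# variation direction around a prompt embedding and edit along it.

rng = np.random.default_rng(0)
d = 16                       # toy embedding dimension
base = rng.normal(size=d)    # stand-in for a text-prompt embedding

# Stand-in for embeddings associated with varied generations of one prompt.
samples = base + 0.1 * rng.normal(size=(32, d))

# PCA on the variations: SVD of the centered sample matrix.
centered = samples - samples.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
direction = vt[0]            # dominant variation direction (unit norm)

# Steer the embedding along that direction with strength alpha.
alpha = 2.0
edited = base + alpha * direction
```

In a real diffusion pipeline, `edited` would play the role of the modified prompt embedding fed to the denoiser, giving direct control over which semantic variation is expressed.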