ReHyAt: Recurrent Hybrid Attention for Video Diffusion Transformers
arXiv:2601.04342v1 Announce Type: new Abstract: Recent advances in video diffusion models have shifted towards transformer-based architectures, achieving state-of-the-art video generation but at the cost of quadratic attention complexity, which severely limits scalability for longer sequences. We introduce ReHyAt, a Recurrent Hybrid Attention mechanism that combines the fidelity of softmax attention with the efficiency of linear attention, enabling chunk-wise recurrent reformulation and constant memory usage. Unlike the concurrent linear-only SANA Video, ReHyAt’s hybrid design allows efficient distillation from existing […]