FADE: Selective Forgetting via Sparse LoRA and Self-Distillation
arXiv:2602.07058v1 Announce Type: new
Abstract: Machine unlearning aims to remove the influence of specific data or concepts from trained models while preserving overall performance, a capability increasingly required by data protection regulations and responsible AI practices. Despite recent progress, unlearning in text-to-image diffusion models remains challenging due to high computational costs and the difficulty of balancing effective forgetting with retention of unrelated concepts. We introduce FADE (Fast Adapter for Data Erasure), a two-stage unlearning method for image generation that combines parameter localization with self-distillation. In the first stage, FADE identifies the parameters most responsible for the forget set using gradient-based saliency and constrains updates through sparse LoRA adapters, ensuring lightweight, localized modifications. In the second stage, FADE applies a self-distillation objective that overwrites the forgotten concept with a user-defined surrogate while preserving behavior on retained data. The resulting adapters are memory-efficient, reversible, and can be merged or removed at runtime, enabling flexible deployment in production systems. We evaluate FADE on the UnlearnCanvas benchmark and conduct ablation studies on the Imagenette, Labeled Faces in the Wild, AtharvaTaras Dog Breeds, and SUN Attributes datasets, achieving state-of-the-art unlearning performance with fine-grained control over the forgetting-retention trade-off. Our results show that FADE achieves strong concept erasure and high retainability across diverse domains, making it a suitable solution for selective unlearning in diffusion-based image generation models.
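To make the two-stage recipe concrete, the sketch below mimics it on a toy MLP rather than a text-to-image diffusion model. It is an illustration only, under our own assumptions: the names (LoRALinear, saliency_scores, self_distillation_loss), the MSE-based distillation objective, and the "wrap only the most salient layer" heuristic are hypothetical simplifications, not the paper's actual implementation.

```python
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRALinear(nn.Module):
    """Frozen linear layer plus a low-rank adapter: y = W x + B (A x)."""

    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # base weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: adapter starts as a no-op

    def forward(self, x):
        return self.base(x) + F.linear(F.linear(x, self.A), self.B)


def saliency_scores(model, forget_x, forget_y, loss_fn):
    """Stage 1 (sketch): rank linear layers by gradient magnitude of the forget-set loss."""
    model.zero_grad()
    loss_fn(model(forget_x), forget_y).backward()
    scores = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            g = module.weight.grad
            scores[name] = g.abs().sum().item() if g is not None else 0.0
    return scores


def self_distillation_loss(student, teacher, forget_x, surrogate_x, retain_x):
    """Stage 2 (sketch): push forget inputs toward the teacher's surrogate output,
    retain inputs toward the teacher's original output."""
    with torch.no_grad():
        target_forget = teacher(surrogate_x)  # surrogate concept overwrites the forgotten one
        target_retain = teacher(retain_x)     # unrelated behavior is preserved
    return (F.mse_loss(student(forget_x), target_forget)
            + F.mse_loss(student(retain_x), target_retain))


if __name__ == "__main__":
    teacher = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 8))

    # Stage 1: score layers on a toy forget batch and wrap only the most salient one.
    forget_x, forget_y = torch.randn(4, 8), torch.randn(4, 8)
    top = max(saliency_scores(teacher, forget_x, forget_y, F.mse_loss), key=lambda k: saliency_scores(teacher, forget_x, forget_y, F.mse_loss)[k]) if False else max(saliency_scores(teacher, forget_x, forget_y, F.mse_loss).items(), key=lambda kv: kv[1])[0]

    student = copy.deepcopy(teacher)
    for p in student.parameters():
        p.requires_grad_(False)
    student[int(top)] = LoRALinear(student[int(top)])  # nn.Sequential names submodules by index

    # Stage 2: train only the adapter with the self-distillation objective.
    opt = torch.optim.Adam([p for p in student.parameters() if p.requires_grad], lr=1e-3)
    surrogate_x, retain_x = torch.randn(4, 8), torch.randn(4, 8)
    for _ in range(50):
        opt.zero_grad()
        loss = self_distillation_loss(student, teacher, forget_x, surrogate_x, retain_x)
        loss.backward()
        opt.step()
    print("final distillation loss:", loss.item())
```

Because only the adapter parameters are trained while the base weights stay frozen, the modification remains lightweight and can be merged into or removed from the base model at deployment time, which is the property the abstract attributes to FADE's adapters.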