Federated Joint Learning for Domain and Class Generalization
arXiv:2601.12253v1 Announce Type: new
Abstract: Efficient fine-tuning of vision-language models like CLIP has become crucial due to their large parameter counts and extensive pretraining requirements. Existing methods typically address either unseen classes or unseen domains in isolation, without a joint framework for both. In this paper, we propose Federated Joint Learning for Domain and Class Generalization, termed FedDCG, a novel approach that addresses both class and domain generalization in federated learning settings. Our method introduces a domain grouping strategy in which class-generalized networks are trained within each group to prevent decision-boundary confusion. During inference, we aggregate the class-generalized results based on domain similarity, effectively integrating knowledge from both class and domain generalization. Specifically, a learnable network is employed to enhance class generalization, and a decoupling mechanism separates general from domain-specific knowledge, improving generalization to unseen domains. Extensive experiments across various datasets show that FedDCG outperforms state-of-the-art baselines in accuracy and robustness.
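
To make the inference step concrete, below is a minimal, hypothetical sketch of domain-similarity-weighted aggregation: each domain group produces class logits from its own class-generalized head, and the logits are fused with weights derived from the test feature's similarity to per-group domain prototypes. The function and variable names (aggregate_predictions, domain_prototypes, temperature) are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of "aggregate class-generalized results based on domain
# similarity"; the paper's actual aggregation rule may differ.
import torch
import torch.nn.functional as F

def aggregate_predictions(image_feat, group_logits, domain_prototypes, temperature=0.07):
    # image_feat:        (d,)   image embedding of the test sample (e.g., from CLIP)
    # group_logits:      (G, C) class logits from each of the G domain groups
    # domain_prototypes: (G, d) mean feature of each domain group (assumed available)
    sims = F.cosine_similarity(image_feat.unsqueeze(0), domain_prototypes, dim=-1)  # (G,)
    weights = F.softmax(sims / temperature, dim=0)                                  # (G,)
    # Weighted sum of per-group class logits -> fused prediction over C classes
    return (weights.unsqueeze(-1) * group_logits).sum(dim=0)                        # (C,)

if __name__ == "__main__":
    # Toy usage with random tensors
    d, G, C = 512, 3, 10
    fused = aggregate_predictions(torch.randn(d), torch.randn(G, C), torch.randn(G, d))
    print(fused.shape)  # torch.Size([10])
```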