Adaptive Reinforcement Learning Offloading: Unifying Federated Dissimilarity Measures and Generalizable Multi-Objective Optimization for Mobile Edge Computing
Reinforcement learning (RL) in mobile edge computing (MEC) faces critical challenges: data heterogeneity, communication overhead, and limited generalization across diverse preferences and system configurations. We propose Adaptive Reinforcement Learning Offloading (ARLO), a unified framework that integrates adaptive dissimilarity measures for federated learning with generalizable multi-objective optimization for computation offloading. The Adaptive Dissimilarity Measure module leverages parameter dissimilarity with Lagrangian multipliers to mitigate model drift under non-IID data, and loss dissimilarity to reduce communication overhead via adaptive aggregation. The Contextual Multi-Objective Decision module employs histogram-based state encoding and a Generalizable Neural Network Architecture with action masking, enabling a single policy to adapt to varying preferences, server counts, and CPU frequencies. Experiments show that ARLO reaches 82.6% accuracy on CIFAR-10 with 44.3% fewer communication rounds than FedProx, and achieves a 121.0% hypervolume improvement in offloading with only 1.7% generalization error on unseen configurations.
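To make the dissimilarity-guided aggregation idea concrete, the following is a minimal sketch, not the paper's actual algorithm: it assumes flattened model vectors, uses L2 distance from the global model as the parameter dissimilarity, deviation from the mean client loss as the loss dissimilarity, and a single penalty coefficient `lam` standing in for the Lagrangian-multiplier term. All names and the exponential weighting scheme are illustrative assumptions.

```python
import numpy as np

def adaptive_aggregate(global_w, client_ws, client_losses, lam=0.5):
    """Aggregate client models, down-weighting drifted or outlier clients.

    global_w:      flattened global model parameters (1-D array)
    client_ws:     list of flattened client parameter arrays
    client_losses: per-client training losses
    lam:           penalty strength on parameter drift (assumed scalar
                   stand-in for the Lagrangian multiplier)
    """
    # Parameter dissimilarity: L2 distance of each client model from the global model.
    param_diss = np.array([np.linalg.norm(w - global_w) for w in client_ws])
    # Loss dissimilarity: deviation of each client's loss from the mean loss.
    losses = np.asarray(client_losses, dtype=float)
    loss_diss = np.abs(losses - losses.mean())
    # Clients with high combined dissimilarity receive exponentially smaller weight.
    scores = np.exp(-(loss_diss + lam * param_diss))
    weights = scores / scores.sum()
    # Weighted average of client parameters (weights sum to 1).
    return sum(wi * w for wi, w in zip(weights, client_ws))
```

Because the weights form a convex combination, the aggregated model always stays inside the convex hull of the client models, which is one simple way to bound drift under non-IID data.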
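The action-masking mechanism that lets one policy handle varying server counts can be sketched as follows. This is an assumed toy version, not the paper's architecture: the policy head emits one logit per server slot up to `max_servers`, and slots beyond the currently active count are masked to negative infinity before the softmax, so probability mass never leaks onto unavailable servers.

```python
import numpy as np

def masked_policy(logits, num_active_servers):
    """Return action probabilities over server slots, masking inactive ones.

    logits:             raw policy-head outputs, one per server slot
    num_active_servers: how many leading slots are valid in this configuration
    """
    logits = np.asarray(logits, dtype=float)
    # Valid actions are the first num_active_servers slots.
    mask = np.arange(logits.size) < num_active_servers
    # Invalid slots get -inf so exp() maps them to exactly zero probability.
    masked = np.where(mask, logits, -np.inf)
    # Numerically stable softmax over the valid actions only.
    exp = np.exp(masked - masked[mask].max())
    return exp / exp.sum()
```

With this design, the same network weights can be reused when a configuration exposes fewer servers: only the mask changes, not the architecture.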