Hierarchical Integration of Large Language Models and Multi-Agent Reinforcement Learning
This study presents L2M2, a hierarchical framework in which large language models (LLMs) generate high-level strategies while multi-agent reinforcement learning (MARL) agents execute low-level control policies. The architecture targets long-horizon coordination problems by decomposing decision-making across temporal scales. Across 8,200 episodes of navigation and resource allocation tasks, L2M2 improves task success rates by 20.5% and reduces convergence time by a factor of 1.6 relative to flat (non-hierarchical) MARL approaches.
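The two-timescale split described above can be illustrated with a minimal sketch. It is not the L2M2 implementation: the abstract does not specify the planner interface, the replanning interval, or the agent policies, so `stub_planner` (standing in for the LLM), `greedy_policy` (standing in for a trained MARL policy), and `replan_every` are all hypothetical names chosen for illustration, and the grid task is a toy.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]


def stub_planner(positions: Dict[str, Position],
                 targets: List[Position]) -> Dict[str, Position]:
    """Hypothetical stand-in for the LLM: give each agent a distinct target.

    Agents are processed in name order and each takes the nearest
    (Manhattan distance) target that is still unassigned.
    """
    remaining = list(targets)
    plan: Dict[str, Position] = {}
    for name in sorted(positions):
        px, py = positions[name]
        best = min(remaining, key=lambda t: abs(t[0] - px) + abs(t[1] - py))
        remaining.remove(best)
        plan[name] = best
    return plan


def greedy_policy(pos: Position, subgoal: Position) -> Position:
    """Hypothetical stand-in for a learned low-level policy.

    Moves one grid cell toward the subgoal, x-axis first, then y-axis.
    """
    x, y = pos
    gx, gy = subgoal
    if x != gx:
        x += 1 if gx > x else -1
    elif y != gy:
        y += 1 if gy > y else -1
    return (x, y)


def run_episode(start: Dict[str, Position], targets: List[Position],
                replan_every: int = 5, max_steps: int = 50):
    """Run one episode; return (steps taken, planner calls, final positions)."""
    positions = dict(start)
    plan: Dict[str, Position] = {}
    planner_calls = 0
    for step in range(max_steps):
        # Slow timescale: the high-level strategy is refreshed only occasionally.
        if step % replan_every == 0:
            plan = stub_planner(positions, targets)
            planner_calls += 1
        # Fast timescale: every agent acts at every step, conditioned on its subgoal.
        for name in positions:
            positions[name] = greedy_policy(positions[name], plan[name])
        if set(positions.values()) == set(targets):
            return step + 1, planner_calls, positions
    return max_steps, planner_calls, positions


if __name__ == "__main__":
    result = run_episode({"a": (0, 0), "b": (4, 4)}, [(0, 3), (4, 1)])
    print(result)
```

In this toy setting the planner is called once per `replan_every` low-level steps, which is the sense in which decision-making is decomposed across temporal scales; a real system would replace both stubs with an LLM call and trained policies.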