Hybrid MARL + Linear Programming Architecture for Dynamic Vehicle Routing (Zero-Shot Generalization)
Hi everyone, I wanted to share the architecture of a 2-year project I led: optimizing a line-haul logistics network using a hybrid of Multi-Agent RL (MARL) and Linear Programming (LP). We were trying to optimize a live and complex delivery network with dynamically arriving requests. We built a hierarchical architecture to get the best of both worlds (standard OR and RL): The “Fleet Manager” (MARL): PPO agents handle the high-level decision-making. The agent decides which cluster of orders […]