Trainer For MARL That Fits With PettingZoo
After 9 months of work I finally got my first successful run in a simple RL environment where the agent learns to find a target 🎉 I’m still validating more SARL scenarios, but I’m now thinking ahead toward MARL and wanted some advice on architecture and trainer choice. Current RL engine structure:
I also have a Gymnasium wrapper, env = GymWrapper(simulation_engine), which exposes clean reset() and step() APIs for SB3.

The thing is, internally SimulationEngine already works with dictionary-based outputs:

{ "agent_1": observation, "agent_2": observation }

For SARL + Gymnasium I transform this into the single-agent format SB3 expects. But from what I understand, PettingZoo (at least in its Parallel API) naturally expects agent-keyed dictionaries, which makes me think my current architecture could fit MARL pretty neatly without a major redesign.

My main concern is the trainer side. SB3 + Gymnasium has been incredibly straightforward, and I already have experience with it. But for PettingZoo + ??? I’m stuck. Initially I was considering RLlib because it seems to be the common answer, but I honestly don’t have the time or energy for a steep learning curve if there are cleaner alternatives. I’m mainly interested in MAPPO and similar MARL algorithms.

Questions:
Any suggestions or experiences would be really appreciated.

submitted by /u/Public-Journalist820
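
To make the architecture point in the post concrete, here is a minimal sketch of how an engine that already returns agent-keyed dictionaries could be exposed through an interface shaped like PettingZoo's Parallel API. Everything in it is an assumption for illustration: ToyEngine is a hypothetical stand-in for the post's SimulationEngine (whose real structure is not shown), ParallelStyleEnv is a made-up name, and the reward and termination rules are invented. The sketch deliberately does not import PettingZoo so it runs with the standard library alone; a real environment would subclass pettingzoo.ParallelEnv and also define observation_space(agent) and action_space(agent).

```python
from typing import Dict, List


class ToyEngine:
    """Hypothetical stand-in for SimulationEngine: each agent walks a 1-D line."""

    def __init__(self, agent_ids=("agent_1", "agent_2"), target=3, max_steps=10):
        self.agent_ids = tuple(agent_ids)
        self.target = target
        self.max_steps = max_steps
        self.positions: Dict[str, int] = {}
        self.steps = 0

    def reset(self) -> Dict[str, int]:
        self.positions = {agent: 0 for agent in self.agent_ids}
        self.steps = 0
        return dict(self.positions)

    def step(self, actions: Dict[str, int]) -> Dict[str, int]:
        # Only agents that submitted an action move; each action is -1, 0, or +1.
        for agent, move in actions.items():
            self.positions[agent] += move
        self.steps += 1
        return {agent: self.positions[agent] for agent in actions}


class ParallelStyleEnv:
    """Mirrors the method shapes of PettingZoo's Parallel API (assumed wrapper)."""

    def __init__(self, engine: ToyEngine):
        self.engine = engine
        self.possible_agents: List[str] = list(engine.agent_ids)
        self.agents: List[str] = []

    def reset(self, seed=None, options=None):
        observations = self.engine.reset()
        self.agents = list(self.possible_agents)
        infos = {agent: {} for agent in self.agents}
        return observations, infos

    def step(self, actions: Dict[str, int]):
        observations = self.engine.step(actions)
        rewards, terminations, truncations, infos = {}, {}, {}, {}
        for agent, position in observations.items():
            reached = position == self.engine.target
            rewards[agent] = 1.0 if reached else 0.0  # invented reward rule
            terminations[agent] = reached
            truncations[agent] = (not reached) and self.engine.steps >= self.engine.max_steps
            infos[agent] = {}
        # Finished agents leave the live list, as in the Parallel API convention.
        self.agents = [
            agent for agent in self.agents
            if not (terminations.get(agent, False) or truncations.get(agent, False))
        ]
        return observations, rewards, terminations, truncations, infos


env = ParallelStyleEnv(ToyEngine())
observations, infos = env.reset()
while env.agents:
    actions = {agent: 1 for agent in env.agents}  # fixed policy: always step right
    observations, rewards, terminations, truncations, infos = env.step(actions)
```

The engine's dictionaries pass through unchanged, and the wrapper only adds the per-agent reward, termination, truncation, and info dictionaries plus the list of live agents. If the real engine behaves similarly, most of the MARL-specific work would likely sit in the trainer rather than in the environment.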