I am training RL agents on a team pursuit task with MAPPO, using only a capture reward and a time penalty. After the first 5000 training iterations, the agents have only learned to travel a short distance and then camp. Do I need a harder training environment for other, more effective strategies to emerge?
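To make the reward setup concrete, here is a minimal sketch of a shared team reward with only a capture bonus and a per-step time penalty. The function names and the constants (`capture_reward=10.0`, `time_penalty=0.01`) are placeholders chosen for illustration, not values from the actual training run.

```python
from typing import Optional


def team_reward(captured: bool,
                capture_reward: float = 10.0,   # hypothetical value
                time_penalty: float = 0.01) -> float:  # hypothetical value
    """Shared per-step reward: constant time penalty, plus a bonus on capture."""
    reward = -time_penalty
    if captured:
        reward += capture_reward
    return reward


def episode_return(capture_step: Optional[int], horizon: int) -> float:
    """Undiscounted return for an episode that ends at capture or at the horizon.

    capture_step is the 0-based step at which capture happens, or None if
    the pursuers never capture the target.
    """
    total = 0.0
    for t in range(horizon):
        captured = capture_step is not None and t == capture_step
        total += team_reward(captured)
        if captured:
            break  # episode terminates on capture
    return total
```

With these placeholder constants, calling `episode_return(None, 200)` and `episode_return(49, 200)` shows how far apart the returns of a no-capture episode and a capture episode are, which is the only signal the agents receive under this reward.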
submitted by /u/d13maxx