PPO rewards start crashing after some point during training
Hi, I'm trying to implement PPO in PyTorch to solve the Pendulum-v1 environment. There's no problem at the beginning of training, but after some point the rewards start crashing. I've tried to figure out why, but I still haven't found the cause. The repo I'm working on contains only the basics: the model implementation, the training loop, and utils. Can someone help me understand why this is happening?
Repo link: https://github.com/Gradient-Descent-is-Awesome/RL-Testing
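I haven't dug through the repo, but a common cause of rewards collapsing partway through PPO training is a mistake in the clipped surrogate loss (a sign flip, or taking `max` instead of the pessimistic `min`). Below is a minimal sketch of that loss in PyTorch for comparison; the function name and tensor shapes are illustrative, not taken from the repo:

```python
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    # Probability ratio pi_new / pi_old, computed in log space for stability.
    ratio = torch.exp(new_log_probs - old_log_probs)
    # Clipped surrogate objective: take the pessimistic minimum so the
    # policy is not rewarded for moving far outside the trust region.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Negate because optimizers minimize. Dropping the min() or the
    # negation is a classic cause of mid-training reward collapse.
    return -torch.min(unclipped, clipped).mean()
```

Other usual suspects worth checking: advantages not normalized per batch, too many epochs over the same rollout (the ratio drifts far from 1), or a learning rate that's fine early but destabilizes the policy later.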
submitted by /u/YahudiKundakcisi