I implemented Tabular Q – Learning from scratch
|
Check out the GitHub repo. This is a from scratch implementation of Tabular, 1-step, Q – Learning, with the environment built from Pygame. The above GIF demonstrates the agent exploring/exploiting the environment (left) based on the epsilon value to maximize it’s reward signal, and the Q – function (right) displaying what the agent thinks is the action per state that yields the highest reward for that state. submitted by /u/compugineer44 |
Like
0
Liked
Liked