I implemented Tabular Q – Learning from scratch

digitado ⋅ 27 de June de 2026

Check out the GitHub repo. This is a from scratch implementation of Tabular, 1-step, Q – Learning, with the environment built from Pygame. The above GIF demonstrates the agent exploring/exploiting the environment (left) based on the epsilon value to maximize it’s reward signal, and the Q – function (right) displaying what the agent thinks is the action per state that yields the highest reward for that state.

submitted by /u/compugineer44
[link] [comments]

Like 0

Liked Liked