Best practices to understand what your model is doing

digitado ⋅ May 30, 2026

I use TensorBoard to plot my reward function and all of my individual and aggregated losses. Depending on the model, I also plot certain parameters, such as an entropy or discount term.
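For what it's worth, that kind of logging could look roughly like the sketch below. `ScalarLogger`, `InMemoryWriter`, and the tag names are made up for illustration. The only thing assumed about the writer is an `add_scalar(tag, value, step)` method, which is the shape of `SummaryWriter.add_scalar` in `torch.utils.tensorboard`, so a real writer could be passed in instead of the in-memory stand-in.

```python
from collections import defaultdict


class InMemoryWriter:
    """Stand-in for a TensorBoard writer; stores scalars instead of writing event files."""

    def __init__(self):
        self.scalars = defaultdict(list)  # tag -> list of (step, value)

    def add_scalar(self, tag, value, step):
        self.scalars[tag].append((step, float(value)))


class ScalarLogger:
    """Logs reward, individual losses, their sum, and extra parameters under grouped tags."""

    def __init__(self, writer):
        self.writer = writer

    def log_step(self, step, reward, losses, params=None):
        self.writer.add_scalar("reward/episode", reward, step)
        for name, value in losses.items():
            self.writer.add_scalar(f"loss/{name}", value, step)
        # Aggregated loss: a plain sum here (assumption); weight the terms if your objective does.
        self.writer.add_scalar("loss/total", sum(losses.values()), step)
        for name, value in (params or {}).items():
            self.writer.add_scalar(f"param/{name}", value, step)


writer = InMemoryWriter()
logger = ScalarLogger(writer)
logger.log_step(0, reward=1.5, losses={"policy": 0.4, "value": 0.1}, params={"entropy": 0.9})
logger.log_step(1, reward=2.0, losses={"policy": 0.3, "value": 0.2}, params={"entropy": 0.8})
```

Grouping tags with a common prefix (`loss/`, `param/`) puts related curves in the same TensorBoard section, which makes it easier to compare them side by side.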

I have no other formal way of evaluating RL, so I'd really appreciate it if others shared their best practices. What do you typically use? I've heard that some people log the <S, A, R, S'> transitions, but I don't see how that can be valuable in highly complex environments with long training cycles. IMO it should be easier to spot a trend from a plot.
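On logging <S, A, R, S'>: one option that might keep it affordable over long runs is to store only a fixed-size random sample of transitions instead of all of them. The sketch below uses reservoir sampling (Algorithm R) for that. `TransitionReservoir` and its parameters are hypothetical names, and the toy loop stands in for a real environment.

```python
import random


class TransitionReservoir:
    """Keeps a fixed-size uniform random sample of all transitions seen so far."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.samples = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        self.seen += 1
        item = (state, action, reward, next_state)
        if len(self.samples) < self.capacity:
            # Still filling the reservoir: keep everything.
            self.samples.append(item)
        else:
            # Algorithm R: the new item replaces a random slot with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.samples[j] = item


# Toy stand-in for a training loop (assumption: integer states, 4 discrete actions).
reservoir = TransitionReservoir(capacity=100)
for t in range(10_000):
    reservoir.add(state=t, action=t % 4, reward=1.0, next_state=t + 1)
```

Memory stays constant no matter how long training runs, and the retained sample can be inspected by hand when a plot shows something odd.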

What else do you do or like doing?

submitted by /u/Markovvy