Reward shaping: How do you determine if your rewards are the right size and in the right proportions?
I am currently working on an RL game in which an agent has to complete several (intermediate) jobs. The environment, the jobs and the agent's features are all very rich. For almost every action I give a small positive reward if it shows favorable behavior (e.g. a certain sequence of jobs, good timing) or a negative reward to penalize undesired behavior (e.g. delays). However, I have no feel for what the right magnitude is for these rewards, and I don't know whether I need to keep the different types of reward in proportion to one another. Currently my sparse rewards are relatively small, and a big bonus reward is given upon completing the end goal.
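To make the question about proportions concrete, here is a minimal sketch of one way the magnitudes could be compared: log each reward component per step and sum its discounted contribution over an episode. The component names (`sequence_bonus`, `delay_penalty`, `goal_bonus`), the numbers and the discount factor are all hypothetical placeholders, not values from my actual game.

```python
from collections import defaultdict


def discounted_component_returns(steps, gamma=0.99):
    """Sum gamma**t * reward for each named reward component over one episode.

    `steps` is a list of dicts, one per time step, mapping a component
    name (hypothetical, e.g. "sequence_bonus") to the reward it paid out.
    """
    totals = defaultdict(float)
    for t, components in enumerate(steps):
        for name, reward in components.items():
            totals[name] += (gamma ** t) * reward
    return dict(totals)


def component_shares(totals):
    """Each component's share of the total absolute discounted return."""
    scale = sum(abs(value) for value in totals.values())
    if scale == 0:
        return {name: 0.0 for name in totals}
    return {name: abs(value) / scale for name, value in totals.items()}


# Hypothetical 3-step episode: small shaping rewards, then a large goal bonus.
episode = [
    {"sequence_bonus": 0.1, "delay_penalty": -0.05},
    {"sequence_bonus": 0.1},
    {"goal_bonus": 10.0},
]

totals = discounted_component_returns(episode, gamma=0.5)
shares = component_shares(totals)

for name in totals:
    print(f"{name}: return={totals[name]:.3f}, share={shares[name]:.1%}")
```

If one component takes up nearly the whole share, as the goal bonus does in this made-up episode, the other signals may barely register for the agent; whether that is actually a problem is exactly what I am unsure about.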
I'm curious how you go about this in your own work, and whether you could possibly recommend some resources to learn more about it. Thank you.
submitted by /u/Markovvy