Use Cases for First-Visit vs. Every-Visit Monte Carlo
While I understand the difference between first-visit and every-visit Monte Carlo, are there any particular cases where we'd strongly prefer first-visit, and vice versa?
From my understanding, there are situations where first-visit and every-visit are identical (e.g. blackjack, where there is barely any chance for a state to repeat within an episode), and other scenarios where every-visit is much better (e.g. automated car driving, where episodes are scarce, so it becomes valuable to extract as much data as possible from each one).
I'm still torn on whether first-visit or every-visit is the better fit for a maze. Intuitively it seems like it should be every-visit, since I would want to know whether a certain state is part of a cycle, but if the same state can also lead to the terminal state, wouldn't first-visit be better?
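To make the maze question concrete, here is a minimal sketch of how I understand the two estimators. The function name `mc_state_values`, the `(state, reward)` episode format, and the toy episode (a loop A -> B -> A -> B with a cost of -1 per step) are all made up for illustration, not taken from any library.

```python
from collections import defaultdict


def mc_state_values(episodes, gamma=1.0, first_visit=True):
    """Estimate V(s) by averaging sampled returns.

    episodes: list of episodes, each a list of (state, reward) pairs,
    where reward is the reward received after leaving that state.
    """
    returns = defaultdict(list)
    for episode in episodes:
        # Earliest time step at which each state appears in this episode.
        first_index = {}
        for t, (state, _) in enumerate(episode):
            first_index.setdefault(state, t)

        g = 0.0
        # Walk backwards so g is the discounted return from step t onward.
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            g = reward + gamma * g
            # First-visit keeps only the return from the earliest occurrence.
            if first_visit and first_index[state] != t:
                continue
            returns[state].append(g)
    return {s: sum(gs) / len(gs) for s, gs in returns.items()}


# Hypothetical maze episode: the agent loops A -> B -> A -> B, then terminates.
maze_episode = [("A", -1.0), ("B", -1.0), ("A", -1.0), ("B", -1.0)]
first = mc_state_values([maze_episode], first_visit=True)
every = mc_state_values([maze_episode], first_visit=False)
print(first)  # {'A': -4.0, 'B': -3.0}
print(every)  # {'A': -3.0, 'B': -2.0}
```

On this single episode, first-visit credits A only with the full return from its first occurrence, while every-visit also averages in the shorter return from the second pass through A. With an episode that never revisits a state, both calls return the same values.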
My understanding might be wrong, so please feel free to correct me.
submitted by /u/prithiv_