Confused about Model-Based RL
I’m trying to build a clear conceptual understanding of Model-Based Reinforcement Learning, but I’m getting confused because several ideas seem to overlap.
For example, I’ve encountered:
– Dyna-style methods: learning a model and generating synthetic (imagined) data to improve policy/value learning
– World models (e.g., Dreamer): learning latent dynamics and doing policy optimization in imagination
– Planning-based approaches such as MPC or Monte Carlo Tree Search: using the learned model to select actions via planning
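To make the first bullet concrete, here is a minimal tabular Dyna-Q sketch (my own toy illustration, not from any of the papers above; the chain MDP, hyperparameters, and function name are all assumed). The key idea is that every real transition both updates Q directly and is stored in a learned model, which is then replayed for extra "imagined" updates:

```python
import random

def dyna_q(n_states=5, n_actions=2, episodes=30, planning_steps=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Dyna-Q on a toy deterministic chain MDP (illustrative only).

    States 0..n_states-1; action 1 moves right, action 0 moves left.
    Reaching the rightmost state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    model = {}  # (s, a) -> (r, s_next): the learned (here memorized) model

    def step(s, a):
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        return (1.0 if s2 == n_states - 1 else 0.0), s2

    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            r, s2 = step(s, a)                       # real experience
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            model[(s, a)] = (r, s2)                  # model learning
            # planning: replay imagined transitions drawn from the model
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2) = rng.choice(list(model.items()))
                Q[ps][pa] += alpha * (pr + gamma * max(Q[ps2]) - Q[ps][pa])
            s = s2
    return Q

Q = dyna_q()
# Greedy policy at the non-terminal states (1 = move right):
print([max(range(2), key=lambda a: Q[s][a]) for s in range(4)])
```

Dreamer-style world models replace the tabular `model` dict with learned latent dynamics and do the inner planning loop by rolling out a policy in latent space, while MPC/MCTS use the model at decision time to search over action sequences instead of amortizing into Q.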
What confuses me is how these relate to each other.
– Is there a survey or resource that organizes model-based RL methods into a structured table or taxonomy?
– What are the main directions in recent model-based RL research?
I would really appreciate any survey papers, conceptual overviews, or references that help clarify these distinctions.
submitted by /u/audi_etron