What do you think of Yann LeCun's opinion that RL is the cherry on top of the ML cake?

digitado ⋅ July 5, 2026

Title says it all. I’m not an expert in pure RL research; I have worked mainly on foundation models so far.

I’m curious to hear from experts what their opinions are on the role of modern RL, in particular:

– Will it be just the final fine-tuning layer on top of bigger foundation models? If so, which RL approaches do you think are most prominent?

– Will there be (or are there already) models that use RL as a core component of the whole model?

My gut feeling is that RL is very cool, but the hype has died down in the last few years because diffusion and foundation models perform and scale much better, and because a lot of RL is perceived in practice as mainly “reward engineering”.

Please correct me as I might be very wrong 🙂

submitted by /u/Amazing-Coat5160