RL envs for LLMs seem to be a bigger deal than I thought
I'm super late to the party, but I had completely forgotten about RL environments for agents since I first explored them because of mechanize.work. Now I see that Meta has also been quite active in that area.
I’m a mere layman in RL, so could someone tell me how big of a deal this actually is, and whether it is an unsolved problem or just an implementation/business problem?
submitted by /u/lokeye-ai