lightweight, modular RL post-training framework for large models

submitted by /u/summerday10