Open weights are not enough: we need open training frameworks for research and better algorithms [P]
Open weights are critical, but they are not enough by themselves.
If we want open ML and AI research to move forward, we also need open training frameworks: codebases that do more than run jobs. They should make the training process visible, understandable, and modifiable, so researchers, engineers, and practitioners can build new algorithms instead of fighting hidden systems.
That was the motivation behind FeynRL (pronounced “FineRL”), a framework I built for RL post-training of LLMs, VLMs, and agents. RL is already hard to make work. With LLMs, VLMs, and agents, it becomes even messier: rollout engines, reward computation, distributed training, weight syncing, credit assignment problems, long-horizon behavior, and many small implementation details that can quietly break everything.
The core idea behind FeynRL is simple: algorithms should stay algorithms, systems should stay systems, and researchers, engineers, and practitioners should be able to understand the full training loop end-to-end without spending days or weeks.
GitHub: https://github.com/FeynRL-project/FeynRL
The framework is designed to keep every stage of training explicit: from data loading and rollout generation to reward computation, loss construction, optimization, and evaluation. The goal is to make it easier to develop new algorithms, training recipes, reward designs, rollout strategies, and optimization methods without digging through a convoluted, hidden system.
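To make "explicit" concrete, here is a minimal toy sketch of a loop where each of those stages is its own visible function. This is not FeynRL's API: every name below (`generate_rollout`, `compute_reward`, `policy_gradient`, `evaluate`, `train`) is hypothetical, the "model" is just a table of logits, and the algorithm is plain REINFORCE with a moving-average baseline, chosen only because it fits in a few lines.

```python
import math
import random

# Hypothetical toy setup: none of these names come from FeynRL.
PROMPTS = [0, 1, 2]   # data loading: three toy "prompts"
NUM_ACTIONS = 3       # each "response" is a single token id


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate_rollout(policy, prompt, rng):
    """Rollout generation: sample one response from the current policy."""
    probs = softmax(policy[prompt])
    action = rng.choices(range(NUM_ACTIONS), weights=probs)[0]
    return action, probs


def compute_reward(prompt, action):
    """Reward computation: 1.0 when the response id matches the prompt id."""
    return 1.0 if action == prompt else 0.0


def policy_gradient(probs, action, advantage):
    """Loss construction: gradient of -advantage * log pi(action) w.r.t. the logits."""
    return [-advantage * ((1.0 if a == action else 0.0) - p)
            for a, p in enumerate(probs)]


def evaluate(policy):
    """Evaluation: fraction of prompts whose greedy response is correct."""
    correct = sum(
        1 for p in PROMPTS
        if max(range(NUM_ACTIONS), key=lambda a: policy[p][a]) == p
    )
    return correct / len(PROMPTS)


def train(steps=2000, lr=0.5, seed=0):
    rng = random.Random(seed)
    policy = {p: [0.0] * NUM_ACTIONS for p in PROMPTS}  # tabular logits
    baseline = 0.0
    for _ in range(steps):
        prompt = rng.choice(PROMPTS)
        action, probs = generate_rollout(policy, prompt, rng)
        reward = compute_reward(prompt, action)
        advantage = reward - baseline
        grads = policy_gradient(probs, action, advantage)
        # Optimization: one plain SGD step on the logits for this prompt.
        policy[prompt] = [w - lr * g for w, g in zip(policy[prompt], grads)]
        # Moving-average baseline to reduce gradient variance.
        baseline += 0.05 * (reward - baseline)
    return policy
```

In this toy, swapping the reward design or the loss means editing one small function, and `evaluate(train())` should report every toy prompt answered correctly. A real post-training stack adds rollout engines, distributed training, and weight syncing on top, but the same separation is the point.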
The framework currently includes examples for SFT, DPO, and RL-style post-training for both VLMs and LLMs, with support for single-GPU, multi-GPU, and cluster setups.
I would love feedback, issues, and suggestions. I'm also curious to hear which parts of RL post-training infrastructure people still find too hidden, hard to debug, or hard to modify.
submitted by /u/summerday10