LoRA in RL can match full-finetuning performance when done right – by Thinking Machines
digitado ⋅ 27 June 2026 ⋅ submitted by /u/ranfirar