LoRA in RL can match full-finetuning performance when done right – by Thinking Machines

submitted by /u/ranfirar
