LoRA in RL can match full-finetuning performance when done right – by Thinking Machines

digitado ⋅ June 27, 2026

