A 196B Model That Runs Like 11B: The Step 3.5 Flash Bet

Step 3.5 Flash shows how to get frontier reasoning without frontier bills: sparse experts, smarter routing, and MIS-PO RL that keeps training stable.
