Try Symphony (1 env) in response to Samas69420 (Proximal Policy Optimization with 512 envs)

digitado ⋅ January 1, 2026

I was scrolling through different topics and saw that you were trying to train OpenAI’s Humanoid.

Symphony is trained without parallel simulations; it is model-free and uses no behavioral cloning.

It is the result of 5 years of work on understanding humans. It is not built for training speed, but it is running well before 8k episodes.

paper: https://arxiv.org/abs/2512.10477 (it might feel more like a book than a short paper)
