How do I improve model performance?

I am training TD3 on MetaDrive with 10 scenes.

First, I trained on all 10 scenes together for 100k total steps (standard setup: num_scenarios=10, a single learn call). Performance was very poor.
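For context, my joint setup looks roughly like this (a sketch, assuming MetaDrive's MetaDriveEnv and Stable-Baselines3's TD3; exact config keys may differ by version):

```python
from metadrive import MetaDriveEnv
from stable_baselines3 import TD3

# One environment covering all 10 scenes; MetaDrive samples a scene per episode.
env = MetaDriveEnv(dict(num_scenarios=10, start_seed=0))

model = TD3("MlpPolicy", env)
model.learn(total_timesteps=100_000)  # single learn call over all scenes
```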

Then I trained 10 scenes sequentially with 100k per scene (scene 0 → 100k, then scene 1 → 100k, …). Total 1M steps. Still poor.

Then I selected a subset of scenes: [0, 1, 3, 6, 7, 8]. In an earlier experiment with the same script, trained on all 10 scenes for 100k total steps, the model performed well mainly on these scenes, while performance on the others was consistently poor, so I focused on the more stable ones for further experiments.

Experiments on selected scenes:

100k per scene sequential

Example: scene 0 → 100k, then scene 1 → 100k, … until scene 8.

Model keeps learning continuously without reset.

Result: Very good performance.
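The sequential schedule above can be written out as a simple plan; this is a minimal sketch (function and variable names are mine, not from my script), useful for checking exactly how many steps each scene receives:

```python
# Selected subset of scenes from the earlier experiment.
SELECTED_SCENES = [0, 1, 3, 6, 7, 8]

def sequential_plan(scenes, steps_per_scene):
    """Yield (scene_id, steps) pairs: train each scene to completion,
    in order, with the same model (no reset) throughout."""
    for scene in scenes:
        yield scene, steps_per_scene

plan = list(sequential_plan(SELECTED_SCENES, 100_000))
# Each scene gets one contiguous 100k-step block, 600k steps total.
```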

200k per scene sequential

Example: scene 0 → 200k, scene 1 → 200k, …

Result: Performance degraded; the agent gets stuck in some scenes.

300k per scene sequential

Same pattern, 300k each.

Result: Even worse generalization, unstable behavior.

ChatGPT advised me to try batch-wise / interleaved training.

So instead of training scene 0 to completion, I trained in chunks (e.g., 5k steps on scene 0 → 5k on scene 1 → …, rotating until each scene reached its total target steps).
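The interleaved schedule can be sketched as a round-robin plan (again, names are hypothetical; this only generates the rotation, the actual training step is not shown):

```python
def interleaved_plan(scenes, chunk, target_per_scene):
    """Yield (scene_id, steps) chunks in round-robin order until
    every scene has accumulated target_per_scene steps."""
    done = {s: 0 for s in scenes}
    while any(done[s] < target_per_scene for s in scenes):
        for s in scenes:
            if done[s] < target_per_scene:
                step = min(chunk, target_per_scene - done[s])
                done[s] += step
                yield s, step

# Small illustration: 3 scenes, 5k chunks, 10k per scene.
plan = list(interleaved_plan([0, 1, 3], chunk=5_000, target_per_scene=10_000))
# -> [(0, 5000), (1, 5000), (3, 5000), (0, 5000), (1, 5000), (3, 5000)]
```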

Batch-wise training performed poorly as well.

My question:

What is the standard practice for multi-scene training in RL (TD3) to improve the model's performance?

submitted by /u/spyninj
