Open-source RL environments: 13 puzzle games (1,872 levels) for training interactive abstract reasoning agents

I’ve been building RL training environments for the upcoming ARC-AGI-3 competition — 13 games, 1,872 levels — and wanted to share them with the community.

The environments are inspired by The Witness — each game teaches a different abstract rule (path constraints, region partitioning, symmetry, etc.) through progressive difficulty with zero instructions.

RL-specific details:

– OpenEnv compatible (Gymnasium-style API)

– 3 reward modes: sparse (task completion only), shaped (step-level heuristics), arc_score (official ARC metric)

– Teaching mode: annotate reasoning & solving trajectories — useful for imitation learning or building process reward models

– 959 levels have solver-verified optimal solutions as baselines
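For anyone unfamiliar with the Gymnasium-style contract, the agent loop looks roughly like this. This is a minimal sketch against a stand-in environment of my own — `StubPuzzleEnv` and its internals are hypothetical, not the repo's actual classes — it just shows the `reset()`/`step()` interface an agent would drive:

```python
import random

class StubPuzzleEnv:
    """Stand-in for one of the puzzle environments (hypothetical, not the
    repo's real class). Illustrates the Gymnasium-style contract:
    reset() -> (obs, info), step(action) -> (obs, reward, terminated,
    truncated, info)."""

    def __init__(self, max_steps=50):
        self.max_steps = max_steps

    def reset(self, seed=None):
        random.seed(seed)
        self.pos, self.goal, self.t = 0, 5, 0
        return self.pos, {}

    def step(self, action):  # action in {0: left, 1: right}
        self.t += 1
        self.pos += 1 if action == 1 else -1
        terminated = self.pos == self.goal   # task solved
        truncated = self.t >= self.max_steps  # out of steps
        reward = 1.0 if terminated else 0.0   # sparse: only on completion
        return self.pos, reward, terminated, truncated, {}

# Random agent driving the loop
env = StubPuzzleEnv()
obs, info = env.reset(seed=0)
total = 0.0
while True:
    obs, reward, terminated, truncated, info = env.step(random.choice([0, 1]))
    total += reward
    if terminated or truncated:
        break
```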

The key challenge: agents must discover both the rules AND goals through interaction alone — no instructions provided. This makes reward shaping particularly interesting since shaped rewards can leak information about the rules.
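To make the leakage point concrete, here's a toy illustration of my own (not code from the repo): a 1-D "find the goal" task where the shaped reward is the decrease in distance to the goal. Any agent following the shaped signal is effectively told which direction the goal is in, while the sparse signal reveals nothing until the task is solved:

```python
def reward(pos, new_pos, goal, mode):
    """Toy reward for a 1-D find-the-goal task (illustrative only)."""
    if mode == "sparse":
        # Silent until solved: no information about the rule or goal.
        return 1.0 if new_pos == goal else 0.0
    if mode == "shaped":
        # Potential-based shaping on distance-to-goal: positive when the
        # agent moves toward the goal, so it leaks the goal's direction.
        return float(abs(pos - goal) - abs(new_pos - goal))
    raise ValueError(mode)

goal = 7
# One step to the right from position 3:
print(reward(3, 4, goal, "sparse"))  # 0.0 -- no signal
print(reward(3, 4, goal, "shaped"))  # 1.0 -- "right is warmer": goal leaked
```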

GitHub: github.com/Guanghan/arc-witness-envs

Curious how different RL approaches would handle this — especially since the agent has to infer the goal from scratch in sparse reward mode. Has anyone tried curriculum strategies for environments where even the task objective is unknown?

submitted by /u/smallgok