Built a visual RL playground for my FYP (capability-based + graph reward design), looking for testers

digitado ⋅ May 1, 2026

Hey guys,

I’m building a reinforcement learning playground as part of my final year project (FYP), mainly aimed at helping students/teachers learn RL visually, and I’d love to get feedback.

Core ideas:

🔹 Capability System (MOVEABLE, FINDER, NAVIGATOR, etc.)

Agents are composed from capabilities instead of being hardcoded per environment.

Each capability defines:

• Action space
• Observations (OBS space)
• State contributions

This makes environments modular and easier to reason about.
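
To make the capability idea concrete, here is a minimal Python sketch of how capabilities could be merged into one action list and one observation layout. Everything in it is an assumption for illustration: the `Capability` class, the `compose` helper, and the action/observation field names are hypothetical and not the project's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Hypothetical capability: the actions and observation fields it contributes."""
    name: str
    actions: tuple = ()
    observations: tuple = ()


def compose(capabilities):
    """Merge capabilities into one flat action list and one observation layout."""
    actions, observations = [], []
    for cap in capabilities:
        # Prefix each field with the capability name so fields never collide.
        actions += [f"{cap.name}.{a}" for a in cap.actions]
        observations += [f"{cap.name}.{o}" for o in cap.observations]
    return actions, observations


# Illustrative field names only; the real capabilities may expose different ones.
MOVEABLE = Capability(
    "MOVEABLE",
    actions=("forward", "turn_left", "turn_right"),
    observations=("vel_x", "vel_z"),
)
FINDER = Capability(
    "FINDER",
    observations=("target_dx", "target_dz", "target_dist"),
)

actions, observations = compose([MOVEABLE, FINDER])
print(len(actions), len(observations))  # 3 5
```

In a Gym-based training setup, a layout like this would typically be turned into a discrete action space and a fixed-size observation vector, with the field order kept stable between training and inference.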

🔹 Visual Reward Design (Graph-based)

Reward functions are built as graphs:

• Conditional nodes (distance checks, radius, etc.)
• Logical flow
• Rewards / penalties / termination

No code; everything is visual.
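
For readers who think better in code, a reward graph like this can be pictured as a small decision tree that is walked once per step. The sketch below is only an illustration under assumed names (`Condition`, `Reward`, `evaluate`, `within_radius`, and the state dictionary layout are all hypothetical); it is not how the playground stores or evaluates its graphs.

```python
import math
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Reward:
    """Leaf node: emit a reward (negative = penalty) and optionally terminate."""
    value: float
    terminate: bool = False


@dataclass
class Condition:
    """Branch node: follow `if_true` or `if_false` depending on a state predicate."""
    predicate: Callable[[dict], bool]
    if_true: "Node"
    if_false: "Node"


Node = Union[Reward, Condition]


def evaluate(node, state):
    """Walk the graph from `node` until a Reward leaf is reached."""
    while isinstance(node, Condition):
        node = node.if_true if node.predicate(state) else node.if_false
    return node.value, node.terminate


def within_radius(radius):
    """Build a predicate: is the agent within `radius` of the target?"""
    def check(state):
        ax, ay = state["agent"]
        tx, ty = state["target"]
        return math.hypot(tx - ax, ty - ay) <= radius
    return check


# Reached the target: big reward and end of episode.
# Close to the target: small shaping reward. Otherwise: small step penalty.
graph = Condition(
    predicate=within_radius(0.5),
    if_true=Reward(10.0, terminate=True),
    if_false=Condition(
        predicate=within_radius(3.0),
        if_true=Reward(0.1),
        if_false=Reward(-0.01),
    ),
)

print(evaluate(graph, {"agent": (0.0, 0.0), "target": (0.3, 0.0)}))  # (10.0, True)
print(evaluate(graph, {"agent": (0.0, 0.0), "target": (2.0, 0.0)}))  # (0.1, False)
print(evaluate(graph, {"agent": (0.0, 0.0), "target": (5.0, 0.0)}))  # (-0.01, False)
```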

🔹 Assignment Panel (Agent ↔ Graph ↔ Algo)

• Bind one or more agents to a behavior graph
• Configure training (PPO supported)
• Shared policy works naturally at inference: spawning agents with the same capabilities reuses the learned policy
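
One way to picture the shared-policy behavior is a lookup keyed by the agent's capability set. This is a rough sketch, not the project's implementation: `PolicyRegistry` and `trained_policy` are hypothetical names, the policy is a stub rather than a real PPO model, and sorting the capability names assumes the observation layout does not depend on the order in which capabilities were added.

```python
class PolicyRegistry:
    """Hypothetical lookup from a capability set to one shared policy."""

    def __init__(self):
        self._policies = {}

    @staticmethod
    def key(capabilities):
        # Sort so ("FINDER", "MOVEABLE") and ("MOVEABLE", "FINDER") map to the same key.
        return tuple(sorted(capabilities))

    def register(self, capabilities, policy):
        self._policies[self.key(capabilities)] = policy

    def policy_for(self, capabilities):
        return self._policies[self.key(capabilities)]


def trained_policy(observation):
    """Stand-in for a trained PPO model: always returns action 0."""
    return 0


registry = PolicyRegistry()
registry.register(["MOVEABLE", "FINDER"], trained_policy)

# Spawn three agents with the same capabilities (listed in a different order).
agents = [{"id": i, "capabilities": ["FINDER", "MOVEABLE"]} for i in range(3)]
shared = [registry.policy_for(a["capabilities"]) for a in agents]
print(all(p is trained_policy for p in shared))  # True
```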

🔹 Tech Stack / Architecture

• Frontend: Three.js + Rapier.js
• Training: PyBullet + Gym + Stable-Baselines3 (PPO)
• Inference: Remote PPO controller via WebSocket
• Also includes a client-side tabular Q-learning option (more for learning/demo, limited scalability)
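
For anyone unfamiliar with the tabular Q-learning option, the standard update rule fits in a few lines. The sketch below is written in Python for readability, even though the playground's version runs client-side; the function names, the `(state, action)` table layout, and the default hyperparameters are assumptions for illustration.

```python
import random
from collections import defaultdict


def choose_action(q, state, n_actions, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, n_actions,
             alpha=0.1, gamma=0.99, done=False):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])


# The table maps (state, action) pairs to values; unseen pairs default to 0.0.
q = defaultdict(float)
q_update(q, state=0, action=1, reward=1.0, next_state=1, n_actions=2)
print(q[(0, 1)])  # 0.1
print(choose_action(q, state=0, n_actions=2, epsilon=0.0))  # 1
```

Because the table needs one entry per state-action pair, this approach only works for small, discretized state spaces, which matches the "limited scalability" note above.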

🔹 LLM-Assisted Workflow

• Suggests reward function improvements while designing
• Explains trained model behavior + parameters during analysis

🔹 What’s next

• Proper multi-agent support (currently structuring toward it)

Where I need help / feedback:

One thing I’m still figuring out is:

👉 How to define good observation spaces (OBS) for different capabilities in a way that’s both generalizable and intuitive.

Would love input on that specifically.

If this looks interesting, I’d be happy to share access for testing. I’m also open to any feedback or criticism, especially around abstractions and usability.

Thanks 🙏

submitted by /u/Public-Journalist820