I implemented DQN, PPO and A3C from scratch in pure PowerShell 5.1 — no Python, no dependencies

digitado ⋅ 5 March 2026

Bit of an unusual one: I built VBAF, a complete RL framework, in PowerShell 5.1.

The motivation was accessibility. Most IT professionals work in PowerShell daily but have no path into RL. Existing frameworks (PyTorch, TensorFlow) are excellent but assume Python familiarity and hide the algorithmic details behind abstractions.

VBAF exposes everything — every weight update, every Q-value, every policy gradient step — in readable scripting code. It’s designed to make RL understandable, not just usable.
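To give a rough idea of what "every Q-value" in plain script looks like, here is a minimal sketch of a single tabular Q-learning update in PowerShell 5.1. It is an illustration only, not code taken from VBAF: the function name `Update-QValue`, the table layout, and the values of alpha and gamma are all assumptions made up for this example.

```powershell
# Illustrative sketch only; NOT taken from the VBAF source.
# Update-QValue, the table layout, and the hyperparameters are hypothetical.

$alpha = 0.1   # learning rate (assumed value)
$gamma = 0.9   # discount factor (assumed value)

# Q-table: 3 states x 2 actions, initialised to zero.
$Q = @{}
foreach ($s in 0..2) { $Q[$s] = @(0.0, 0.0) }

function Update-QValue {
    param(
        [hashtable]$Q,
        [int]$State,
        [int]$Action,
        [double]$Reward,
        [int]$NextState,
        [double]$Alpha,
        [double]$Gamma
    )
    # Best value reachable from the next state: max over actions of Q(s', a').
    $maxNext = ($Q[$NextState] | Measure-Object -Maximum).Maximum
    # TD target and TD error for the transition (s, a, r, s').
    $target  = $Reward + $Gamma * $maxNext
    $tdError = $target - $Q[$State][$Action]
    # Move Q(s, a) a fraction alpha of the way toward the target.
    $Q[$State][$Action] += $Alpha * $tdError
    return $tdError
}

# One transition: state 0, action 1, reward 1.0, next state 1.
$td = Update-QValue -Q $Q -State 0 -Action 1 -Reward 1.0 -NextState 1 -Alpha $alpha -Gamma $gamma
"TD error: $td  new Q(0,1): $($Q[0][1])"
```

Starting from an all-zero table, the target is 1.0 + 0.9 * 0 = 1.0, so the TD error is 1 and Q(0,1) moves to 0.1. Every intermediate value is an ordinary variable you can print or inspect in the debugger.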

What’s implemented:

Q-Learning with experience replay
DQN with replay buffer
PPO (Proximal Policy Optimization)
A3C (Asynchronous Advantage Actor-Critic)
Multi-agent market simulation with emergent behaviors
Standardized environments: CartPole, GridWorld, RandomWalk

Not competing with PyTorch — this is a teaching tool for people who want to see exactly how the algorithms work before trusting a black box.

GitHub: https://github.com/JupyterPS/VBAF

Install: Install-Module VBAF -Scope CurrentUser

Curious what the RL community thinks!

submitted by /u/No_Set1131