Oak: A Python package for high performance RL in Pokemon RBY OU
Tutorial (WIP)
I’ve written a program suite and Python library that combines an ultra-fast simulator with small Stockfish-style neural networks (with policy priors) to attack perfect-information search in the first generation of Pokemon battling.
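To give a sense of how policy priors plug into search: the standard way a network's policy head steers a tree search is PUCT-style selection, where each move's exploration bonus is weighted by its prior. This is a generic sketch of that technique, not Oak's actual API; all names here are illustrative.

```python
import math

class Node:
    """One search-tree node per (state, action) edge."""
    def __init__(self, prior):
        self.prior = prior      # policy prior P(s, a) from the network
        self.visits = 0         # N(s, a)
        self.value_sum = 0.0    # cumulative backed-up value W(s, a)
        self.children = {}      # action -> Node

    def q(self):
        # Mean value Q(s, a); zero for unvisited edges
        return self.value_sum / self.visits if self.visits else 0.0

def puct_select(node, c_puct=1.5):
    """Pick the child maximizing Q + U, where U is the prior-weighted
    exploration bonus: c * P(s,a) * sqrt(N(s)) / (1 + N(s,a))."""
    total = sum(child.visits for child in node.children.values())
    best_action, best_score = None, -math.inf
    for action, child in node.children.items():
        u = c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visits)
        score = child.q() + u
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

With no visit statistics yet, selection is driven entirely by the priors, which is what makes a good policy head so valuable early in the search.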
The goal of this library is to train a network and optimize search hyper-parameters that together serve as the evaluation function for an Information-Set MCTS approach to the full game. At this point in development, it is simple to swap this eval into Foul-Play – the strongest 6v6 Singles AI.
It includes the following programs:
generate: Self-play data generation that saves multiple value and policy targets in an efficient serialized format
vs: A tool for comparing two eval/search parameter sets head to head
chall: A CLI for analyzing arbitrary positions
battle: Train value/policy networks
build: Train team-building networks
evo: Search hyper-parameter optimization using evolution
rl: Reinforcement learning using generate/battle/build simultaneously
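For readers unfamiliar with evolutionary hyper-parameter optimization of the kind an `evo`-style tool performs, here is a toy sketch of the core loop: mutate a population of parameter sets, score them, and keep the fittest. The fitness function and the `c_puct` parameter below are placeholders; in practice fitness would come from head-to-head match results.

```python
import random

def mutate(params, scale=0.2):
    """Perturb each (positive) hyper-parameter multiplicatively by up to ±scale."""
    return {k: max(1e-6, v * random.uniform(1 - scale, 1 + scale))
            for k, v in params.items()}

def evolve(fitness, seed_params, generations=20, pop_size=16, elite=4):
    """Simple elitist evolution: keep the top `elite` each generation and
    refill the population with mutated copies of them."""
    population = [mutate(seed_params) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[:elite]
        population = parents + [mutate(random.choice(parents))
                                for _ in range(pop_size - elite)]
    return max(population, key=fitness)

# Example with a synthetic fitness peaking at c_puct = 1.5
random.seed(0)
best = evolve(lambda p: -abs(p["c_puct"] - 1.5), {"c_puct": 1.0})
```

Elitism makes the best fitness monotonically non-decreasing across generations, which is why even this crude scheme reliably climbs toward the optimum; real tools add noise schedules and parallel match evaluation on top.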
I will answer questions in the comments. It’s all very fast and you can train a SOTA eval in a few hours on a laptop. It just needs users xd
submitted by /u/open_cover_dev