Oak: A Python package for high-performance RL in Pokemon RBY OU

Tutorial (WIP)

I’ve written a program suite and Python library that combines an ultra-fast simulator with small, Stockfish-style neural networks (with policy priors) to attack perfect-information search in the first generation of Pokemon battling.
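
To make the Stockfish analogy concrete, here is a minimal sketch of a small value network with a policy-prior head. This is an illustration, not Oak's actual architecture; the input encoding, layer widths, and action count (4 moves + 5 switches) are assumptions.

```python
import torch
import torch.nn as nn

class ValuePolicyNet(nn.Module):
    """Hypothetical small eval net: a shared MLP trunk feeding a scalar
    value head and a policy-prior head over a fixed action space.
    Sizes are illustrative, not Oak's."""

    def __init__(self, in_features: int = 512, hidden: int = 256, n_actions: int = 9):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(in_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.value_head = nn.Linear(hidden, 1)           # win-probability logit
        self.policy_head = nn.Linear(hidden, n_actions)  # prior logits

    def forward(self, x: torch.Tensor):
        h = self.trunk(x)
        value = torch.sigmoid(self.value_head(h)).squeeze(-1)  # in [0, 1]
        priors = torch.softmax(self.policy_head(h), dim=-1)    # sums to 1
        return value, priors
```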

The goal of this library is to train a network and optimize search hyper-parameters that together serve as the evaluation function for an Information-Set MCTS approach to the full game. Even at this stage of development, it is simple to swap this eval into Foul-Play, the strongest 6v6 Singles AI.
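
For readers unfamiliar with how a value network and policy priors plug into search: in a PUCT-style MCTS, the value head scores leaf positions and the policy head biases which children get explored. The sketch below is the textbook selection rule, not Oak's or Foul-Play's code; the `Node` layout is assumed.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    prior: float            # P(a) from the policy head
    visits: int = 0
    value_sum: float = 0.0  # accumulated leaf evaluations from the value head
    children: list["Node"] = field(default_factory=list)

def puct_select(node: Node, c_puct: float = 1.5) -> Node:
    """Standard PUCT: pick the child maximizing Q + U, where the network's
    prior steers exploration toward moves it already likes."""
    n = sum(c.visits for c in node.children)

    def score(c: Node) -> float:
        q = c.value_sum / c.visits if c.visits else 0.0
        u = c_puct * c.prior * math.sqrt(n + 1) / (1 + c.visits)
        return q + u

    return max(node.children, key=score)
```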

It includes the following programs:

  • generate

Self-play data generation that saves multiple value and policy targets in an efficient serialized format (see the record-layout sketch after this list).

  • vs

A tool for comparing two eval/search configurations head-to-head.

  • chall

A CLI for analyzing arbitrary positions.

  • battle

Train value/policy networks.

  • build

Train team-building networks.

  • evo

Search hyper-parameter optimization via evolution (see the evolution-strategy sketch after this list).

  • rl

Reinforcement learning that runs generate, battle, and build simultaneously.
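
As promised above, here is one way the generate tool's "efficient serialized format" could look; this record layout is purely illustrative, not Oak's on-disk format. Fixed-width numpy records keep multiple value and policy targets per decision point compact and memory-mappable for training:

```python
import numpy as np

# Hypothetical record: one row per decision point, with an encoded state,
# two value targets, and a visit-count policy target over 9 actions.
RECORD = np.dtype([
    ("state", np.uint8, (512,)),   # encoded battle state (size assumed)
    ("outcome", np.float32),       # final game result from this side's view
    ("search_value", np.float32),  # root value estimate from search
    ("policy", np.float32, (9,)),  # normalized visit counts (4 moves + 5 switches)
])

def save_batch(path: str, rows: np.ndarray) -> None:
    assert rows.dtype == RECORD
    rows.tofile(path)  # raw fixed-width records, no per-row overhead

def load_batch(path: str) -> np.ndarray:
    return np.memmap(path, dtype=RECORD, mode="r")  # zero-copy reads
```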
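
And for evo, evolutionary hyper-parameter search is roughly a (1+λ) strategy: mutate the incumbent's search parameters, play head-to-head matches (what the vs tool does), and keep a challenger only if it wins. Everything below, including the `winrate` interface, is an assumed sketch rather than Oak's implementation:

```python
import random

def evolve(params: dict, winrate, generations: int = 50,
           lam: int = 8, sigma: float = 0.1) -> dict:
    """Bare-bones (1+lambda) evolution strategy over search hyper-parameters.
    `winrate(challenger, incumbent)` is assumed to play a match and return
    the challenger's win rate in [0, 1]."""
    best = dict(params)
    for _ in range(generations):
        # Mutate each parameter multiplicatively with Gaussian noise.
        offspring = [
            {k: v * (1.0 + random.gauss(0.0, sigma)) for k, v in best.items()}
            for _ in range(lam)
        ]
        rate, challenger = max(
            ((winrate(child, best), child) for child in offspring),
            key=lambda t: t[0],
        )
        if rate > 0.5:  # replace the incumbent only if the challenger beats it
            best = challenger
    return best
```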

I will answer questions in the comments. It’s all very fast, and you can train a SOTA eval in a few hours on a laptop. It just needs users xd
