Universal RL Approximation

AIXI is a theoretical RL agent, proposed by Marcus Hutter, that is universally optimal but incomputable. In practice it is mainly useful as a target to approximate.
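For reference, Hutter's standard definition of the AIXI action at step k can be written as below. The notation is the usual textbook one and is not taken from this post: m is the horizon, U is a universal Turing machine, q ranges over programs, and \ell(q) is the length of q in bits.

```latex
\[
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
\left[ r_k + \cdots + r_m \right]
\sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
\]
```

The innermost sum over all programs consistent with the history is what makes the agent incomputable, and it is the part that approximations replace with a tractable model class.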

There are several implementations of AIXI approximations, including MC-AIXI-CTW, a simple and computable one. However, while the theory has since advanced to ensemble models, the implementations have not.

Infotheory, an open-source Algorithmic Information Theory library, implements a large model class and ensembles thereof (including Bayesian, switching, and convex mixtures, among others).
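To make the idea of a Bayesian mixture concrete, here is a minimal Python sketch over two toy binary predictors. It does not use Infotheory's API: every class and function name below (`KTEstimator`, `Order1KT`, `BayesMixture`, `log_loss`) is hypothetical and chosen only for illustration, and the two base models are stand-ins for a much richer model class.

```python
import math


class KTEstimator:
    """Krichevsky-Trofimov estimator for a binary source (order-0 model)."""

    def __init__(self):
        self.counts = [0, 0]

    def predict(self, bit):
        # KT rule: (count of bit + 1/2) / (total count + 1)
        return (self.counts[bit] + 0.5) / (sum(self.counts) + 1.0)

    def update(self, bit):
        self.counts[bit] += 1


class Order1KT:
    """One KT estimator per previous bit (order-1 context model)."""

    def __init__(self):
        self.models = {0: KTEstimator(), 1: KTEstimator()}
        self.prev = 0  # assumed initial context

    def predict(self, bit):
        return self.models[self.prev].predict(bit)

    def update(self, bit):
        self.models[self.prev].update(bit)
        self.prev = bit


class BayesMixture:
    """Bayesian mixture: predict with posterior-weighted average of the models."""

    def __init__(self, models):
        self.models = models
        # Uniform prior, stored in log space for numerical stability.
        self.log_w = [-math.log(len(models))] * len(models)

    def predict(self, bit):
        return sum(math.exp(lw) * m.predict(bit)
                   for lw, m in zip(self.log_w, self.models))

    def update(self, bit):
        # Bayes rule: w_i <- w_i * p_i(bit), then renormalise.
        joint = [lw + math.log(m.predict(bit))
                 for lw, m in zip(self.log_w, self.models)]
        top = max(joint)
        norm = top + math.log(sum(math.exp(j - top) for j in joint))
        self.log_w = [j - norm for j in joint]
        for m in self.models:
            m.update(bit)


def log_loss(model, bits):
    """Cumulative sequential log-loss (in nats) of a model on a bit sequence."""
    total = 0.0
    for b in bits:
        total -= math.log(model.predict(b))
        model.update(b)
    return total


bits = [0, 1] * 50  # alternating sequence: order-1 structure, no order-0 structure
loss_kt = log_loss(KTEstimator(), bits)
loss_o1 = log_loss(Order1KT(), bits)
loss_mix = log_loss(BayesMixture([KTEstimator(), Order1KT()]), bits)
print(f"KT: {loss_kt:.2f}  order-1: {loss_o1:.2f}  mixture: {loss_mix:.2f}")
```

On the alternating sequence the order-0 model cannot do better than roughly one bit per symbol, while the mixture shifts its posterior weight to the order-1 model and tracks it closely.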

This lets an agent exceed the modeling capability of Context-Tree Weighting (CTW) while retaining CTW's theoretical guarantees in the worst case.
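For the Bayesian-mixture case, the worst-case claim follows from a standard dominance argument. The sketch below assumes that CTW is one component of the mixture with prior weight w_CTW > 0; the symbols are illustrative and not taken from Infotheory, and switching or convex mixtures come with their own, different bounds.

```latex
\[
\xi(x_{1:n}) = \sum_i w_i \, \rho_i(x_{1:n}) \;\ge\; w_{\mathrm{CTW}} \, \rho_{\mathrm{CTW}}(x_{1:n})
\quad\Longrightarrow\quad
-\ln \xi(x_{1:n}) \;\le\; -\ln \rho_{\mathrm{CTW}}(x_{1:n}) + \ln \frac{1}{w_{\mathrm{CTW}}}
\]
```

In other words, the ensemble's cumulative log-loss can exceed CTW's by at most a constant that does not grow with the sequence length n.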

I also demonstrate that Infotheory’s MC-AIXI-CTW base is faster and more memory-efficient than competitors (PyAIXI and the reference C++ implementation).

[Figure: RSS (memory) and speed scaling for PyAIXI vs Infotheory vs MC-AIXI-CPP]

Instructions to reproduce this benchmark here.

Infotheory also compiles to WebAssembly, and I have created a Web Demo of MC-AIXI. There you can configure the models (including ensembles) and the agent parameters, select an environment, run the agent, and inspect what is going on.

I hope you find this useful: you inherit the theoretical guarantees of MC-AIXI-CTW while further improving performance and allowing integration into real use cases.

This is particularly useful when you are dealing with an unknown but computable environment.

Any feedback or suggestions would be very welcome.

submitted by /u/Financial_Mango713