Universal RL Approximation
AIXI is a theoretical, universally optimal, and incomputable RL agent proposed by Marcus Hutter. It is mainly useful as a target to approximate. Several approximations to AIXI have been implemented, including MC-AIXI-CTW, a simple and computable one. However, while the theory has advanced to ensemble models, the implementations have not.

Infotheory, an open-source Algorithmic Information Theory library, implements a large model class and ensembles thereof (Bayesian, switching, and convex mixtures, among others). This makes it possible to exceed the capability of Context-Tree Weighting while keeping its theoretical properties in the worst case.

I also demonstrate that Infotheory's MC-AIXI-CTW base is faster and more memory-efficient than competitors (PyAIXI and the reference C++ implementation).

[Figure: RSS and Speed Scaling: PyAIXI vs Infotheory vs MC-AIXI-CPP]

Instructions to reproduce this benchmark here.

Infotheory also compiles to WebAssembly, and I have created a Web Demo of MC-AIXI where you can configure the models (including ensembles) and agent parameters, select an environment, run it, and inspect what is going on.

I hope you find this useful: you inherit the theoretical guarantees of MC-AIXI-CTW while further improving performance and allowing integration into real use cases. This is particularly useful when you are dealing with an unknown but computable environment.

Any feedback or suggestions would be greatly welcomed.

submitted by /u/Financial_Mango713
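
For readers wondering why an ensemble can keep worst-case guarantees, here is a minimal sketch of a Bayesian mixture over two toy binary predictors. It does not use Infotheory's API; the class and function names (`KT`, `Order1KT`, `bayes_mixture_losses`) are hypothetical and chosen only for illustration. The point it shows is the standard bound: the mixture's cumulative log loss never exceeds any member's loss by more than ln(1/prior weight).

```python
import math


class KT:
    """Krichevsky-Trofimov estimator for a binary source (order-0 model)."""

    def __init__(self):
        self.counts = [0, 0]

    def prob(self, bit):
        # KT rule: (count of this bit + 1/2) / (total count + 1)
        return (self.counts[bit] + 0.5) / (sum(self.counts) + 1.0)

    def update(self, bit):
        self.counts[bit] += 1


class Order1KT:
    """One KT estimator per previous bit (order-1 Markov model)."""

    def __init__(self):
        self.ctx = {0: KT(), 1: KT()}
        self.prev = 0  # assumed initial context (illustrative choice)

    def prob(self, bit):
        return self.ctx[self.prev].prob(bit)

    def update(self, bit):
        self.ctx[self.prev].update(bit)
        self.prev = bit


def bayes_mixture_losses(bits, models, prior):
    """Return (mixture log loss, per-model log losses), all in nats."""
    log_w = [math.log(p) for p in prior]  # unnormalised log posterior
    model_loss = [0.0] * len(models)
    mix_loss = 0.0
    for bit in bits:
        # Normalise the posterior weights in a numerically stable way.
        top = max(log_w)
        w = [math.exp(lw - top) for lw in log_w]
        total = sum(w)
        w = [wi / total for wi in w]

        # Mixture prediction is the posterior-weighted average.
        probs = [m.prob(bit) for m in models]
        p_mix = sum(wi * pi for wi, pi in zip(w, probs))
        mix_loss -= math.log(p_mix)

        # Bayes update: each weight is multiplied by its model's likelihood.
        for i, (m, pi) in enumerate(zip(models, probs)):
            model_loss[i] -= math.log(pi)
            log_w[i] += math.log(pi)
            m.update(bit)
    return mix_loss, model_loss


# Alternating bits: the order-1 model fits well, the order-0 model does not.
bits = [0, 1] * 50
prior = [0.5, 0.5]
mix_loss, model_loss = bayes_mixture_losses(bits, [KT(), Order1KT()], prior)
bound = min(l + math.log(1.0 / p) for l, p in zip(model_loss, prior))
print(f"order-0: {model_loss[0]:.2f}  order-1: {model_loss[1]:.2f}")
print(f"mixture: {mix_loss:.2f}  bound: {bound:.2f}")
```

Adding a stronger model to the mixture can only help up to that additive constant, which is the sense in which an ensemble can improve on a single model class without giving up its worst-case behaviour.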