Curiosity-Driven Exploration with Information Bottleneck Representations and Matrix-Based Mutual Information
Exploration remains a central challenge in reinforcement learning, especially in sparse-reward settings where extrinsic feedback alone is often insufficient to guide effective behavior. In this work, we develop a curiosity-driven framework that combines a hybrid intrinsic reward with compact predictive representation learning. Specifically, curiosity is quantified by integrating prediction error with the rarity of state-action pairs in a learned latent space. To make novelty estimation more meaningful for high-dimensional observations such as raw pixels, we employ the Information Bottleneck principle to learn low-dimensional representations that suppress irrelevant variability while preserving the predictive structure of the environment dynamics. We further investigate two practical ways to optimize predictive information: one based on entropy decomposition and the other based on matrix-based Rényi entropy. Experiments on Acrobot show that the proposed method substantially improves exploration efficiency over ICM, RND, and a $k$-NN novelty baseline. On MountainCar, however, the improvement is less evident, suggesting that the proposed framework is particularly beneficial in environments with high-dimensional observations or more structured dynamics.
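To make the matrix-based Rényi entropy mentioned above concrete, the following is a minimal NumPy sketch of the standard matrix-based estimator: build a Gram matrix over latent samples, normalize it to unit trace, and compute the α-order entropy from its eigenvalue spectrum. The Gaussian kernel, bandwidth `sigma`, and `alpha = 2` are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np

def matrix_renyi_entropy(X, alpha=2.0, sigma=1.0):
    """Matrix-based Renyi alpha-order entropy of samples X, shape (n, d).

    S_alpha(A) = 1/(1-alpha) * log2(sum_i lambda_i(A)^alpha), where A is
    the trace-normalized Gram matrix of the samples. The Gaussian kernel
    and `sigma` are illustrative assumptions.
    """
    # Pairwise squared Euclidean distances between samples
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Gaussian (RBF) Gram matrix
    K = np.exp(-d2 / (2.0 * sigma**2))
    # Normalize so the matrix has unit trace (eigenvalues sum to 1)
    A = K / np.trace(K)
    # Entropy from the eigenvalue spectrum of the normalized Gram matrix
    lam = np.clip(np.linalg.eigvalsh(A), 1e-12, None)  # numerical safety
    return (1.0 / (1.0 - alpha)) * np.log2(np.sum(lam**alpha))
```

For n samples the value lies in [0, log2(n)]: identical samples give (near) zero entropy, while well-spread samples approach the upper bound, which is what makes the quantity usable as a diversity/novelty signal in a latent space.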