Entropy Estimation in Cluster Graphs
Probabilistic graphical models (PGMs) provide a powerful framework for modelling complex systems, but inference over loopy graphs requires approximate methods whose accuracy depends on how factors are clustered in the graphical representation. Existing factor clustering methods rely on the number of variables in a cluster as a proxy for memory cost and informational content—a loose upper bound that leads to suboptimal merging decisions. We address this limitation by proposing an efficient algorithm for estimating the joint entropy of a group of clusters without explicitly multiplying out the constituent factors, thereby avoiding the exponential computational cost that makes exact computation prohibitive. The algorithm integrates naturally with both static and dynamic graph restructuring methods, and reduces to the Kikuchi entropy approximation when applied to the complete graph. Experiments on models with up to 24 variables demonstrate that the algorithm produces accurate (when compared to ideal junction tree performance) entropy estimates across diverse model types, with errors remaining within tight bounds. Scalability is further validated on a substantially larger model defined over 2640 random variables. These results confirm that accurate entropy estimation is achievable wherever reliable probabilistic inference is possible, and that the proposed estimation algorithm yields objectively close approximations, thereby supporting improved clustering decisions in PGM structuring algorithms.