Deep Ensembles for Epistemic Uncertainty: A Frequentist Perspective

arXiv:2510.22063v2 Announce Type: replace
Abstract: Decomposing prediction uncertainty into aleatoric (irreducible) and epistemic (reducible) components is critical for the reliable deployment of machine learning systems. While the mutual information between the response variable and model parameters is a principled measure for epistemic uncertainty, it requires access to the parameter posterior, which is computationally challenging to approximate. Consequently, practitioners often rely on probabilistic predictions from deep ensembles to quantify uncertainty, which have demonstrated strong empirical performance. However, a theoretical understanding of their success from a frequentist perspective remains limited. We address this gap by first considering a bootstrap-based estimator for epistemic uncertainty, which we prove is asymptotically correct. Next, we connect deep ensembles to the bootstrap estimator by decomposing it into data variability and training stochasticity; specifically, we show that deep ensembles capture the training stochasticity component. Through empirical studies, we show that this stochasticity component constitutes the majority of epistemic uncertainty, thereby explaining the effectiveness of deep ensembles.

Liked Liked