Leave-One-Out Prediction for General Hypothesis Classes

arXiv:2603.02043v1 Announce Type: cross
Abstract: Leave-one-out (LOO) prediction provides a principled, data-dependent measure of generalization, yet guarantees in fully transductive settings remain poorly understood beyond specialized models. We introduce Median of Level-Set Aggregation (MLSA), a general aggregation procedure based on empirical-risk level sets around the ERM. For arbitrary fixed datasets and losses satisfying a mild monotonicity condition, we establish a multiplicative oracle inequality for the LOO error of the form
\[
\mathrm{LOO}_S(\hat{h}) \;\le\; C \cdot \frac{1}{n} \min_{h \in H} L_S(h) \;+\; \frac{\mathrm{Comp}(S, H, \ell)}{n}, \qquad C > 1.
\]
The analysis is based on a local level-set growth condition controlling how the set of near-optimal empirical-risk minimizers expands as the tolerance increases. We verify this condition in several canonical settings. For classification with VC classes under the 0-1 loss, the resulting complexity scales as $O(d \log n)$, where $d$ is the VC dimension. For finite hypothesis and density classes under bounded or log loss, it scales as $O(\log |H|)$ and $O(\log |P|)$, respectively. For logistic regression with bounded covariates and parameters, a volumetric argument based on the empirical covariance matrix yields complexity scaling as $O(d \log n)$ up to problem-dependent factors.
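The abstract does not spell out the MLSA procedure in detail, but its ingredients (an empirical-risk level set around the ERM, median aggregation of the level set's predictions, and the LOO error being bounded) suggest the following illustrative sketch for a finite hypothesis class under the 0-1 loss. The function names, the threshold-classifier example, and the fixed tolerance `eps` are all assumptions made here for illustration, not the paper's actual construction:

```python
import numpy as np

def empirical_risks(hypotheses, X, y):
    """Empirical 0-1 risk of each hypothesis on the sample (X, y)."""
    return np.array([float(np.mean(h(X) != y)) for h in hypotheses])

def mlsa_predict(hypotheses, X, y, x_new, eps):
    """Illustrative MLSA sketch: take all hypotheses whose empirical
    risk is within eps of the ERM risk (the level set), and return
    the median of their predictions at x_new."""
    risks = empirical_risks(hypotheses, X, y)
    level_set = [h for h, r in zip(hypotheses, risks) if r <= risks.min() + eps]
    preds = np.array([h(np.array([x_new]))[0] for h in level_set])
    return float(np.median(preds))

def loo_error(hypotheses, X, y, eps):
    """LOO error of the aggregated predictor: hold out each point,
    aggregate on the remaining n-1 points, and score the held-out point."""
    n = len(y)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        pred = mlsa_predict(hypotheses, X[mask], y[mask], X[i], eps)
        errs.append(pred != y[i])
    return float(np.mean(errs))

# Toy finite class: 1-D threshold classifiers h_t(x) = 1[x >= t].
X = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
y = np.array([0, 0, 0, 1, 1, 1])
hypotheses = [lambda X, t=t: (X >= t).astype(int) for t in (0.0, 0.5, 1.1)]

print(loo_error(hypotheses, X, y, eps=0.0))  # the separating threshold survives every fold
```

On this separable toy sample the threshold $t = 0.5$ has zero empirical risk on every leave-one-out subsample, so with `eps = 0` the level set is a singleton and the LOO error is 0, consistent with the oracle inequality's first term vanishing when $\min_{h \in H} L_S(h) = 0$.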