Fine-grained Analysis of Non-parametric Estimation for Pairwise Learning
arXiv:2305.19640v3 Announce Type: replace
Abstract: In this paper, we study the generalization performance of non-parametric estimation for pairwise learning. Most existing work requires the hypothesis space to be convex or a VC-class, and the loss to be convex. These restrictive assumptions limit the applicability of the results to many popular methods, especially kernel methods and neural networks. We significantly relax these assumptions and establish a sharp oracle inequality for the empirical minimizer over a general hypothesis space with Lipschitz continuous pairwise losses. As an example, we apply our general results to pairwise least squares regression and derive an excess population risk bound that matches the minimax lower bound for pointwise least squares regression. The key novelty lies in constructing a structured deep ReLU neural network to approximate the true predictor, and in designing a targeted hypothesis space composed of networks with this structure and controllable complexity. This example demonstrates that the general results help us to explore the generalization performance of a variety of problems that cannot be handled by existing approaches. Experiments validate the effectiveness of the proposed method.