
Designing and analyzing new early stopping rules for saving computational resources

Yaroslav Averyanov 1
1 MODAL - MOdel for Data Analysis and Learning
LPP - Laboratoire Paul Painlevé - UMR 8524, Université de Lille, Sciences et Technologies, Inria Lille - Nord Europe, METRICS - Evaluation des technologies de santé et des pratiques médicales - ULR 2694, Polytech Lille - École polytechnique universitaire de Lille
Abstract : This work develops and analyzes strategies for constructing early stopping rules for iterative learning algorithms that estimate the regression function. Such rules are data-driven and indicate when to stop the iterative learning process so as to reach a trade-off between computational cost and statistical precision. Unlike a large part of the existing literature on early stopping, where these rules depend on the data only in a "weak manner", we provide data-driven solutions to this problem without using validation data. The crucial idea exploited here is the minimum discrepancy principle (MDP), which indicates when to stop an iterative learning algorithm. To the best of our knowledge, this idea dates back to the work of Vladimir A. Morozov in the 1960s-1970s, who studied linear ill-posed problems and their regularization, mostly inspired by problems in mathematical physics. Among the different applications of this line of work, the so-called spectral filter estimators, such as spectral cut-off, Landweber iterations, and Tikhonov (ridge) regularization, have received considerable attention (e.g., in statistical inverse problems). The minimum discrepancy principle consists in controlling the residuals of an estimator (which are iteratively minimized) and setting a threshold for them so that one can achieve some (minimax) optimality. The first part of this thesis is dedicated to theoretical guarantees for stopping rules based on the minimum discrepancy principle and applied to gradient descent and Tikhonov (ridge) regression in the framework of reproducing kernel Hilbert spaces (RKHS). There, we show that this principle provides a minimax optimal functional estimator of the regression function when the rank of the kernel is finite. However, for infinite-rank reproducing kernels, the resulting estimator is only suboptimal.
While looking for a solution, we came across the so-called polynomial smoothing strategy for the residuals. This strategy (combined with MDP) has been proved to be optimal for the spectral cut-off estimator in the linear Gaussian sequence model. We borrow this strategy, modify the stopping rule accordingly, and prove that the smoothed minimum discrepancy principle yields a minimax optimal functional estimator over a range of function spaces, which includes the well-known Sobolev function class. Our second contribution consists in exploring the theoretical properties of the minimum discrepancy stopping rule applied to the more general family of linear estimators. The main difficulty of this approach is that, unlike the spectral filter estimators considered earlier, linear estimators no longer lead to monotonic quantities (the bias and variance terms). This is also the case for well-known algorithms such as Stochastic Gradient Descent. Motivated by further practical applications, we work with the widely used k-NN regression estimator as a reliable first example. We prove that the aforementioned stopping rule leads to a minimax optimal functional estimator, in particular over the class of Lipschitz functions on a bounded domain. The third contribution consists in illustrating through empirical experiments that, for choosing the tuning parameter in a linear estimator (the k-NN regression, Nadaraya-Watson, and variable selection estimators), the MDP-based early stopping rule performs comparably to other widely used model selection criteria.
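As an illustration of the discrepancy principle described in the abstract, the following is a minimal Python sketch and is not taken from the thesis. It makes several simplifying assumptions: plain gradient descent on a linear least-squares model instead of the RKHS setting, a noise variance `sigma2` that is known and used directly as the residual threshold, and a hypothetical function name `mdp_gradient_descent`. The rule stops at the first iteration where the mean squared residual falls to or below the threshold.

```python
import numpy as np


def mdp_gradient_descent(X, y, sigma2, step=None, max_iter=10_000):
    """Least-squares gradient descent stopped by a discrepancy-type rule.

    Illustrative assumption: the threshold is the (known) noise variance
    sigma2. Returns the weights and the stopping iteration.
    """
    n, d = X.shape
    if step is None:
        # Step size 1 / (largest eigenvalue of X^T X / n) keeps the
        # empirical risk non-increasing along the iterations.
        step = n / np.linalg.norm(X, 2) ** 2
    w = np.zeros(d)
    for t in range(max_iter):
        residual = y - X @ w
        # Discrepancy check: stop as soon as the mean squared residual
        # is no larger than the threshold.
        if np.mean(residual ** 2) <= sigma2:
            return w, t
        # Gradient step on the empirical risk (1 / 2n) * ||y - X w||^2.
        w = w + step * (X.T @ residual) / n
    # Threshold never reached within the iteration budget.
    return w, max_iter
```

No validation data is involved: the only inputs to the stopping decision are the training residuals and the threshold. In practice the noise variance would usually have to be estimated, which this sketch does not address.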
Contributor : Yaroslav Averyanov
Submitted on : Saturday, February 6, 2021 - 1:06:55 AM
Last modification on : Wednesday, March 23, 2022 - 3:51:09 PM
Long-term archiving on : Friday, May 7, 2021 - 6:01:59 PM


averyanov_phd_thesis (1).pdf
Files produced by the author(s)


  • HAL Id : tel-03133391, version 1



Yaroslav Averyanov. Designing and analyzing new early stopping rules for saving computational resources. Statistics [math.ST]. Université de Lille; Inria, 2020. English. ⟨tel-03133391⟩


