On Early Stopping in Gradient Descent Boosting

Yuan Yao; Lorenzo Rosasco; Andrea Caponnetto. 27 June, 2005.
Communicated by Partha Niyogi.


In this paper, we study a family of gradient descent algorithms for approximating the regression function from reproducing kernel Hilbert spaces. Here early stopping plays the role of regularization: given a finite sample and a regularity condition on the regression function, a stopping rule is given and some probabilistic upper bounds are obtained for the distance between the function produced at the stopping time and the regression function. A crucial advantage over other regularized least squares algorithms studied recently is that early stopping breaks through the saturation phenomenon, in which the convergence rate no longer improves once the regression function exceeds a certain level of regularity. These upper bounds show that in some situations we can achieve optimal convergence rates. We also discuss the implications of these results in the context of classification. Some connections are addressed with the Landweber iteration for regularization in inverse problems and with online learning algorithms as stochastic approximations of the gradient descent method.
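To make the idea concrete, the following is a minimal sketch (not the paper's algorithm, and all hyperparameters here are illustrative assumptions) of gradient descent on the empirical least squares risk in an RKHS, where the only regularization is the stopping rule: iterate until the error on a held-out sample stops improving.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=0.5):
    # Gram matrix of a Gaussian RBF kernel, one common RKHS choice.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def kernel_gd_early_stopping(X, y, X_val, y_val,
                             eta=1.0, max_iter=2000, patience=20, sigma=0.5):
    """Gradient descent on the empirical least squares risk in an RKHS,
    regularized only by early stopping.  The iterate is represented as
    f_t = sum_i alpha_i K(x_i, .), so the functional gradient step
    becomes a linear update on the coefficient vector alpha."""
    n = len(X)
    K = gaussian_kernel(X, X, sigma)        # training Gram matrix
    K_val = gaussian_kernel(X_val, X, sigma)
    alpha = np.zeros(n)                     # start from f_0 = 0
    best_alpha, best_err, since_improved = alpha.copy(), np.inf, 0
    for t in range(max_iter):
        # Gradient of the empirical risk (1/n) * sum_i (f(x_i) - y_i)^2,
        # expressed in the coefficient representation.
        alpha = alpha - (eta / n) * (K @ alpha - y)
        err = np.mean((K_val @ alpha - y_val) ** 2)
        if err < best_err:
            best_err, best_alpha, since_improved = err, alpha.copy(), 0
        else:
            since_improved += 1
            if since_improved >= patience:  # a simple data-driven stopping rule
                break
    return best_alpha, best_err
```

The number of iterations acts like an (inverse) regularization parameter: few iterations give a heavily smoothed estimate, while iterating to convergence fits the noise. The paper's stopping rule is derived from the sample size and the regularity of the regression function rather than from a validation set; the validation-based rule above is just a familiar practical stand-in.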

Original Document

The original document is available in PDF (uploaded 27 June, 2005 by Partha Niyogi).

Additional Document Formats

The document is also available in Postscript (uploaded 27 June, 2005 by Partha Niyogi).

NOTE: The author warrants that these additional documents are identical to the original to the extent permitted by the translation between the various formats. However, the webmaster has made no effort to verify this claim. If the authenticity of the document is an issue, please always refer to the "Original document." If you find significant alterations, please report to