Active Set Algorithms for the LASSO

Manuel Loth
SEQUEL - Sequential Learning, LIFL - Laboratoire d'Informatique Fondamentale de Lille, LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal, Inria Lille - Nord Europe
Abstract: This thesis addresses the computation of the Least Absolute Shrinkage and Selection Operator (LASSO) and of derived problems in regression analysis. This operator has drawn increasing attention since its introduction by Robert Tibshirani in 1996, for its ability to provide or recover sparse linear models from noisy observations, sparsity meaning that only a few of the possibly many explanatory variables are selected to appear in the model. The selection results from adding to the least-squares method a constraint on, or a minimization of, the sum of absolute values of the linear coefficients, otherwise known as the l1 norm of the coefficient vector. After recounting the motivations, principles, and issues of regression analysis, linear estimators, least-squares minimization, model selection, and regularization, the two equivalent formulations of the LASSO, constrained and regularized, are presented; both define a non-trivial computational problem that associates an estimator with a set of observations and a selection parameter. A brief history of algorithms for solving these problems is given, the two possible approaches to handling the non-differentiability of the l1 norm are described, and the equivalence to a quadratic program is explained.
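For reference, the two formulations mentioned above can be written as follows. The notation (design matrix X, response y, coefficient vector β, parameters t and λ) is the standard one, introduced here for illustration rather than quoted from the thesis:

```latex
% Constrained form: least squares under an l1 budget t
\hat{\beta}(t) \in \arg\min_{\beta \in \mathbb{R}^p} \; \lVert y - X\beta \rVert_2^2
\quad \text{subject to} \quad \lVert \beta \rVert_1 \le t

% Regularized form: least squares plus an l1 penalty of weight \lambda
\hat{\beta}(\lambda) \in \arg\min_{\beta \in \mathbb{R}^p} \; \tfrac{1}{2}\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1
```

Over suitable parameter ranges, each budget t corresponds to some penalty weight λ and vice versa, which is the equivalence the abstract refers to.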
The second part focuses on practical algorithms for solving the LASSO. An algorithm proposed in 2000 by Michael Osborne is reformulated. This reformulation consists in giving a general definition and explanation of the active set method, which generalizes the simplex algorithm to convex programming, then specializing it to the LASSO program, and separately addressing linear algebra optimizations. Although it describes essentially the same algorithm, the presentation given here aims at exhibiting its mechanisms clearly, and uses different variables. In addition to helping understand and use this algorithm, which seemed to be underrated, the alternative view taken here sheds light on the possibility and advantages, not foreseen by its authors, of applying the method to the regularized (and more practical) problem as well as to the constrained one. The popular homotopy (or LAR-LASSO) method is then derived from this active set method, yielding an alternative and somewhat simplified view of this algorithm, which can compute the operator for all values of its parameter (the LASSO path). Practical implementations following these formulations are shown to be the most efficient methods of LASSO-path computation, contrasting with a recent study by Jerome H. Friedman suggesting that a coordinate descent method improves by far, in terms of speed, on the state-of-the-art results of homotopy.

The third part examines how these three algorithms (active set, homotopy, and coordinate descent) handle some limit cases, and how they can be applied to extended problems. The limit cases include degeneracies, such as duplicated or linearly dependent variables, or simultaneous selections/deselections of variables. The latter issue, which was dismissed in previous works, is explained and given a simple solution. Another limit case is the use of a very large, possibly infinite number of variables to select from, where the active set method presents a major advantage over the homotopy. A first extension of the LASSO is its transposition to online learning settings, where it is necessary or desirable to solve for a growing or changing observation set. Again, the lack of flexibility of the homotopy method rules it out in favor of the other two. The second extension is the use of l1 penalization with loss functions other than the squared residual, or together with other penalization terms; we summarize or state to what extent and how each algorithm can be transposed to these problems.
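As a concrete illustration of the coordinate descent approach mentioned above, here is a minimal sketch of the standard cyclic soft-thresholding update for the regularized problem. It is an assumption-laden illustration rather than the thesis's algorithm: columns are not standardized, the passes are plain cyclic sweeps, and the names soft_threshold and lasso_cd are ours.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=1000, tol=1e-8):
    """Cyclic coordinate descent for the regularized LASSO:
        minimize 0.5 * ||y - X b||_2^2 + lam * ||b||_1
    Each coordinate update solves its one-dimensional subproblem
    in closed form via soft thresholding."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)   # precomputed X_j^T X_j
    r = y - X @ b                   # residual, kept up to date
    for _ in range(n_iter):
        b_old = b.copy()
        for j in range(p):
            if col_sq[j] == 0.0:
                continue            # all-zero column: nothing to fit
            r += X[:, j] * b[j]     # partial residual without coordinate j
            rho = X[:, j] @ r       # correlation with the partial residual
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]     # restore coordinate j in the residual
        if np.max(np.abs(b - b_old)) < tol:
            break                   # coefficients have stabilized
    return b

# Example on synthetic sparse data: only the first three
# coefficients of the recovered vector should be clearly nonzero.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
beta_true = np.zeros(10)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(50)
print(lasso_cd(X, y, lam=5.0))
```

Larger values of lam shrink more coefficients exactly to zero, which is the selection effect the abstract describes.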
Document type: Thesis

Cited literature: 60 references

https://tel.archives-ouvertes.fr/tel-00845441
Contributor: Philippe Preux
Submitted on: Wednesday, July 17, 2013 - 10:23:31 AM
Last modification on: Thursday, February 21, 2019 - 10:52:49 AM
Long-term archiving on: Friday, October 18, 2013 - 4:22:38 AM

Identifiers

  • HAL Id: tel-00845441, version 1

Citation

Manuel Loth. Active Set Algorithms for the LASSO. Machine Learning [cs.LG]. Université des Sciences et Technologie de Lille - Lille I, 2011. English. ⟨tel-00845441⟩
