A lower bound for the optimization of finite sums, Proceedings of the International Conference on Machine Learning (ICML), 2015. ,
Finding approximate local minima for nonconvex optimization in linear time, 2016. ,
Katyusha: the first direct acceleration of stochastic gradient methods, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, 2017. ,
Natasha: Faster stochastic non-convex optimization via strongly non-convex parameter, Proceedings of the International Conference on Machine Learning (ICML), 2017. ,
Linear coupling: An ultimate unification of gradient and mirror descent, 2014. ,
Dimension-free iteration complexity of finite sum optimization problems, Advances in Neural Information Processing Systems (NIPS), 2016. ,
Numerical methods for nondifferentiable convex optimization. Nonlinear Analysis and Optimization, pp.102-126, 1987. ,
Optimization with Sparsity-Inducing Penalties, Machine Learning, pp.1-106, 2012. ,
DOI : 10.1561/2200000015
URL : https://hal.archives-ouvertes.fr/hal-00613125
A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, SIAM Journal on Imaging Sciences, vol.2, issue.1, pp.183-202, 2009. ,
DOI : 10.1137/080716542
Smoothing and First Order Methods: A Unified Framework, SIAM Journal on Optimization, vol.22, issue.2, pp.557-580, 2012. ,
DOI : 10.1137/100818327
Nonlinear Programming, Athena Scientific, 1999. ,
Incremental proximal methods for large scale convex optimization, Mathematical Programming, pp.163-195, 2011. ,
BIBLIOGRAPHY
Convex Optimization Algorithms, Athena Scientific, 2015. ,
Stochastic optimization with variance reduction for infinite datasets with finite-sum structure, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01375816
From error bounds to the complexity of first-order descent methods for convex functions, Mathematical Programming, 2016. ,
A family of variable metric proximal methods, Mathematical Programming, pp.15-47, 1995. ,
DOI : 10.1007/BF01585756
URL : https://hal.archives-ouvertes.fr/inria-00074821
Numerical Optimization: Theoretical and Practical Aspects, 2006. ,
DOI : 10.1007/978-3-662-05078-1
Convex analysis and nonlinear optimization: theory and examples, 2010. ,
Large-scale machine learning with stochastic gradient descent, Proceedings of COMPSTAT, 2010. ,
Stochastic Gradient Descent Tricks, Neural networks: Tricks of the trade, pp.421-436, 2012. ,
URL : http://leon.bottou.org/publications/pdf/tricks-2012.pdf
Convex optimization, 2009. ,
Quasi-Newton methods and their application to function minimisation, Mathematics of Computation, vol.21, issue.99, pp.368-381, 1967. ,
DOI : 10.1090/S0025-5718-1967-0224273-2
URL : http://www.ams.org/mcom/1967-21-099/S0025-5718-1967-0224273-2/S0025-5718-1967-0224273-2.pdf
The Convergence of a Class of Double-rank Minimization Algorithms 1. General Considerations, IMA Journal of Applied Mathematics, vol.6, issue.1, pp.76-90, 1970. ,
DOI : 10.1093/imamat/6.1.76
A geometric alternative to Nesterov's accelerated gradient descent, 2015. ,
A Variable Metric Proximal Point Algorithm for Monotone Operators, SIAM Journal on Control and Optimization, vol.37, issue.2, pp.353-375, 1999. ,
DOI : 10.1137/S0363012992235547
URL : http://www.math.washington.edu/~burke/papers/qian1.ps
On the superlinear convergence of the variable metric proximal point algorithm using Broyden and BFGS matrix secant updating, Mathematical Programming, pp.157-181, 2000. ,
DOI : 10.1007/PL00011373
A Tool for the Analysis of Quasi-Newton Methods with Application to Unconstrained Minimization, SIAM Journal on Numerical Analysis, vol.26, issue.3, pp.727-739, 1989. ,
DOI : 10.1137/0726042
Global Convergence of a Class of Quasi-Newton Methods on Convex Problems, SIAM Journal on Numerical Analysis, vol.24, issue.5, pp.1171-1190, 1987. ,
DOI : 10.1137/0724077
An inexact successive quadratic approximation method for L-1 regularized optimization, Mathematical Programming, pp.375-396, 2015. ,
A Stochastic Quasi-Newton Method for Large-Scale Optimization, SIAM Journal on Optimization, vol.26, issue.2, pp.1008-1031, 2016. ,
DOI : 10.1137/140954362
URL : http://arxiv.org/pdf/1401.7020
Accelerated methods for non-convex optimization, 2016. ,
Lower Bounds for Finding Stationary Points I, arXiv preprint arXiv:1710.11606, 2017. ,
"Convex Until Proven Guilty": Dimension-free acceleration of gradient descent on non-convex functions, 2017. ,
On the Complexity of Steepest Descent, Newton's and Regularized Newton's Methods for Nonconvex Unconstrained Optimization Problems, SIAM Journal on Optimization, vol.20, issue.6, pp.2833-2852, 2010. ,
DOI : 10.1137/090774100
On the complexity of finding first-order critical points in constrained nonlinear optimization, Mathematical Programming, 2014. ,
A remark on accelerated block coordinate descent for computing the proximity operators of a sum of convex functions, SMAI Journal of Computational Mathematics, vol.1, pp.29-54, 2015. ,
DOI : 10.5802/smai-jcm.3
URL : https://hal.archives-ouvertes.fr/hal-01099182
Proximal quasi-Newton methods for nondifferentiable convex optimization, Mathematical Programming, vol.85, issue.2, pp.313-334, 1999. ,
DOI : 10.1007/s101070050059
URL : http://halo.kuamp.kyoto-u.ac.jp/zagato/member/staff/fuku/./papers/proxNewton.ps.Z
Proximal smoothness and the lower-C^2 property, Journal of Convex Analysis, vol.2, issue.1-2, pp.117-144, 1995. ,
Proximal splitting methods in signal processing. In Fixed-point algorithms for inverse problems in science and engineering, pp.185-212, 2011. ,
Convergence of some algorithms for convex minimization, Mathematical Programming, pp.261-275, 1993. ,
DOI : 10.1007/BF01585170
An iterative thresholding algorithm for linear inverse problems with a sparsity constraint, Communications on Pure and Applied Mathematics, vol.57, issue.11, pp.1413-1457, 2004. ,
DOI : 10.1002/cpa.20042
URL : http://onlinelibrary.wiley.com/doi/10.1002/cpa.20042/pdf
Variable Metric Method for Minimization, SIAM Journal on Optimization, vol.1, issue.1, pp.1-17, 1991. ,
DOI : 10.1137/0801001
URL : https://www.osti.gov/servlets/purl/4222000
On the worst-case complexity of the gradient method with exact line search for smooth strongly convex functions. Optimization Letters, pp.1185-1199, 2017. ,
A simple practical accelerated method for finite sums, Advances in Neural Information Processing Systems (NIPS), 2016. ,
SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives, Advances in Neural Information Processing Systems (NIPS), 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01016843
Finito: A faster, permutable incremental gradient method for big data problems, Proceedings of the International Conference on Machine Learning (ICML), 2014. ,
A characterization of superlinear convergence and its application to quasi-Newton methods, Mathematics of Computation, vol.28, issue.126, pp.549-560, 1974. ,
DOI : 10.1090/S0025-5718-1974-0343581-1
Quasi-Newton Methods, Motivation and Theory, SIAM Review, vol.19, issue.1, pp.46-89, 1977. ,
DOI : 10.1137/1019005
URL : https://hal.archives-ouvertes.fr/hal-01495720
First-order methods of smooth convex optimization with inexact oracle, Mathematical Programming, vol.146, issue.1-2, pp.37-75, 2014. ,
Efficiency of minimizing compositions of convex functions and smooth maps, Mathematical Programming, vol.31, issue.3, 2016. ,
An Optimal First Order Method Based on Optimal Quadratic Averaging, SIAM Journal on Optimization, vol.28, issue.1, 2016. ,
DOI : 10.1137/16M1072528
Randomized Smoothing for Stochastic Optimization, SIAM Journal on Optimization, vol.22, issue.2, pp.674-701, 2012. ,
DOI : 10.1137/110831659
Pattern Classification, 2000. ,
Regularization of inverse problems, 1996. ,
Curvature measures, Transactions of the American Mathematical Society, pp.418-491, 1959. ,
Restarting accelerated gradient methods with a rough strong convexity estimate, 2016. ,
A new approach to variable metric algorithms, The Computer Journal, pp.317-322, 1970. ,
A rapidly convergent descent method for minimization, The Computer Journal, pp.163-168, 1963. ,
Hybrid Deterministic-Stochastic Methods for Data Fitting, SIAM Journal on Scientific Computing, vol.34, issue.3, pp.1380-1405, 2012. ,
DOI : 10.1137/110830629
URL : https://hal.archives-ouvertes.fr/inria-00626571
The Elements of Statistical Learning, Springer Series in Statistics, 2001. ,
Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization, Proceedings of the International Conference on Machine Learning (ICML), 2015. ,
Descentwise inexact proximal algorithms for smooth optimization, Computational Optimization and Applications, vol.11, issue.1, pp.755-769, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00628777
A Globally and Superlinearly Convergent Algorithm for Nonsmooth Convex Minimization, SIAM Journal on Optimization, vol.6, issue.4, pp.1106-1120, 1996. ,
DOI : 10.1137/S1052623494278839
Chapter IX: Applications of the method of multipliers to variational inequalities. Studies in Mathematics and its Applications, pp.299-331, 1983. ,
Escaping from saddle points-online stochastic gradient for tensor decomposition, Conference on Learning Theory, 2015. ,
Accelerated gradient methods for nonconvex nonlinear and stochastic programming, Mathematical Programming, pp.59-99, 2016. ,
URL : http://arxiv.org/pdf/1310.3787
Generalized Uniformly Optimal Methods for Nonlinear Programming, 2015. ,
Nonsmooth minimization using smooth envelope functions, 2016. ,
A family of variable-metric methods derived by variational means, Mathematics of Computation, vol.24, issue.109, pp.23-26, 1970. ,
DOI : 10.1090/S0025-5718-1970-0258249-6
Stochastic block BFGS: Squeezing more curvature out of data, Proceedings of the International Conference on Machine Learning (ICML), 2016. ,
On the Convergence of the Proximal Point Algorithm for Convex Minimization, SIAM Journal on Control and Optimization, vol.29, issue.2, pp.403-419, 1991. ,
DOI : 10.1137/0329022
New Proximal Point Algorithms for Convex Minimization, SIAM Journal on Optimization, vol.2, issue.4, pp.649-664, 1992. ,
DOI : 10.1137/0802032
Statistical Learning With Sparsity: The Lasso And Generalizations, 2015. ,
DOI : 10.1201/b18401
An Accelerated Inexact Proximal Point Algorithm for Convex Minimization, Journal of Optimization Theory and Applications, vol.154, issue.2, pp.536-548, 2012. ,
Convex analysis and minimization algorithms I, 1996. ,
DOI : 10.1007/978-3-662-02796-7
Convex analysis and minimization algorithms. II, 1996. ,
DOI : 10.1007/978-3-662-06409-2
Group lasso with overlap and graph lasso, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009. ,
DOI : 10.1145/1553374.1553431
Structured variable selection with sparsity-inducing norms, Journal of Machine Learning Research, vol.12, pp.2777-2824, 2011. ,
URL : https://hal.archives-ouvertes.fr/inria-00377732
How to escape saddle points efficiently, 2017. ,
Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent, 2017. ,
Accelerating stochastic gradient descent using predictive variance reduction, Advances in Neural Information Processing Systems (NIPS), 2013. ,
An optimal randomized incremental gradient method, Mathematical Programming, vol.14, issue.1, 2015. ,
Proximal Newton-type methods for convex optimization, Advances in Neural Information Processing Systems (NIPS), 2012. ,
Gradient descent only converges to minimizers, Conference on Learning Theory, pp.1246-1257, 2016. ,
Practical Aspects of the Moreau–Yosida Regularization: Theoretical Preliminaries, SIAM Journal on Optimization, vol.7, issue.2, pp.367-385, 1997. ,
DOI : 10.1137/S1052623494267127
Variable metric bundle methods: From conceptual to implementable forms, Mathematical Programming, vol.76, issue.3, pp.393-410, 1997. ,
Accelerated proximal gradient methods for nonconvex programming, Advances in Neural Information Processing Systems (NIPS), 2015. ,
A universal catalyst for first-order optimization, Advances in Neural Information Processing Systems (NIPS), 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01160728
A generic quasi-newton algorithm for faster gradient-based optimization, 2017. ,
An accelerated proximal coordinate gradient method, Advances in Neural Information Processing Systems (NIPS), 2014. ,
On the limited memory BFGS method for large scale optimization, Mathematical Programming, vol.45, pp.503-528, 1989. ,
DOI : 10.1007/BF01589116
Optimization with first-order surrogate functions, Proceedings of the 30th International Conference on Machine Learning (ICML), 2013. ,
Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning, SIAM Journal on Optimization, vol.25, issue.2, pp.829-855, 2015. ,
DOI : 10.1137/140957639
Sparse modeling for image and vision processing. Foundations and Trends in Computer Graphics and Vision, pp.85-283, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01081139
A wavelet tour of signal processing: the sparse way. Academic Press, 2008. ,
Régularisation d'inéquations variationnelles par approximations successives. Revue française d'informatique et de recherche opérationnelle, série rouge, pp.154-158, 1970. ,
A quasi-second-order proximal bundle algorithm, Mathematical Programming, pp.51-72, 1996. ,
Global convergence of online limited memory BFGS, Journal of Machine Learning Research, vol.16, issue.1, pp.3151-3181, 2015. ,
On the Global Convergence of Broyden's Method, Mathematics of Computation, vol.30, issue.135, pp.523-540, 1976. ,
DOI : 10.2307/2005323
Fonctions convexes duales et points proximaux dans un espace hilbertien, Comptes Rendus de l'Académie des Sciences de Paris, pp.2897-2899, 1962. ,
A linearly-convergent stochastic L-BFGS algorithm, Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2016. ,
Stochastic dual averaging methods using variance reduction techniques for regularized empirical risk minimization problems, 2016. ,
Sparse Approximate Solutions to Linear Systems, SIAM Journal on Computing, vol.24, issue.2, pp.227-234, 1995. ,
DOI : 10.1137/S0097539792240406
Problem complexity and method efficiency in optimization, 1983. ,
A method of solving a convex programming problem with convergence rate O(1/k^2), Soviet Mathematics Doklady, vol.27, issue.2, pp.372-376, 1983. ,
Introductory Lectures on Convex Optimization: A Basic Course, 2004. ,
DOI : 10.1007/978-1-4419-8853-9
Smooth minimization of non-smooth functions, Mathematical Programming, vol.103, issue.1, pp.127-152, 2005. ,
DOI : 10.1007/s10107-004-0552-5
Primal-dual subgradient methods for convex problems, Mathematical Programming, vol.120, issue.1, pp.221-259, 2009. ,
How to make the gradients small. OPTIMA, MPS Newsletter, pp.10-11, 2012. ,
Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems, SIAM Journal on Optimization, vol.22, issue.2, pp.341-362, 2012. ,
DOI : 10.1137/100802001
Gradient methods for minimizing composite functions, Mathematical Programming, pp.125-161, 2013. ,
Cubic regularization of Newton method and its global performance, Mathematical Programming, pp.177-205, 2006. ,
DOI : 10.1007/s10107-006-0706-8
Updating quasi-Newton matrices with limited storage, Mathematics of Computation, vol.35, issue.151, pp.773-782, 1980. ,
DOI : 10.1090/S0025-5718-1980-0572855-7
Numerical optimization, 2006. ,
DOI : 10.1007/b98874
Adaptive restart for accelerated gradient schemes. Foundations of computational mathematics, pp.715-732, 2015. ,
Behavior of Accelerated Gradient Methods Near Critical Points of Nonconvex Problems, arXiv e-prints, 2017. ,
Catalyst acceleration for gradient-based non-convex optimization, arXiv preprint arXiv:1703, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01536017
Proximal Algorithms, Foundations and Trends in Optimization, vol.1, issue.3, pp.123-231, 2014. ,
DOI : 10.1561/2400000003
Prox-regular functions in variational analysis, Transactions of the American Mathematical Society, pp.1805-1838, 1996. ,
A new algorithm for unconstrained optimization. Nonlinear programming, pp.31-65, 1970. ,
On the Convergence of the Variable Metric Algorithm, IMA Journal of Applied Mathematics, vol.7, issue.1, pp.21-36, 1971. ,
DOI : 10.1093/imamat/7.1.21
How bad are the BFGS and DFP methods when the objective function is quadratic?, Mathematical Programming, vol.13, issue.1, pp.34-47, 1986. ,
DOI : 10.1007/BF01582161
A Generalized Forward-Backward Splitting, SIAM Journal on Imaging Sciences, vol.6, issue.3, pp.1199-1226, 2013. ,
DOI : 10.1137/120872802
URL : https://hal.archives-ouvertes.fr/hal-00613637
A Unified Convergence Analysis of Block Successive Minimization Methods for Nonsmooth Optimization, SIAM Journal on Optimization, vol.23, issue.2, pp.1126-1153, 2013. ,
DOI : 10.1137/120891009
Stochastic variance reduction for nonconvex optimization, Proceedings of the International Conference on Machine Learning (ICML), 2016. ,
Proximal stochastic methods for nonsmooth nonconvex finite-sum optimization, Advances in Neural Information Processing Systems (NIPS), 2016. ,
Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function, Mathematical Programming, pp.1-38, 2014. ,
Monotone Operators and the Proximal Point Algorithm, SIAM Journal on Control and Optimization, vol.14, issue.5, pp.877-898, 1976. ,
DOI : 10.1137/0314056
Favorable classes of Lipschitz-continuous functions in subgradient optimization, Progress in Nondifferentiable Optimization, IIASA Collaborative Proceedings Series CP-82, pp.125-143, 1982. ,
Variational Analysis, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 1998. ,
DOI : 10.1007/978-3-642-02431-3
Inexact and accelerated proximal point algorithms, Journal of Convex Analysis, vol.19, issue.4, pp.1167-1192, 2012. ,
Practical inexact proximal quasi-Newton method with global complexity analysis, Mathematical Programming, pp.495-529, 2016. ,
Projected Newton-type methods in machine learning, pp.305-330, 2011. ,
Convergence rates of inexact proximal-gradient methods for convex optimization, Advances in Neural Information Processing Systems (NIPS), 2011. ,
URL : https://hal.archives-ouvertes.fr/inria-00618152
Minimizing finite sums with the stochastic average gradient, Mathematical Programming, vol.160, issue.1, pp.83-112, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-00860051
Integration methods and accelerated optimization algorithms, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01474045
SDCA without duality, regularization, and individual convexity, Proceedings of the International Conference on Machine Learning (ICML), 2016. ,
Understanding machine learning: From theory to algorithms, 2014. ,
DOI : 10.1017/CBO9781107298019
Proximal stochastic dual coordinate ascent, 2012. ,
Stochastic dual coordinate ascent methods for regularized loss minimization, Journal of Machine Learning Research, vol.14, issue.Feb, pp.567-599, 2013. ,
Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization, Mathematical Programming, pp.105-145, 2016. ,
Conditioning of quasi-Newton methods for function minimization, Mathematics of Computation, vol.24, issue.111, pp.647-656, 1970. ,
DOI : 10.1090/S0025-5718-1970-0274029-X
Adjustment of an Inverse Matrix Corresponding to a Change in One Element of a Given Matrix, The Annals of Mathematical Statistics, vol.21, issue.1, pp.124-127, 1950. ,
DOI : 10.1214/aoms/1177729893
Minimization methods for non-differentiable functions, 2012. ,
DOI : 10.1007/978-3-642-82118-9
A unified framework for some inexact proximal point algorithms, Numerical Functional Analysis and Optimization, 2001. ,
Forward-backward quasi-Newton methods for nonsmooth optimization problems, Computational Optimization and Applications, vol.26, issue.3, pp.443-487, 2017. ,
Exact worst-case performance of first-order methods for composite convex optimization, SIAM Journal on Optimization, vol.27, issue.3, pp.1283-1313, 2017. ,
Forward-Backward Envelope for the Sum of Two Nonconvex Functions: Further Properties and Nonmonotone Linesearch Algorithms, SIAM Journal on Optimization, vol.28, issue.3, 2016. ,
DOI : 10.1137/16M1080240
Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society. Series B (Methodological), pp.267-288, 1996. ,
On accelerated proximal gradient methods for convex-concave optimization, submitted to SIAM Journal on Optimization, 2008. ,
Foundations of the theory of learning systems, 1973. ,
Adaptation and learning in automatic systems, 1971. ,
The nature of statistical learning theory. Springer Science & Business Media, 2013. ,
The algebraic eigenvalue problem, 1965. ,
Tight complexity bounds for optimizing composite objectives, Advances in Neural Information Processing Systems (NIPS), 2016. ,
Dual averaging methods for regularized stochastic learning and online optimization, Journal of Machine Learning Research, vol.11, pp.2543-2596, 2010. ,
A Proximal Stochastic Gradient Method with Progressive Variance Reduction, SIAM Journal on Optimization, vol.24, issue.4, pp.2057-2075, 2014. ,
DOI : 10.1137/140961791
Functional analysis, 1980. ,
A quasi-Newton approach to non-smooth convex optimization, Proceedings of the 25th International Conference on Machine Learning, ICML '08, 2008. ,
DOI : 10.1145/1390156.1390309
Model selection and estimation in regression with grouped variables, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.68, issue.1, pp.49-67, 2006. ,
Stochastic primal-dual coordinate method for regularized empirical risk minimization, Proceedings of the 32nd International Conference on Machine Learning (ICML), 2015. ,
Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.67, issue.2, pp.301-320, 2005. ,