Bibliography
Competing in the dark: An efficient algorithm for bandit linear optimization, Proceedings of the International Conference on Learning Theory (COLT), 2008. ,
A lower bound for the optimization of finite sums, Proceedings of the Conference on Machine Learning (ICML), 2015. ,
Information-Theoretic Lower Bounds on the Oracle Complexity of Stochastic Convex Optimization, IEEE Transactions on Information Theory, vol.58, issue.5, pp.3235-3249, 2012. ,
DOI : 10.1109/TIT.2011.2182178
Unsupervised Learning from Narrated Instruction Videos, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. ,
DOI : 10.1109/CVPR.2016.495
URL : https://hal.archives-ouvertes.fr/hal-01171193
Linear coupling: An ultimate unification of gradient and mirror descent, Proceedings of the Innovations in Theoretical Computer Science (ITCS), 2017. ,
A second-order gradient-like dissipative dynamical system with Hessian-driven damping. Application to optimization and mechanics, J. Math. Pures Appl, vol.81, issue.8, pp.747-779, 2002. ,
Living on the edge: phase transitions in convex programs with random data, Information and Inference, vol.3, issue.3, 2014. ,
DOI : 10.1093/imaiai/iau005
Teoreticheskie osnovy i konstruirovanie chislennykh algoritmov zadach matematicheskoi fiziki, Nauka, 1979. ,
On the iteration complexity of oblivious first-order optimization algorithms, Proceedings of the Conference on Machine Learning (ICML), 2016. ,
Random Dynamical Systems, 1998. ,
K-means++: The advantages of careful seeding, Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), 2007. ,
A Spectral Algorithm for Seriation and the Consecutive Ones Problem, SIAM Journal on Computing, vol.28, issue.1, pp.297-310, 1998. ,
DOI : 10.1137/S0097539795285771
The Rate of Convergence of Nesterov's Accelerated Forward-Backward Method is Actually Faster Than $1/k^2$, SIAM Journal on Optimization, vol.26, issue.3, pp.1824-1834, 2016. ,
DOI : 10.1137/15M1046095
Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity, Mathematical Programming, vol.23, issue.3, pp.1-53, 2016. ,
DOI : 10.1137/110844805
URL : https://hal.archives-ouvertes.fr/hal-01821929
Fast convex optimization via inertial dynamics with Hessian driven damping, Journal of Differential Equations, vol.261, issue.10, pp.5734-5783, 2016. ,
DOI : 10.1016/j.jde.2016.08.020
URL : https://hal.archives-ouvertes.fr/hal-02072674
Robust linear least squares regression, The Annals of Statistics, vol.39, issue.5, pp.2766-2794, 2011. ,
DOI : 10.1214/11-AOS918SUPP
URL : https://hal.archives-ouvertes.fr/hal-00522534
An Empirical Distribution Function for Sampling with Incomplete Information, The Annals of Mathematical Statistics, vol.26, issue.4, pp.641-647 ,
DOI : 10.1214/aoms/1177728423
Relative loss bounds for on-line density estimation with the exponential family of distributions, Mach. Learn, vol.43, issue.3, 2001. ,
Self-concordant analysis for logistic regression, Electronic Journal of Statistics, vol.4, issue.0, pp.384-414, 2010. ,
DOI : 10.1214/09-EJS521
URL : https://hal.archives-ouvertes.fr/hal-00426227
Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression, J. Mach. Learn. Res, vol.15, issue.1, pp.595-627, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00804431
Duality Between Subgradient and Conditional Gradient Methods, SIAM Journal on Optimization, vol.25, issue.1, pp.115-129, 2015. ,
DOI : 10.1137/130941961
URL : https://hal.archives-ouvertes.fr/hal-00757696
DIFFRAC: a discriminative and flexible framework for clustering, Advances in Neural Information Processing Systems (NIPS), 2007. ,
Non-asymptotic analysis of stochastic approximation algorithms for machine learning, Advances in Neural Information Processing Systems (NIPS), 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00608041
Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n), Advances in Neural Information Processing Systems (NIPS), 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00831977
Optimization with Sparsity-Inducing Penalties, Foundations and Trends® in Machine Learning, vol.4, issue.1, pp.1-106, 2012. ,
DOI : 10.1561/2200000015
URL : https://hal.archives-ouvertes.fr/hal-00613125
Statistical Inference under Order Restrictions. The Theory and Application of Isotonic Regression, 1972. ,
Legendre functions and the method of random Bregman projections, J. Convex Anal, vol.4, issue.1, pp.27-67, 1997. ,
Convex Analysis and Monotone Operator Theory in Hilbert Spaces, CMS Books in Mathematics, 2011. ,
DOI : 10.1007/978-3-319-48311-5
URL : https://hal.archives-ouvertes.fr/hal-00643354
A Descent Lemma Beyond Lipschitz Gradient Continuity: First-Order Methods Revisited and Applications, Mathematics of Operations Research, vol.42, issue.2, 2016. ,
DOI : 10.1287/moor.2016.0817
URL : http://publications.ut-capitole.fr/25852/1/25852.pdf
Mirror descent and nonlinear projected subgradient methods for convex optimization, Operations Research Letters, vol.31, issue.3, pp.167-175, 2003. ,
DOI : 10.1016/S0167-6377(02)00231-6
A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, SIAM Journal on Imaging Sciences, vol.2, issue.1, pp.183-202, 2009. ,
DOI : 10.1137/080716542
URL : http://ie.technion.ac.il/%7Ebecka/papers/finalicassp2009.pdf
Sharp oracle inequalities for least squares estimators in shape restricted regression. arXiv preprint, 2015. ,
DOI : 10.1214/17-aos1566
URL : http://arxiv.org/pdf/1510.08029
Private communication, 2016. ,
Sharp oracle bounds for monotone and convex regression through aggregation, J. Mach. Learn. Res, vol.16, pp.1879-1892, 2015. ,
A note on cluster analysis and dynamic programming, Mathematical Biosciences, vol.18, issue.3-4, 1973. ,
DOI : 10.1016/0025-5564(73)90007-2
Lectures on Modern Convex Optimization, MPS Series on Optimization. Society for Industrial and Applied Mathematics, 2001. ,
DOI : 10.1137/1.9780898718829
URL : http://iew3.technion.ac.il/Labs/Opt/opt/LN/Final.pdf
Representation Learning: A Review and New Perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1798-1828, 2013. ,
DOI : 10.1109/TPAMI.2013.50
URL : http://www.cs.princeton.edu/courses/archive/spring13/cos598C/Representation Learning - A Review and New Perspectives.pdf
Adaptive Algorithms and Stochastic Approximations, 1990. ,
DOI : 10.1007/978-3-642-75894-2
Complexity theoretic lower bounds for sparse principal component detection, Proceedings of the International Conference on Learning Theory (COLT), 2013. ,
Bertsekas, Nonlinear Programming, Athena Scientific, 1999.
Some problems on the estimation of unimodal densities, Statist. Sinica, vol.6, issue.1, 1996. ,
Estimation of unimodal densities without smoothness assumptions, The Annals of Statistics, vol.25, issue.3, pp.970-981, 1997. ,
DOI : 10.1214/aos/1069362733
Piecewise-polynomial approximations of functions of the classes $W_{p}^{\alpha}$, Mathematics of the USSR-Sbornik, vol.2, issue.3, pp.331-355, 1967. ,
DOI : 10.1070/SM1967v002n03ABEH002343
In search of non-Gaussian components of a high-dimensional distribution, J. Mach. Learn. Res, vol.7, pp.247-282, 2006. ,
Finding Actors and Actions in Movies, 2013 IEEE International Conference on Computer Vision, 2013. ,
DOI : 10.1109/ICCV.2013.283
URL : https://hal.archives-ouvertes.fr/hal-00904991
Smooth optimization with approximate gradient, SIAM J. Optim, vol.43, issue.3, pp.1266-1292, 2003. ,
Stochastic approximation with two time scales, Systems & Control Letters, vol.29, issue.5, pp.291-294, 1997. ,
DOI : 10.1016/S0167-6911(97)90015-3
Stochastic Approximation: a Dynamical Systems Viewpoint, 2008. ,
Convex Analysis and Nonlinear Optimization, CMS Books in Mathematics, vol.3, 2000. ,
The tradeoffs of large scale learning, Advances in Neural Information Processing Systems (NIPS), 2008. ,
On-line learning for very large data sets, Applied Stochastic Models in Business and Industry, vol.14, issue.2, pp.137-151, 2005. ,
DOI : 10.1007/978-3-642-75894-2
A high-dimensional Wilks phenomenon. Probab. Theory Related Fields, pp.405-433, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00622983
Concentration Inequalities, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00751496
Manopt, a Matlab toolbox for optimization on manifolds, J. Mach. Learn. Res, 2014. ,
On the singularity probability of discrete random matrices, Journal of Functional Analysis, vol.258, issue.2, pp.559-603, 2010. ,
DOI : 10.1016/j.jfa.2009.04.016
Linear time isotonic and unimodal regression in the $L_1$ and $L_\infty$ norms, J. Discrete Algorithms, vol.4, issue.4, 2006. ,
Convex Optimization, 2004. ,
Linear Matrix Inequalities in System and Control Theory, 1994.
The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming, USSR Computational Mathematics and Mathematical Physics, vol.7, issue.3, pp.620-631, 1967. ,
DOI : 10.1016/0041-5553(67)90040-7
Least squares algorithms under unimodality and non-negativity constraints, Journal of Chemometrics, vol.12, issue.4, pp.223-247, 1998. ,
DOI : 10.1002/(SICI)1099-128X(199807/08)12:4<223::AID-CEM511>3.0.CO;2-2
Convex Optimization: Algorithms and Complexity, Foundations and Trends® in Machine Learning, vol.8, issue.3-4, pp.3-4 ,
DOI : 10.1561/2200000050
A geometric alternative to Nesterov's accelerated gradient descent, 2015. ,
Statistical Inference. Statistics/Probability Series, 1990. ,
Méthode générale pour la résolution des systemes d'équations simultanées, Comp. Rend. Sci. Paris, vol.25, pp.536-538, 1847. ,
Prediction, Learning, and Games, 2006.
On the Generalization Ability of On-Line Learning Algorithms, IEEE Transactions on Information Theory, vol.50, issue.9, pp.2050-2057, 2004. ,
DOI : 10.1109/TIT.2004.833339
The Convex Geometry of Linear Inverse Problems, Foundations of Computational Mathematics, vol.1, issue.10, pp.805-849, 2012. ,
DOI : 10.1007/978-1-4613-8431-1
A new perspective on least squares under convex constraint, The Annals of Statistics, vol.42, issue.6, pp.2340-2381 ,
DOI : 10.1214/14-AOS1254
Matrix estimation by Universal Singular Value Thresholding, The Annals of Statistics, vol.43, issue.1, pp.177-214, 2015. ,
DOI : 10.1214/14-AOS1272
Adaptive risk bounds in unimodal regression. arXiv preprint, 2015. ,
On estimation in tournaments and graphs under monotonicity constraints. arXiv preprint, 2016. ,
On risk bounds in isotonic and other shape restricted regression problems, The Annals of Statistics, vol.43, issue.4, pp.1774-1800, 2015. ,
DOI : 10.1214/15-AOS1324SUPP
On matrix estimation under monotonicity constraints, Bernoulli, vol.24, issue.2, 2017. ,
DOI : 10.3150/16-BEJ865
Convergence Analysis of a Proximal-Like Minimization Algorithm Using Bregman Functions, SIAM Journal on Optimization, vol.3, issue.3, pp.538-543, 1993. ,
DOI : 10.1137/0803026
On a Stochastic Approximation Method, The Annals of Mathematical Statistics, vol.25, issue.3, pp.463-483, 1954. ,
DOI : 10.1214/aoms/1177728716
Functional Analysis, Calculus of Variations and Optimal Control, Graduate Texts in Mathematics, vol.264, 2013. ,
DOI : 10.1007/978-1-4471-4820-3
URL : https://hal.archives-ouvertes.fr/hal-00865914
Gossip dual averaging for decentralized optimization of pairwise functions, Proceedings of the Conference on Machine Learning (ICML), 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01329315
Minimax rates in permutation estimation for feature matching, J. Mach. Learn. Res, vol.17, issue.6, pp.1-32, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-00874514
Proximal Splitting Methods in Signal Processing, Fixed-Point Algorithms for Inverse Problems in Science and Engineering, pp.185-212, 2011. ,
DOI : 10.1007/978-1-4419-9569-8_10
URL : https://hal.archives-ouvertes.fr/hal-00643807
Better mini-batch algorithms via accelerated gradient methods, Advances in Neural Information Processing Systems (NIPS), 2011. ,
Nearest neighbor pattern classification, IEEE Transactions on Information Theory, vol.13, issue.1, pp.21-27, 1967. ,
DOI : 10.1109/TIT.1967.1053964
URL : http://ssg.mit.edu/cal/abs/2000_spring/np_dens/classification/cover67.pdf
Elements of Information Theory, 2006. ,
DOI : 10.1002/047174882x
Gradient methods of maximization, Pacific Journal of Mathematics, vol.5, issue.1, pp.33-50, 1955. ,
DOI : 10.2140/pjm.1955.5.33
URL : http://msp.org/pjm/1955/5-1/pjm-v5-n1-p03-s.pdf
The method of steepest descent for non-linear minimization problems, Quarterly of Applied Mathematics, vol.2, issue.3, pp.258-261, 1944. ,
DOI : 10.1090/qam/10667
URL : https://www.ams.org/qam/1944-02-03/S0033-569X-1944-10667-3/S0033-569X-1944-10667-3.pdf
Aggregation of affine estimators, Electronic Journal of Statistics, vol.8, issue.1, pp.302-327, 2014. ,
DOI : 10.1214/14-EJS886
URL : http://doi.org/10.1214/14-ejs886
Theoretical guarantees for approximate sampling from a smooth and log-concave density. To appear in JRSS B, arXiv preprint arXiv:1412.7392, 2014. ,
DOI : 10.1111/rssb.12183
URL : http://arxiv.org/pdf/1412.7392
Learning k-modal distributions via testing, Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), 2012. ,
DOI : 10.1137/1.9781611973099.108
URL : https://epubs.siam.org/doi/pdf/10.1137/1.9781611973099.108
Testing k-modal distributions: Optimal algorithms via reductions, Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), 2013. ,
DOI : 10.1137/1.9781611973105.131
URL : https://epubs.siam.org/doi/pdf/10.1137/1.9781611973105.131
Smooth Optimization with Approximate Gradient, SIAM Journal on Optimization, vol.19, issue.3, pp.1171-1183, 2008. ,
DOI : 10.1137/060676386
Experimental tests of a stochastic decision theory. Measurement: Definitions and theories, 1959. ,
Convex methods for transduction, Advances in Neural Information Processing Systems (NIPS), 2003. ,
Discriminative cluster analysis, Proceedings of the 23rd international conference on Machine learning, ICML '06, 2006. ,
DOI : 10.1145/1143844.1143875
SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives, Advances in Neural Information Processing Systems (NIPS), 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01016843
Averaged least-mean-squares: bias-variance trade-offs and optimal sampling distributions, Proceedings of the International Conference on Artificial Intelligence and Statistics, 2015. ,
Optimal distributed online prediction using mini-batches, J. Mach. Learn. Res, vol.13, pp.165-202, 2012. ,
First-order methods with inexact oracle: the strongly convex case, CORE Discussion Papers, 2013. ,
First-order methods of smooth convex optimization with inexact oracle, Mathematical Programming, vol.110, issue.3, pp.37-75, 2014. ,
DOI : 10.1007/978-3-642-82118-9
A Probabilistic Theory of Pattern Recognition, Applications of Mathematics, vol.31, 1996. ,
DOI : 10.1007/978-1-4612-0711-5
Sparse non Gaussian component analysis by semidefinite programming, Machine Learning, vol.290, issue.2, pp.211-238, 2013. ,
DOI : 10.1007/978-1-4757-2545-2
URL : https://hal.archives-ouvertes.fr/hal-00978264
Nonparametric stochastic approximation with large step-sizes, The Annals of Statistics, vol.44, issue.4, pp.1363-1399, 2015. ,
DOI : 10.1214/15-AOS1391
URL : https://hal.archives-ouvertes.fr/hal-01053831
Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression. arXiv preprint, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01275431
Adaptive dimension reduction using discriminant analysis and K-means clustering, Proceedings of the Conference on Machine Learning (ICML), p.265, 2007. ,
DOI : 10.1145/1273496.1273562
URL : http://www.cs.fiu.edu/~taoli/pub/Ding-Li-ICML2007.pdf
Gelfand n-widths and the method of least squares, Statistics Technical Report, vol.282, 1990. ,
Message-passing algorithms for compressed sensing, Proceedings of the National Academy of Sciences, pp.18914-18919, 2009. ,
DOI : 10.1080/14786437708235992
URL : http://www.pnas.org/content/106/45/18914.full.pdf
Note on a paper of Halanay on stability for finite difference equations, Archive for Rational Mechanics and Analysis, vol.12, issue.3, pp.241-243, 1965. ,
DOI : 10.1007/BF00281223
Local asymptotics for some stochastic optimization problems: optimality, constraint identification, and dual averaging, 2016. ,
Efficient online and batch learning using forward backward splitting, J. Mach. Learn. Res, vol.10, pp.2899-2934, 2009. ,
Composite objective mirror descent, Proceedings of the International Conference on Learning Theory (COLT), 2010. ,
Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling, IEEE Transactions on Automatic Control, vol.57, issue.3, pp.592-606, 2012. ,
DOI : 10.1109/TAC.2011.2161027
URL : http://arxiv.org/pdf/1005.2012
Random Iterative Models, 1997. ,
DOI : 10.1007/978-3-662-12880-0
Non-asymptotic convergence analysis for the unadjusted Langevin algorithm, Ann. Appl. Prob, 2017. ,
DOI : 10.1214/16-aap1238
URL : https://hal.archives-ouvertes.fr/hal-01176132
Stochastic gradient Richardson-Romberg Markov chain Monte Carlo, Advances in Neural Information Processing Systems (NIPS), 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01354064
Maximum likelihood estimation of smooth monotone and unimodal densities, Ann. Statist, vol.28, issue.3, 2000. ,
Methods of solution of nonlinear extremal problems, Cybernetics, vol.2, issue.4, pp.1-14, 1966. ,
DOI : 10.1007/BF01071403
The method of generalized stochastic gradients and stochastic quasi-Fejér sequences, Kibernetika, issue.2, pp.73-83, 1969. ,
On Asymptotic Normality in Stochastic Approximation, The Annals of Mathematical Statistics, vol.39, issue.4, pp.1327-1332, 1968. ,
DOI : 10.1214/aoms/1177698258
URL : http://doi.org/10.1214/aoms/1177698258
Binary choice probabilities: on the varieties of stochastic transitivity, Journal of Mathematical Psychology, vol.10, issue.4, 1973. ,
DOI : 10.1016/0022-2496(73)90021-7
Convex Relaxations for Permutation Problems, Advances in Neural Information Processing Systems (NIPS), 2013. ,
DOI : 10.1137/130947362
URL : https://hal.archives-ouvertes.fr/hal-01239317
Statistical Models: Theory and Practice, 2009. ,
DOI : 10.1017/CBO9780511815867
Projection Pursuit Regression, Journal of the American Statistical Association, vol.4, issue.376, pp.817-823, 1981. ,
DOI : 10.1080/03610927508827223
Improved approximation algorithms for MAX k-CUT and MAX BISECTION, Integer Programming and Combinatorial Optimization, 1995. ,
DOI : 10.1007/3-540-59408-6_37
URL : http://karush.rutgers.edu/~alizadeh/Sdppage/Frieze/k_cut.ps
Unimodal regression, Journal of the Royal Statistical Society. Series D, vol.35, issue.4, pp.479-485, 1986. ,
Un-regularizing: Approximate proximal point and faster stochastic algorithms for empirical risk minimization, Proceedings of the Conference on Machine Learning (ICML), 2015. ,
Incidence matrices with the consecutive 1's property, Bulletin of the American Mathematical Society, vol.70, issue.5, pp.681-684, 1964. ,
DOI : 10.1090/S0002-9904-1964-11160-5
URL : http://www.ams.org/bull/1964-70-05/S0002-9904-1964-11160-5/S0002-9904-1964-11160-5.pdf
Rate-optimal graphon estimation, The Annals of Statistics, vol.43, issue.6, pp.2624-2652 ,
DOI : 10.1214/15-AOS1354SUPP
URL : http://arxiv.org/pdf/1410.5837
Some simplified NP-complete graph problems, Theoretical Computer Science, vol.1, issue.3, pp.237-267, 1976. ,
DOI : 10.1016/0304-3975(76)90059-1
URL : https://doi.org/10.1016/0304-3975(76)90059-1
Matrix completion has no spurious local minimum, Advances in Neural Information Processing Systems (NIPS), 2016. ,
Algorithm AS 257: Isotonic Regression for Umbrella Orderings, Applied Statistics, vol.39, issue.3, pp.397-402, 1990. ,
DOI : 10.2307/2347399
The robustness of the p-norm algorithms, Proceedings of the twelfth annual conference on Computational learning theory, COLT '99, 1999. ,
DOI : 10.1145/307400.307405
Flinders Petrie, the travelling salesman problem, and the beginning of mathematical modeling in archaeology, Extra volume: Optimization stories, pp.199-210, 2012. ,
Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming, Journal of the ACM, vol.42, issue.6, pp.1115-1145, 1995. ,
DOI : 10.1145/227683.227684
URL : http://www.almaden.ibm.com/cs/people/dpw/Cut/maxcut.ps
Cauchy's method of minimization, Numerische Mathematik, vol.4, issue.3, pp.146-150, 1962. ,
DOI : 10.1007/BF01386306
Matrix Computations. Johns Hopkins Studies in the Mathematical Sciences, 2013. ,
Regret bounds for prediction problems, Proceedings of the twelfth annual conference on Computational learning theory, COLT '99, 1999. ,
DOI : 10.1145/307400.307410
URL : http://www.cs.cmu.edu/Groups/reinforcement/mosaic/talks-1999/99-09-27.paper.ps.gz
Minimum Spanning Trees and Single Linkage Cluster Analysis, Applied Statistics, vol.18, issue.1, 1969. ,
DOI : 10.2307/2346439
Graph Implementations for Nonsmooth Convex Programs, Recent Advances in Learning and Control, Lecture Notes in Control and Information Sciences, pp.95-110, 2008. ,
DOI : 10.1007/978-1-84800-155-8_7
URL : http://www.stanford.edu/~boyd/papers/pdf/graph_dcp.pdf
CVX: Matlab Software for Disciplined Convex Programming, 2014.
A convex relaxation for weakly supervised relation extraction, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014. ,
DOI : 10.3115/v1/D14-1166
URL : https://hal.archives-ouvertes.fr/hal-01080310
Smoothing Spline ANOVA Models, 2013. ,
On the Averaged Stochastic Approximation for Linear Regression, SIAM Journal on Control and Optimization, vol.34, issue.1, pp.31-61, 1996. ,
DOI : 10.1137/S0363012992226661
A Distribution-Free Theory of Nonparametric Regression, 2006. ,
DOI : 10.1007/b97848
Why random reshuffling beats stochastic gradient descent. arXiv preprint, 2015. ,
Über die Anwendung der Methode von Ljapunov auf Differenzengleichungen, Mathematische Annalen, vol.63, issue.5, pp.430-441, 1958. ,
DOI : 10.1007/BF01347793
Quelques questions de la théorie de la stabilité pour les systèmes aux différences finies, Archive for Rational Mechanics and Analysis, vol.64, issue.2, pp.150-154, 1963. ,
DOI : 10.1115/1.3662605
Approximation to Bayes risk in repeated play, Contributions to the Theory of Games, pp.97-139, 1957. ,
DOI : 10.1515/9781400882151-006
On the uniform convexity of $L^p$ and $l^p$, Arkiv för Matematik, vol.3, issue.3, pp.239-244, 1956. ,
DOI : 10.1007/BF02589410
The Elements of Statistical Learning, 2009. ,
The convex optimization approach to regret minimization. Optimization for Machine Learning, pp.287-303, 2012. ,
Extracting certainty from uncertainty: regret bounded by variation in costs, Machine Learning, vol.56, issue.2, 2010. ,
DOI : 10.1007/s10994-010-5175-x
URL : https://link.springer.com/content/pdf/10.1007%2Fs10994-010-5175-x.pdf
A non-generative framework and convex relaxations for unsupervised learning, Advances in Neural Information Processing Systems (NIPS), 2016. ,
Logarithmic regret algorithms for online convex optimization, Mach. Learn, vol.69, issue.2-3, 2007. ,
DOI : 10.1007/11776420_37
URL : http://www.cs.princeton.edu/~satyen/papers/HKKA2006.pdf
Methods of conjugate gradients for solving linear systems, Journal of Research of the National Bureau of Standards, vol.49, issue.6, pp.409-436, 1952. ,
DOI : 10.6028/jres.049.044
URL : http://doi.org/10.6028/jres.049.044
Fundamentals of Convex Analysis, Grundlehren Text Editions, 2001. ,
DOI : 10.1007/978-3-642-56468-0
Application of ridge analysis to regression problems, Chemical Engineering Progress, vol.58, issue.3, pp.54-59, 1962. ,
Ridge Regression: Biased Estimation for Nonorthogonal Problems, Technometrics, vol.24, issue.1, pp.55-67, 1970. ,
DOI : 10.2307/1909769
A tail inequality for quadratic forms of subgaussian random vectors, Electronic Communications in Probability, vol.17, issue.0, 2012. ,
DOI : 10.1214/ECP.v17-2079
Random Design Analysis of Ridge Regression, Foundations of Computational Mathematics, vol.17, issue.36, pp.569-600, 2014. ,
DOI : 10.1162/0899766054323008
Accelerated gradient methods for stochastic optimization and online learning, Advances in Neural Information Processing Systems (NIPS), 2009. ,
Maximin separation probability clustering, Proceedings of the AAAI Conference on Artificial Intelligence, 2015. ,
Independent Component Analysis, 2004. ,
Parallelizing stochastic approximation through mini-batching and tail-averaging. arXiv preprint, 2016. ,
Accelerating stochastic gradient descent, 2017. ,
Provable efficient online matrix completion via non-convex stochastic gradient descent, Advances in Neural Information Processing Systems (NIPS), 2016. ,
Accelerating stochastic gradient descent using predictive variance reduction, Advances in Neural Information Processing Systems (NIPS), p.269, 2013. ,
A convex relaxation for weakly supervised classifiers, Proceedings of the Conference on Machine Learning (ICML), 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00717450
Discriminative clustering for image co-segmentation, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010. ,
DOI : 10.1109/CVPR.2010.5539868
URL : http://www.di.ens.fr/%7Efbach/cosegmentation_cvpr2010.pdf
Efficient optimization for discriminative latent class models, Advances in Neural Information Processing Systems (NIPS), 2010. ,
Low-Rank Optimization on the Cone of Positive Semidefinite Matrices, SIAM Journal on Optimization, vol.20, issue.5, 2010. ,
DOI : 10.1137/080731359
Functional aggregation for nonparametric regression, Ann. Statist, vol.28, issue.3, pp.681-712, 2000. ,
Efficient algorithms for online decision problems, Journal of Computer and System Sciences, vol.71, issue.3, pp.291-307, 2005. ,
DOI : 10.1016/j.jcss.2004.10.016
URL : http://www-math.mit.edu/~vempala/papers/online.ps
Efficiently learning mixtures of two Gaussians, Proceedings of the 42nd ACM symposium on Theory of computing, STOC '10, 2010. ,
DOI : 10.1145/1806689.1806765
URL : http://people.csail.mit.edu/moitra/docs/2g-full.pdf
Lyapunov functions for the problem of Lur'e in automatic control, Proc. Nat. Acad. Sci. U.S.A, pp.201-205, 1963. ,
DOI : 10.1073/pnas.49.2.201
Control System Analysis and Design Via the "Second Method" of Lyapunov: II - Discrete-Time Systems, Journal of Basic Engineering, vol.82, issue.2, pp.394-400, 1960. ,
DOI : 10.1115/1.3662605
On an effective method of solving extremal problems for quadratic functionals, Dokl. Akad. Nauk SSSR, vol.48, pp.455-460, 1945. ,
An algorithm for finding a circuit of even length in a directed graph, International Journal of Systems Science, vol.10, issue.11, pp.1197-1201, 1984. ,
DOI : 10.1137/0210062
Reducibility among combinatorial problems, Complexity of Computer Computations, pp.85-103, 1972. ,
DOI : 10.1007/978-3-540-68279-0_8
A statistical approach to Flinders Petrie's sequence-dating, Bull. Inst. Internat. Statist, vol.40, pp.657-681, 1963. ,
Incidence matrices, interval graphs and seriation in archeology, Pacific Journal of Mathematics, vol.28, issue.3, pp.565-570, 1969. ,
DOI : 10.2140/pjm.1969.28.565
URL : http://msp.org/pjm/1969/28-3/pjm-v28-n3-p08-s.pdf
A Mathematical Approach to Seriation, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol.269, issue.1193, pp.125-134, 1970. ,
DOI : 10.1098/rsta.1970.0091
Abundance matrices and seriation in archaeology, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, vol.1, issue.2, pp.104-112, 1971. ,
DOI : 10.1007/BF00538862
Polynomial algorithms in linear programming, USSR Computational Mathematics and Mathematical Physics, vol.20, issue.1, pp.1093-1096, 1979. ,
DOI : 10.1016/0041-5553(80)90061-0
Adam: A method for stochastic optimization, Proceedings of the international conference on learning representations (ICLR), 2015. ,
Exponentiated Gradient versus Gradient Descent for Linear Predictors, Information and Computation, vol.132, issue.1, pp.1-63, 1997. ,
DOI : 10.1006/inco.1996.2612
URL : https://doi.org/10.1006/inco.1996.2612
Proximal Minimization Methods with Generalized Bregman Functions, SIAM Journal on Control and Optimization, vol.35, issue.4, pp.1142-1168, 1997. ,
DOI : 10.1137/S0363012995281742
The Sample Average Approximation Method for Stochastic Discrete Optimization, SIAM Journal on Optimization, vol.12, issue.2, 2002. ,
DOI : 10.1137/S1052623499363220
Unimodal regression using Bernstein-Schoenberg splines and penalties, Biometrics, vol.67, issue.4, 2014. ,
DOI : 10.1111/j.1541-0420.2011.01620.x