New Algorithms for Large-Scale Support Vector Machines

Abstract : The Internet, together with all modern media of communication, information, and entertainment, has brought a massive increase in the quantity of digital data. In domains ranging from network security and information retrieval to online advertising and computational linguistics, automatic methods are needed to organize, classify, or transform terabytes of numerical items. Machine learning research concerns the design and development of algorithms that allow computers to learn from data. A large number of accurate and efficient learning algorithms now exist, and it is appealing to use them to automate increasingly complex tasks, especially when humans have difficulty handling large amounts of data. Unfortunately, most learning algorithms perform well on small databases but cannot be trained on large quantities of data. Hence, there is a strong need for machine learning methods that can learn from millions of training instances and so take advantage of the huge data sources available. We develop these issues in our introduction, in Chapter 1. In this thesis, we propose solutions that reduce the training time and memory requirements of learning algorithms while maintaining high accuracy. In particular, among all machine learning models, we focus on Support Vector Machines (SVMs), which are standard methods used mostly for automatic classification. We describe them extensively in Chapter 2.
Throughout this dissertation, we propose several original algorithms for learning SVMs, depending on the final task for which they are intended. First, in Chapter 3, we study the learning process of Stochastic Gradient Descent for the particular case of linear SVMs. This leads us to define and validate the new SGD-QN algorithm. Then we introduce a new learning principle: the Process/Reprocess strategy. We present three algorithms implementing it. The Huller and LaSVM, discussed in Chapter 4, are designed for training SVMs for binary classification. For the more complex task of structured output prediction, we substantially refine LaSVM; this results in the LaRank algorithm, which is detailed in Chapter 5. Finally, Chapter 6 introduces the original framework of learning under ambiguous supervision, which we apply to the task of semantic parsing of natural language. Each algorithm introduced in this thesis achieves state-of-the-art performance, especially in terms of training speed. Almost all of them have been published in international peer-reviewed journals or conference proceedings. Corresponding implementations have also been released. We keep the description of our methods as generic as possible to ease the design of further derivations. Indeed, many directions can be followed to extend the work presented in this dissertation. We list some of them in Chapter 7.
Document type :
Theses

Cited literature : 133 references

https://tel.archives-ouvertes.fr/tel-00464007
Contributor : Antoine Bordes
Submitted on : Monday, March 15, 2010 - 4:40:52 PM
Last modification on : Friday, March 22, 2019 - 1:31:04 AM
Long-term archiving on : Friday, October 19, 2012 - 9:50:32 AM

Identifiers

  • HAL Id : tel-00464007, version 1

Citation

Antoine Bordes. New Algorithms for Large-Scale Support Vector Machines. Computer Science [cs]. Université Pierre et Marie Curie - Paris VI, 2010. English. ⟨tel-00464007⟩

Metrics

Record views : 518
File downloads : 1525