Online stochastic algorithms

Abstract : This thesis addresses three main subjects. The first is online clustering, for which we introduce a new adaptive stochastic algorithm to cluster online datasets. It relies on a quasi-Bayesian approach, with a dynamic (i.e., time-dependent) estimation of the (unknown and changing) number of clusters. We prove that this algorithm has a regret bound of order \sqrt{T \ln T} and is asymptotically minimax under a constraint on the number of clusters. An RJMCMC-flavored implementation is also proposed. The second subject is the sequential learning of principal curves, which seeks to represent a sequence of data by a continuous polygonal curve. To this end, we introduce a procedure based on the MAP of a Gibbs posterior that yields polygonal lines whose number of segments is chosen automatically. We also show that the procedure is supported by regret bounds with sublinear remainder terms. In addition, we present a greedy local search implementation that incorporates both sleeping experts and multi-armed bandit ingredients. The third subject covers practical tasks carried out within iAdvize, the company that supported this thesis. It includes sentiment analysis of textual messages using methods from both text mining and statistics, and the implementation of a chatbot based on natural language processing and neural networks.

https://tel.archives-ouvertes.fr/tel-01970795
Contributor : Le Li
Submitted on : Sunday, January 6, 2019 - 10:51:14 AM
Last modification on : Friday, May 10, 2019 - 12:14:02 PM
Long-term archiving on : Sunday, April 7, 2019 - 12:25:17 PM

File

thèse_LeLI.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-01970795, version 1

Citation

Le Li. Online stochastic algorithms. Machine Learning [stat.ML]. Université d'Angers, 2018. English. ⟨tel-01970795⟩
