Machine learning based on Hawkes processes and stochastic optimization

Abstract: The common thread of this thesis is the study of Hawkes processes. These point processes capture the cross-causality that occurs across several event series: they retrieve the influence that the events of one series have on the future events of all series. For example, in the context of social networks, they describe how likely an action of a given user (such as a tweet) is to trigger reactions from the others.

The first chapter consists of a general introduction to point processes, followed by a focus on Hawkes processes and, more specifically, on the properties of the widely used exponential kernel parametrization.

In the following chapter, we introduce an adaptive penalization technique to model information propagation on social networks with Hawkes processes. This penalization takes into account prior knowledge of the social network's characteristics, such as the sparse interactions between users or the community structure, and reflects them in the estimated model. Our technique uses data-driven weighted penalties derived from a careful analysis of the generalization error.

Next, we focus on convex optimization and review the recent progress made with stochastic first-order methods using variance reduction techniques. The fourth chapter is dedicated to adapting these techniques to optimize the most commonly used goodness-of-fit criterion for Hawkes processes. Indeed, this goodness-of-fit does not satisfy the gradient-Lipschitz assumption required by the latest first-order methods. We therefore work under another smoothness assumption and obtain a linear convergence rate for a shifted version of Stochastic Dual Coordinate Ascent that improves on the current state of the art. Moreover, such objectives include many linear constraints that are easily violated by classic first-order algorithms, but these constraints are easier to handle in the Fenchel dual problem. Hence, our algorithm's robustness is comparable to that of second-order methods, which are very expensive in high dimensions.

Finally, the last chapter introduces a new statistical learning library for Python 3 with a particular emphasis on time-dependent models, tools for generalized linear models, and survival analysis. Called tick, this library relies on a C++ implementation and state-of-the-art optimization algorithms to provide very fast computations in a single-node, multi-core setting. Open-sourced and published on GitHub, this library has been used throughout this thesis to run benchmarks and experiments.
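
For context, the exponential kernel parametrization discussed in the first chapter is usually written as follows; this is the standard textbook formulation, not a verbatim excerpt from the thesis, and the notation (mu, alpha, beta) is assumed here:

    \lambda_i(t) = \mu_i + \sum_{j=1}^{D} \sum_{t_k^j < t} \alpha_{ij} \, \beta_{ij} \, e^{-\beta_{ij} (t - t_k^j)}

where \mu_i is the baseline intensity of node i, \alpha_{ij} quantifies how strongly events of node j excite node i, and \beta_{ij} is the exponential decay rate of that influence.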
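
As an illustration of the library described in the last chapter, the minimal Python sketch below simulates and then fits an exponential-kernel Hawkes model with tick. The class names SimuHawkesExpKernels and HawkesExpKern follow the library's public API, but exact parameter names and defaults may differ across versions; this is a usage sketch, not an excerpt from the thesis.

    import numpy as np
    from tick.hawkes import SimuHawkesExpKernels, HawkesExpKern

    # 2-node Hawkes process with exponential kernels
    baseline = np.array([0.4, 0.3])           # exogenous intensities mu_i
    adjacency = np.array([[0.3, 0.0],         # alpha_ij: influence of node j on node i
                          [0.4, 0.2]])
    decays = 2.5                              # beta: exponential decay rate (assumed scalar here)

    # Simulate event timestamps up to time T = 10000
    simu = SimuHawkesExpKernels(adjacency=adjacency, decays=decays,
                                baseline=baseline, end_time=10000,
                                seed=42, verbose=False)
    simu.simulate()

    # Fit an exponential-kernel Hawkes model by maximizing the log-likelihood
    learner = HawkesExpKern(decays=decays, gofit='likelihood')
    learner.fit(simu.timestamps)
    print(learner.baseline)    # estimated mu_i
    print(learner.adjacency)   # estimated alpha_ij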

Cited literature: 114 references

https://tel.archives-ouvertes.fr/tel-02316143
Contributor: Abes Star
Submitted on: Tuesday, October 15, 2019 - 10:17:40 AM
Last modification on: Monday, February 3, 2020 - 1:28:06 AM
Document(s) archived on: Thursday, January 16, 2020 - 2:27:07 PM

File

71053_BOMPAIRE_2019_archivage....
Version validated by the jury (STAR)

Identifiers

  • HAL Id: tel-02316143, version 1

Citation

Martin Bompaire. Machine learning based on Hawkes processes and stochastic optimization. Other Statistics [stat.ML]. Université Paris-Saclay, 2019. English. ⟨NNT : 2019SACLX030⟩. ⟨tel-02316143⟩

Metrics

Record views: 152
File downloads: 155