
Sur deux problèmes d’apprentissage automatique : la détection de communautés et l’appariement adaptatif (On two machine learning problems: community detection and adaptive matching)

Abstract: In this thesis, we study two machine learning problems: (I) community detection and (II) adaptive matching.
(I) Many networks are well known to exhibit a community structure, and finding these communities helps us understand and exploit general networks. We focus on community detection using so-called spectral methods, which are based on the eigenvectors of carefully chosen matrices, and we analyse their performance on artificially generated benchmark graphs. Instead of the classical Stochastic Block Model, which does not allow for much degree heterogeneity, we consider a Degree-Corrected Stochastic Block Model (DC-SBM) with weighted vertices that can generate a wide class of degree sequences. We consider this model in both a dense and a sparse regime. In the dense regime, we show that an algorithm based on a suitably normalized adjacency matrix correctly classifies all but a vanishing fraction of the nodes. In the sparse regime, we show that the availability of only a small amount of information entails the existence of an information-theoretic threshold below which no algorithm performs better than random guessing. On the positive side, we show that an algorithm based on the non-backtracking matrix works all the way down to the detectability threshold in the sparse regime, which demonstrates the robustness of the algorithm. This result follows from a precise characterization of the non-backtracking spectrum of sparse DC-SBMs. We further perform tests on well-known real networks.
(II) Online two-sided matching markets, such as Q&A forums and online labour platforms, critically rely on the ability to propose adequate matches based on imperfect knowledge of the two parties to be matched. We develop a model of a task/server matching system for efficient platform operation in the presence of such uncertainty. For this model, we give a necessary and sufficient condition for an incoming stream of tasks to be manageable by the system. We further identify a so-called back-pressure policy that maximizes the throughput the system can handle, and we show that this policy achieves strictly larger throughput than a natural greedy policy. Finally, we validate our model and confirm our theoretical findings with experiments based on user-contributed content on an online platform.

Cited literature: 128 references
Contributor: Abes Star
Submitted on : Wednesday, July 11, 2018 - 10:00:13 AM
Last modification on : Thursday, July 1, 2021 - 5:58:03 PM
Long-term archiving on : Friday, October 12, 2018 - 2:15:03 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01834967, version 1



Lennart Gulikers. Sur deux problèmes d’apprentissage automatique : la détection de communautés et l’appariement adaptatif. Machine Learning [stat.ML]. Université Paris sciences et lettres, 2017. French. ⟨NNT : 2017PSLEE062⟩. ⟨tel-01834967⟩


