Skip to Main content Skip to Navigation
Theses

Laplacian Powers for Graph-Based Semi-Supervised Learning

Esteban Bautista Ruiz 1, 2
2 DANTE - Dynamic Networks : Temporal and Structural Capture Approach
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme, IXXI - Institut Rhône-Alpin des systèmes complexes
Abstract : Graph-Based Semi-Supervised Learning (G-SSL) techniques learn from both labelled and unla- belled data to build better classifiers. Despite successes, its performance can still be improved, particularly in cases of graphs with unclear clusters or unbalanced labelled datasets. To ad- dress such limitations, the main contribution of this dissertation is a novel method for G-SSL referred to as the Lγ -PageRank method. It consists of a generalization of the PageRank algo- rithm based on the positive γ-th powers of the graph Laplacian matrix. The theoretical study of Lγ -PageRank shows that (i) for γ < 1, it corresponds to an extension of the PageRank algo- rithm to L´evy processes: where random walkers can now perform far-distant jumps in a single step; and (ii) for γ > 1, it operates on signed graphs: where nodes belonging to one same class are more likely to share positive edges while nodes from different classes are more likely to be connected with negative edges. We show the existence of an optimal γ-th power that maximizes performance, for which a method for its automatic estimation is devised and assessed. Exper- iments on several datasets demonstrate that the L´evy flight random walkers can enhance the detection of classes with complex local structures and that the signed graphs can significantly improve the separability of data and also override the issue of unbalanced labelled data. In addition, we study efficient implementations of Lγ -PageRank. Extensions of Power Iteration and Gauss-Southwell, successful algorithms to efficiently compute the solution of the standard PageRank algorithm, are derived for Lγ -PageRank. Moreover, the dynamic versions of Power Iteration and Gauss-Southwell, which can update the solution of standard PageRank in sub- linear complexity when the graph evolves or new data arrive, are also extended to Lγ -PageRank. Lastly, we apply Lγ -PageRank in the context of Internet routing. We address the problem of identifying the Autonomous Systems (AS) of inter-AS links from the network of IP addresses and AS public registers. Experiments on tracerout measurements collected from the Internet show that Lγ -PageRank can solve this inference task with no errors, even when the expert does not provide labelled examples of all classes.
Complete list of metadatas

Cited literature [151 references]  Display  Hide  Download

https://tel.archives-ouvertes.fr/tel-02476246
Contributor : Abes Star :  Contact
Submitted on : Wednesday, February 12, 2020 - 3:30:08 PM
Last modification on : Thursday, February 13, 2020 - 1:35:40 AM
Document(s) archivé(s) le : Wednesday, May 13, 2020 - 5:10:43 PM

File

BAUTISTA_RUIZ_Esteban_2019LYSE...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02476246, version 1

Citation

Esteban Bautista Ruiz. Laplacian Powers for Graph-Based Semi-Supervised Learning. Artificial Intelligence [cs.AI]. Université de Lyon, 2019. English. ⟨NNT : 2019LYSEN081⟩. ⟨tel-02476246⟩

Share

Metrics

Record views

87

Files downloads

118