Exploration of parallel graph-processing algorithms on distributed architectures

Abstract : With graph datasets growing steadily across a large number of domains, parallel graph-processing applications deployed on distributed architectures are increasingly needed to cope with the growing demand for memory and compute resources. Though large-scale distributed architectures are available, notably in the High-Performance Computing (HPC) domain, the programming and deployment complexity of such graph-processing algorithms, whose parallelization and complexity are highly data-dependent, hampers usability. Moreover, the difficulty of evaluating the performance behavior of these applications complicates assessing the relevance of the architecture used. With this in mind, this thesis explores graph-processing algorithms on distributed architectures, notably using GraphLab, a state-of-the-art graph-processing framework. Two use cases are considered. For each, a parallel implementation is proposed and deployed on several distributed architectures of varying scales. This study highlights operating ranges, which can be leveraged to select a relevant operating point with respect to the datasets processed and the cluster nodes used. Further study enables a performance comparison of commodity cluster architectures and higher-end compute servers using the two use cases previously developed. This study highlights the particular relevance of using clustered commodity workstations, which are considerably cheaper and simpler with respect to node architecture, over higher-end systems in this application context. The thesis then explores how performance studies help in cluster design for graph processing. In particular, studying the throughput performance of a graph-processing system yields useful insights for further node architecture improvements.
Moreover, this work shows that a more in-depth performance analysis can lead to guidelines for the appropriate sizing of a cluster for a given workload, paving the way toward resource allocation for graph processing. Finally, hardware improvements for next generations of graph-processing servers are proposed and evaluated. A flash-based victim-swap mechanism is proposed for the mitigation of unwanted overloaded operations. Then, the relevance of ARM-based microservers for graph processing is investigated with a port of GraphLab to an NVIDIA TX2-based architecture.

Cited literature [96 references]

https://tel.archives-ouvertes.fr/tel-01800156
Contributor : Abes Star
Submitted on : Friday, May 25, 2018 - 3:40:06 PM
Last modification on : Tuesday, May 28, 2019 - 4:24:27 PM
Archived on : Sunday, August 26, 2018 - 2:01:42 PM

File

These_UTC_Julien_Collet.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01800156, version 1

Citation

Julien Collet. Exploration of parallel graph-processing algorithms on distributed architectures. Other [cs.OH]. Université de Technologie de Compiègne, 2017. English. ⟨NNT : 2017COMP2391⟩. ⟨tel-01800156⟩
