Theses

Contribution to the Emergence of New Intelligent Parallel and Distributed Methods Using a Multi-level Programming Paradigm for Extreme Computing

Abstract : Krylov iterative methods are frequently used on High-Performance Computing (HPC) systems to solve the extremely large sparse linear systems and eigenvalue problems that arise in science and engineering. As both the number of computing units and the heterogeneity of supercomputers increase, the time spent in global communication and synchronization severely degrades the parallel performance of iterative methods. Programming on supercomputers is becoming increasingly distributed and parallel. Algorithm development should therefore follow these principles: 1) multi-granularity parallelism; 2) hierarchical memory; 3) minimization of global communication; 4) promotion of asynchronicity; 5) multi-level scheduling strategies and manager engines to handle heavy communication traffic and improve fault tolerance. In response to these goals, we present a distributed and parallel multi-level programming paradigm for Krylov methods on HPC platforms. The first part of our work focuses on the implementation of a scalable matrix generator that creates test matrices with customised eigenvalues for benchmarking iterative methods on supercomputers. In the second part, we study the numerical and parallel performance of the proposed distributed and parallel iterative method. Its implementation with a manager engine and runtime can handle heavy communication traffic and provides fault tolerance and reusability. In the third part, an auto-tuning scheme is introduced for the smart selection of the method's parameters at runtime. Finally, we analyse the possibility of implementing the distributed and parallel paradigm with a graph-based workflow runtime environment.

https://tel.archives-ouvertes.fr/tel-02446813
Contributor : Xinzhe Wu
Submitted on : Tuesday, January 21, 2020 - 10:47:20 AM
Last modification on : Tuesday, September 29, 2020 - 12:24:07 PM
Long-term archiving on : Wednesday, April 22, 2020 - 4:56:04 PM

File

50376-2019-Xinzhe_Wu.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-02446813, version 1

Citation

Xinzhe Wu. Contribution to the Emergence of New Intelligent Parallel and Distributed Methods Using a Multi-level Programming Paradigm for Extreme Computing. Distributed, Parallel, and Cluster Computing [cs.DC]. Université de Lille 1, Sciences et Technologies; CRIStAL UMR 9189, 2019. English. ⟨tel-02446813⟩

