
Solving dense linear systems on accelerated multicore architectures

Adrien Rémy 1
1 ParSys - LRI - Systèmes parallèles (LRI)
LRI - Laboratoire de Recherche en Informatique
Abstract : In this PhD thesis, we study algorithms and implementations that accelerate the solution of dense linear systems on hybrid architectures combining multicore processors and accelerators. We focus on methods based on the LU factorization, and our code development takes place in the context of the MAGMA library. We study different hybrid CPU/GPU solvers based on the LU factorization which aim at reducing the communication overhead due to pivoting. The first one is based on a communication-avoiding pivoting strategy (CALU), while the second uses a random preconditioning of the original system to avoid pivoting (RBT). We show that both of these methods outperform the solver using LU factorization with partial pivoting when implemented on hybrid multicore/GPU architectures. We also present new randomization-based solvers for hybrid architectures with Nvidia GPUs or Intel Xeon Phi coprocessors. With this method, we can avoid the high cost of pivoting while remaining numerically stable in most cases. The highly parallel architecture of these accelerators allows us to perform the randomization of our linear system at a very low computational cost compared to the time of the factorization. Finally, we investigate the impact of non-uniform memory access (NUMA) on the solution of dense general linear systems using an LU factorization algorithm. In particular, we illustrate how an appropriate placement of the threads and data on a NUMA architecture can improve the performance of the panel factorization and consequently accelerate the global LU factorization. We show how these placements can improve the performance when applied to hybrid multicore/GPU solvers.
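Illustration : The random preconditioning idea mentioned in the abstract can be sketched in a few lines of plain Python. This is only a minimal sketch under stated assumptions, not the thesis's MAGMA implementation: it uses a single-level (depth-1) butterfly, builds the butterflies as dense matrices instead of exploiting their structure, and draws the diagonal entries as exp(r/10) with r uniform in [-1/2, 1/2], which is an assumed scaling. All function names (`butterfly`, `solve_lu_nopiv`, `solve_rbt`) are hypothetical. The system A x = b is replaced by (U^T A V) y = U^T b, which is solved by elimination without pivoting, and then x = V y.

```python
import math
import random


def butterfly(n, rng):
    """Dense n x n depth-1 random butterfly (1/sqrt(2)) * [[R0, R1], [R0, -R1]]."""
    assert n % 2 == 0, "this sketch assumes an even dimension"
    h = n // 2
    # Assumed scaling: diagonal entries exp(r/10), r uniform in [-1/2, 1/2].
    r0 = [math.exp(rng.uniform(-0.5, 0.5) / 10) for _ in range(h)]
    r1 = [math.exp(rng.uniform(-0.5, 0.5) / 10) for _ in range(h)]
    s = 1.0 / math.sqrt(2.0)
    B = [[0.0] * n for _ in range(n)]
    for i in range(h):
        B[i][i] = s * r0[i]
        B[i][h + i] = s * r1[i]
        B[h + i][i] = s * r0[i]
        B[h + i][h + i] = -s * r1[i]
    return B


def transpose(X):
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def matvec(X, v):
    return [sum(x * y for x, y in zip(row, v)) for row in X]


def solve_lu_nopiv(A, b):
    """Solve A x = b by Gaussian elimination WITHOUT pivoting."""
    n = len(A)
    M = [list(row) + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for k in range(n):
        if M[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot at step %d" % k)
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def solve_rbt(A, b, seed=0):
    """Randomize with butterflies U and V, then solve without pivoting."""
    rng = random.Random(seed)
    n = len(A)
    U = butterfly(n, rng)
    V = butterfly(n, rng)
    Ut = transpose(U)
    Ar = matmul(matmul(Ut, A), V)          # randomized matrix U^T A V
    y = solve_lu_nopiv(Ar, matvec(Ut, b))  # no pivoting on the randomized system
    return matvec(V, y)                    # recover x = V y


# A nonsingular matrix whose first pivot is zero, so unpivoted LU fails on A itself.
A = [[0.0, 2.0, 1.0, 1.0],
     [1.0, 0.0, 2.0, 1.0],
     [1.0, 1.0, 0.0, 2.0],
     [2.0, 1.0, 1.0, 0.0]]
b = [11.0, 11.0, 11.0, 7.0]  # equals A times [1, 2, 3, 4]
x = solve_rbt(A, b)
```

In this small example, elimination without pivoting breaks down on A because of the zero in the top-left corner, whereas the randomized matrix can be factored directly. The randomization only makes small pivots unlikely rather than impossible, which is consistent with the abstract's wording that the approach remains numerically stable "in most cases".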

Cited literature [43 references]
Contributor : Abes Star
Submitted on : Friday, November 6, 2015 - 4:22:06 PM
Last modification on : Friday, April 10, 2020 - 2:11:52 AM
Document(s) archived on : Monday, February 8, 2016 - 1:00:58 PM


  • HAL Id : tel-01225745, version 1


Adrien Rémy. Solving dense linear systems on accelerated multicore architectures. Hardware Architecture [cs.AR]. Université Paris Sud - Paris XI, 2015. English. ⟨NNT : 2015PA112138⟩. ⟨tel-01225745⟩


