Towards Reproducible, Accurately Rounded and Efficient BLAS

Chemseddine Chohra 1
1 DALI - Digits, Architectures et Logiciels Informatiques
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, UPVD - Université de Perpignan Via Domitia
Abstract : Numerical reproducibility failures arise in parallel computation because floating-point summation is non-associative. Massively parallel systems dynamically modify the order of floating-point operations, so numerical results may change from one run to the next. We propose to ensure reproducibility by extending, as far as possible, the IEEE-754 correct rounding property to larger computing sequences. We introduce RARE-BLAS, a reproducible and accurate BLAS library that benefits from recent accurate and efficient summation algorithms. Solutions for level 1 (asum, dot and nrm2) and level 2 (gemv and trsv) routines are designed. Implementations relying on parallel programming APIs (OpenMP, MPI) and SIMD extensions are proposed. Their efficiency is compared to an optimized library (Intel MKL) and to other existing reproducible algorithms.
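The non-associativity mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis and does not use RARE-BLAS: the helper `naive_sum` and the sample values are hypothetical, and the standard-library function `math.fsum` stands in here for a correctly rounded summation. The point is only that a correctly rounded sum does not depend on the order of the operands, whereas a plain left-to-right sum does.

```python
import itertools
import math

# Floating-point addition is not associative: regrouping changes the result.
a, b, c = 1e16, -1e16, 1.0
left = (a + b) + c    # a + b is exactly 0.0, so the final result is 1.0
right = a + (b + c)   # b + c rounds back to -1e16, so the final result is 0.0


def naive_sum(values):
    """Hypothetical helper: plain left-to-right accumulation in binary64."""
    total = 0.0
    for v in values:
        total += v
    return total


# Illustrative data (an assumption, not from the thesis). The exact sum is 2.0.
values = [1e16, 1.0, -1e16, 1.0]

# Every ordering a parallel reduction might produce is modelled here by a
# permutation of the input.
naive_results = {naive_sum(p) for p in itertools.permutations(values)}

# math.fsum returns the correctly rounded sum, so the order cannot matter.
exact_results = {math.fsum(p) for p in itertools.permutations(values)}

print("naive results:", sorted(naive_results))
print("correctly rounded results:", sorted(exact_results))
```

The naive accumulation yields several different values depending on the ordering, while the correctly rounded sum is the same for every permutation, which is the property the abstract proposes to extend to BLAS routines.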

https://tel.archives-ouvertes.fr/tel-02025855
Contributor : Chemseddine Chohra
Submitted on : Tuesday, February 19, 2019 - 8:50:31 PM
Last modification on : Friday, May 17, 2019 - 11:42:16 AM
Long-term archiving on : Monday, May 20, 2019 - 5:53:49 PM

File

These Finale.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-02025855, version 1

Citation

Chemseddine Chohra. Towards Reproducible, Accurately Rounded and Efficient BLAS. Computer Arithmetic. Université de Perpignan Via Domitia (UPVD), 2017. English. ⟨tel-02025855⟩
