De la localité logicielle à la localité matérielle sur les architectures à mémoire partagée, hétérogène et non-uniforme

Abstract : Over the years, the memory hierarchy of High Performance Computing (HPC) systems has grown more complex. Today, large-scale machines typically embed several levels of caches and a distributed memory. More recently, on-chip memories and non-volatile PCIe-based flash have entered the HPC landscape. This memory architecture is a necessary evil: it enables high performance, but at the cost of careful task and data placement. Hardware-managed caches used to hide tedious locality optimizations from the programmer. Now, data locality, whether in local or remote memory, fast or slow memory, volatile or non-volatile memory, with small or large capacity, is entirely software-manageable. This extra flexibility gives application designers more freedom, but makes their work more complex and expensive. Indeed, when managing task and data placement, one has to account for several complex trade-offs between memory performance, size and features. This thesis was jointly supervised by Atos Bull Technologies and Inria Bordeaux – Sud-Ouest. In this document, we detail contemporary HPC systems and characterize machine performance for several locality scenarios. We explain how programming language semantics affect data locality in the hardware, and thus application performance. Through joint work with the INESC-ID laboratory in Lisbon, we propose an insightful extension to the well-known Roofline performance model that provides locality hints to help improve application performance. We also present a modeling framework that maps platform and application performance events onto the hardware topology in order to extract synthetic locality metrics. Finally, we propose an automatic locality policy selector, built on machine learning algorithms, to easily improve the placement of application tasks and data.

Cited literature [39 references]
Contributor : Abes Star
Submitted on : Wednesday, January 9, 2019 - 11:29:07 AM
Last modification on : Thursday, January 24, 2019 - 1:12:49 AM




  • HAL Id : tel-01917364, version 2


Nicolas Denoyelle. De la localité logicielle à la localité matérielle sur les architectures à mémoire partagée, hétérogène et non-uniforme. Distributed, Parallel, and Cluster Computing [cs.DC]. Université de Bordeaux, 2018. French. ⟨NNT : 2018BORD0201⟩. ⟨tel-01917364v2⟩


