Habilitation à diriger des recherches

Observation d'exécutions parallèles (Observation of Parallel Executions)

Abstract : The research presented in this document aims to design and implement tools that help programmers debug both the correctness and the performance of parallel applications running on medium- to large-scale clusters of symmetric multiprocessor nodes. Parallel programs are executed by a dynamically evolving network of communicating threads: within a node, threads communicate through shared memory, while threads on different nodes communicate by message passing. The work addresses two main problems. The first is the identification of transient errors, which arise from the nondeterminism of the programming model; it is addressed by adapting execution replay techniques to the communicating-threads programming model. The second is the complexity of the dynamic behavior of parallel program executions, which makes them difficult to understand and their errors difficult to find. An interactive, scalable, and extensible visualization tool, based on execution trace analysis, helps programmers understand the dynamic behavior of parallel programs built from communicating threads. Finally, several issues raised by integrating these tools into a coherent debugging environment are pointed out, and possible solutions are sketched.
Document type : Habilitation à diriger des recherches

Cited literature [102 references]
Contributor : Thèses Imag
Submitted on : Tuesday, February 17, 2004 - 11:16:16 AM
Last modification on : Friday, November 6, 2020 - 4:39:41 AM
Long-term archiving on : Friday, February 11, 2011 - 4:30:24 PM


  • HAL Id : tel-00004711, version 1




Jacques Chassin de Kergommeaux. Observation d'exécutions parallèles. Other [cs.OH]. Institut National Polytechnique de Grenoble - INPG, 2000. ⟨tel-00004711⟩


