Déploiement et contrôle d'applications parallèles sur grappes de grandes tailles [Deployment and control of parallel applications on large clusters]

Abstract : The growing size of workstation clusters raises the problem of scalability for the applications that run on them. This concerns both parallel numerical applications and operational tools (administration, monitoring...). In this thesis, we study the deployment of parallel applications on large clusters, an approach that can be extended to grids. Deployment comprises, on the one hand, launching the parallel program on all nodes and, on the other hand, setting up a communication layer. Efficiency is obtained by overlapping all independent steps of the deployment. This work shows that the problem is equivalent to the well-known single-message broadcast problem. The performance gap between the cost of a network communication and that of a remote execution call enables us to use a work-stealing algorithm to produce a near-optimal schedule of remote execution calls. The good properties and performance of the resulting tool, Taktuk, are demonstrated by its use in several projects, such as KaTools (included in and used by the Clic Mandrake Cluster Linux distribution), OAR (job manager) and Inuktitut (communication layer of the ATHAPASCAN environment).
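The scheduling idea in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Taktuk's actual implementation: the function name `simulate_deployment`, the cost values, and the "steal half of the longest pending list" policy are all hypothetical choices made for illustration. It assumes that a remote execution call is expensive, that a steal (one network message) is cheap, and that every freshly launched node can itself launch others.

```python
import heapq


def simulate_deployment(n_targets, exec_cost=1.0, steal_cost=0.01):
    """Toy model of deployment by work stealing (illustrative only).

    Node 0 is the root and initially owns the whole list of targets.
    Launching one target takes exec_cost; stealing takes steal_cost.
    Returns (makespan, parent), where parent[t] is the node that launched t.
    """
    pending = {0: list(range(1, n_targets + 1))}  # targets each node still owns
    parent = {}
    events = [(0.0, 0)]  # (time at which a node becomes idle, node id)
    makespan = 0.0
    while events:
        now, node = heapq.heappop(events)
        work = pending.setdefault(node, [])
        if not work:
            # Idle node: steal half of the longest pending list (assumed policy).
            victim = max(pending, key=lambda v: len(pending[v]))
            loot = len(pending[victim]) // 2
            if loot == 0:
                continue  # nothing worth stealing, this node retires
            work.extend(pending[victim][-loot:])
            del pending[victim][-loot:]
            now += steal_cost
        target = work.pop()
        done = now + exec_cost
        parent[target] = node
        makespan = max(makespan, done)
        heapq.heappush(events, (done, node))    # launcher becomes idle again
        heapq.heappush(events, (done, target))  # new node can start launching
    return makespan, parent


if __name__ == "__main__":
    n = 1000
    makespan, parent = simulate_deployment(n)
    print(f"work stealing: {makespan:.2f} time units")
    print(f"flat launch from the root alone: {n * 1.0:.2f} time units")
```

Because the set of launching nodes roughly doubles each round, the completion time in this model grows logarithmically with the number of nodes, much like a single-message broadcast over a tree, whereas a root that launches every node itself needs time linear in the number of nodes.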
Document type :
Theses
Networking and Internet Architecture [cs.NI]. Institut National Polytechnique de Grenoble - INPG, 2003. French


https://tel.archives-ouvertes.fr/tel-00004610
Contributor : Cyrille Martin
Submitted on : Tuesday, February 10, 2004 - 9:48:39 AM
Last modification on : Tuesday, February 10, 2004 - 9:48:39 AM

Identifiers

  • HAL Id : tel-00004610, version 1

Collections

INRIA | IMAG | UGA

Citation

Cyrille Martin. Déploiement et contrôle d'applications parallèles sur grappes de grandes tailles. Networking and Internet Architecture [cs.NI]. Institut National Polytechnique de Grenoble - INPG, 2003. French. <tel-00004610>

Metrics

Record views

123

Document downloads

224