Techniques de gestion des défaillances dans les grilles informatiques tolérantes aux fautes

Ndeye Massata Ndiaye 1
1 Regal - Large-Scale Distributed Systems and Applications
LIP6 - Laboratoire d'Informatique de Paris 6, Inria Paris-Rocquencourt
Abstract : Building computing grids is one of the major research topics in networked computer systems. The main goal of grid construction is to provide the concepts and system software components needed to aggregate computing resources (processors, memory, and network) within a data-processing grid, so as to eventually form a global IT infrastructure for simulation, data processing, and industrial process control. This infrastructure can potentially be used in all fields of scientific research, industrial research, and operational activities (new processes and products, instrumentation, etc.), as well as in the evolution of information systems, the web, and multimedia. Production-quality grids require mastery of reliability problems: enhanced security through better access control and better protection against attacks, fault tolerance, and failure prevention, all of which contribute to the safe operation of the grid infrastructure. In this thesis, we investigate automated fault management. The main objective is to mask failures as far as possible, ultimately making them transparent to applications, so that from the applications' point of view the grid infrastructure operates almost continuously. We have developed a new self-adaptive hierarchical algorithm to ensure fault tolerance in computational grids. This protocol relies on the hierarchical architecture of computing grids. In each cluster, we define a coordinator, called the leader, whose role is to coordinate within the cluster and to act as an intermediary between processes belonging to different clusters. To save the state of inter-cluster processes, the adaptive protocol uses sender-based pessimistic message logging. Inside a cluster, the protocol used depends on the frequency of messages. Above a maximum threshold on communication frequency, a non-blocking coordinated checkpointing protocol is used. If the number of messages in the cluster is low, messages are saved using the pessimistic message logging protocol.
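The protocol selection described in the abstract can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the names `Cluster`, `Protocol`, and `select_protocol`, the messages-per-second metric, and the threshold value are all assumptions introduced here. The abstract also does not say whether the threshold itself is inclusive; the sketch assumes it is.

```python
from dataclasses import dataclass
from enum import Enum


class Protocol(Enum):
    """The two fault-tolerance protocols named in the abstract."""
    PESSIMISTIC_LOGGING = "sender-based pessimistic message logging"
    COORDINATED_CHECKPOINT = "non-blocking coordinated checkpointing"


@dataclass
class Cluster:
    """Hypothetical cluster description (fields are assumptions)."""
    name: str
    leader: str                 # process acting as the cluster coordinator
    messages_per_second: float  # observed intra-cluster message frequency


def select_protocol(src: Cluster, dst: Cluster, threshold: float) -> Protocol:
    """Choose the protocol used to protect a message sent from src to dst."""
    if src.name != dst.name:
        # Inter-cluster traffic: always sender-based pessimistic logging.
        return Protocol.PESSIMISTIC_LOGGING
    if src.messages_per_second >= threshold:
        # Dense intra-cluster traffic: checkpoint the cluster as a whole
        # (assumption: the threshold is inclusive).
        return Protocol.COORDINATED_CHECKPOINT
    # Sparse intra-cluster traffic: log individual messages.
    return Protocol.PESSIMISTIC_LOGGING


if __name__ == "__main__":
    busy = Cluster("A", leader="a0", messages_per_second=500.0)
    quiet = Cluster("B", leader="b0", messages_per_second=5.0)
    for src, dst in [(busy, busy), (quiet, quiet), (busy, quiet)]:
        chosen = select_protocol(src, dst, threshold=100.0)
        print(f"{src.name} -> {dst.name}: {chosen.value}")
```

Keeping the decision in a single function makes the adaptive part explicit: a cluster switches protocol only when its measured message frequency crosses the threshold, while inter-cluster traffic is unaffected.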

Cited literature : 27 references
Contributor : Ndeye Ndiaye
Submitted on : Wednesday, January 15, 2014 - 8:45:29 PM
Last modification on : Friday, January 8, 2021 - 5:46:03 PM
Long-term archiving on : Saturday, April 8, 2017 - 5:02:15 PM


  • HAL Id : tel-00931839, version 1


Ndeye Massata Ndiaye. Techniques de gestion des défaillances dans les grilles informatiques tolérantes aux fautes. Other [cs.OH]. Université Pierre et Marie Curie - Paris VI, 2013. French. ⟨tel-00931839⟩


