
Tolérance aux pannes pour objets actifs asynchrones : modèle, protocole et expérimentations [Fault tolerance for asynchronous active objects: model, protocol and experiments]

Delbé Christian 1
1 OASIS - Active objects, semantics, Internet and security
CRISAM - Inria Sophia Antipolis - Méditerranée , Laboratoire I3S - COMRED - COMmunications, Réseaux, systèmes Embarqués et Distribués
Abstract : The main goal of this thesis is to define a rollback-recovery fault-tolerance protocol for ASP (Asynchronous Sequential Processes), a model of asynchronously communicating active objects, and for its Java implementation, ProActive. This work generalises the problem raised by the development of this protocol: we study the recovery of a distributed execution from an inconsistent global state. We then propose a checkpointing protocol, together with its implementation, that does not rely on consistent global states. Through realistic experiments with communicating distributed applications, we show that this solution is efficient in practice. A second, more general contribution to the problem of recovering from an inconsistent global state is the formal definition of P-consistency, a new recoverability condition based on the concept of promised event. This definition is part of an event-based formalism that can be applied to any system. In particular, by applying this formalism to the ASP model, we prove the correctness of our protocol by showing that every global state created during the execution is a recoverable state. Finally, we propose an extension of our protocol, and an implementation of it, adapted to the context of grid computing. This extension relies on the constitution of recovery groups during the deployment of the application. It allows stable storage to be distributed independently and confines the effects of a failure to the group concerned.
Contributor : Estelle Nivault
Submitted on : Friday, January 18, 2008 - 4:12:19 PM
Last modification on : Monday, October 12, 2020 - 10:30:21 AM
Long-term archiving on : Tuesday, April 13, 2010 - 11:11:51 PM


  • HAL Id : tel-00207953, version 1



Delbé Christian. Tolérance aux pannes pour objets actifs asynchrones : modèle, protocole et expérimentations. Réseaux et télécommunications [cs.NI]. Université Nice Sophia Antipolis, 2007. Français. ⟨tel-00207953⟩


