The Grid: Blueprint for a New Computing Infrastructure, 1998.
Fundamental Concepts of Dependability, Research Report N°1145, LAAS-CNRS, 2001.
Michel Raynal, « Déterminer un état global dans un système réparti », Annales des Télécommunications, vol.49, pp.7-8
Optimistic recovery in distributed systems, ACM Transactions on Computer Systems, vol.3, issue.3, pp.204-226, 1985.
DOI : 10.1145/3959.3962
A Survey of Rollback-Recovery Protocols in Message-Passing Systems, ACM Computing Surveys, vol.34, issue.3, pp.375-408, 2002.
Sender based message logging, Digest of Papers, FTCS-17, The Seventeenth Annual International Symposium on Fault-Tolerant Computing, pp.14-19, 1987.
Distributed snapshots: determining global states of distributed systems, ACM Transactions on Computer Systems, vol.3, issue.1, pp.63-75, 1985.
DOI : 10.1145/214451.214456
Checkpointing and Rollback-Recovery for Distributed Systems, Proceedings of 1986 ACM Fall joint computer conference, ACM '86, pp.1150-1158, 1986.
DOI : 10.1109/TSE.1987.232562
A timestamp-based checkpointing protocol for long-lived distributed computations, Proceedings Tenth Symposium on Reliable Distributed Systems, pp.12-20, 1991.
DOI : 10.1109/RELDIS.1991.145399
Rollback recovery in distributed systems using loosely synchronized clocks, IEEE Transactions on Parallel and Distributed Systems, vol.3, issue.2, pp.246-251, 1992.
DOI : 10.1109/71.127264
Message logging: pessimistic, optimistic, and causal, Proceedings of 15th International Conference on Distributed Computing Systems, 1995.
DOI : 10.1109/ICDCS.1995.500024
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.11.9997
Tolérance aux pannes pour objets actifs asynchrones - protocole, modèle et expérimentations, PhD thesis, Computer Science, 2007.
Checkpointing and Process Migration for MPI, Proceedings of the International Parallel Processing Symposium, pp.526-531, 1996.
Scalable causal Message Logging for Wide-Area Environments, Concurrency and Computation: Practice and Experience, pp.873-889, 2003.
Hybrid checkpointing for parallel applications in cluster federations, IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2004), pp.773-782, 2004.
DOI : 10.1109/CCGrid.2004.1336712
URL : https://hal.archives-ouvertes.fr/inria-00000991
Team-Based Message Logging: Preliminary Results, 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, 2010.
DOI : 10.1109/CCGRID.2010.110
URL : http://charm.cs.illinois.edu/newPapers/10-02/paper.pdf
Trading off logging overhead and coordinating overhead to achieve efficient rollback recovery, Concurrency and Computation: Practice and Experience, pp.819-853, 2009.
DOI : 10.1002/cpe.1364
O2P : un protocole à enregistrement de messages extrêmement optimiste, Rencontres Francophones du Parallélisme (RenPar18), 2008.
Transparent rollback-recovery with low overhead, limited rollback and fast output, IEEE Transactions on Computers, vol.41, issue.5, 1992.
Impact of Event Logger on Causal Message Logging Protocols for Fault Tolerant MPI, Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS 05), 2005.
Improving Message Logging Protocols Scalability through Distributed Event Logging, Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I, EuroPar'10, pp.511-522, 2010.
DOI : 10.1007/978-3-642-15277-1_49
URL : https://hal.archives-ouvertes.fr/inria-00526097
MPICH-V2, Proceedings of the 2003 ACM/IEEE conference on Supercomputing, SC '03, p.25, 2003.
DOI : 10.1145/1048935.1050176
Programmation des systèmes parallèles distribués : tolérance aux pannes, résilience et adaptabilité, PhD thesis, Computer Science: Systems and Software, 2006.
A distributed domino-effect free recovery algorithm, IEEE International Symposium on Reliability, Distributed Software, and Databases, 1984.
Necessary and sufficient conditions for consistent global snapshots, IEEE Transactions on Parallel and Distributed Systems, vol.6, issue.2, 1995.
Performance comparison of hierarchical checkpoint protocols grid, International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI), vol.5, pp.46-53, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00737201
Hierarchical composition of coordinated checkpoint with pessimistic message logging, 2012 2nd IEEE International Conference on Parallel, Distributed and Grid Computing, 2012.
DOI : 10.1109/PDGC.2012.6449916
URL : https://hal.archives-ouvertes.fr/hal-01282452