C. Kesselman and I. Foster, The Grid: Blueprint for a New Computing Infrastructure, 1998.

A. Avizienis, J. Laprie, and B. Randell, Fundamental Concepts of Dependability, Research Report N°1145, LAAS-CNRS, 2001.

J. Hélary, A. Mostefaoui, and M. Raynal, Déterminer un état global dans un système réparti, Annales des Télécommunications, vol.49, issue.7-8.

R. Strom and S. Yemini, Optimistic recovery in distributed systems, ACM Transactions on Computer Systems, vol.3, issue.3, pp.204-226, 1985.
DOI : 10.1145/3959.3962

E. N. Elnozahy, L. Alvisi, Y. Wang, and D. B. Johnson, A Survey of Rollback-Recovery Protocols in Message-Passing Systems, ACM Computing Surveys, vol.34, issue.3, pp.375-408, 2002.

D. B. Johnson and W. Zwaenepoel, Sender-based message logging, Digest of Papers, FTCS-17 The Seventeenth Annual International Symposium on Fault-Tolerant Computing, pp.14-19, 1987.

K. M. Chandy and L. Lamport, Distributed snapshots: determining global states of distributed systems, ACM Transactions on Computer Systems, vol.3, issue.1, pp.63-75, 1985.
DOI : 10.1145/214451.214456

R. Koo and S. Toueg, Checkpointing and Rollback-Recovery for Distributed Systems, Proceedings of 1986 ACM Fall joint computer conference, ACM '86, pp.1150-1158, 1986.
DOI : 10.1109/TSE.1987.232562

F. Cristian and F. Jahanian, A timestamp-based checkpointing protocol for long-lived distributed computations, Proceedings of the Tenth Symposium on Reliable Distributed Systems, pp.12-20, 1991.
DOI : 10.1109/RELDIS.1991.145399

Z. Tong, R. Y. Kain, and W. T. Tsai, Rollback recovery in distributed systems using loosely synchronized clocks, IEEE Transactions on Parallel and Distributed Systems, vol.3, issue.2, pp.246-251, 1992.
DOI : 10.1109/71.127264

L. Alvisi and K. Marzullo, Message logging: pessimistic, optimistic, and causal, Proceedings of 15th International Conference on Distributed Computing Systems, 1995.
DOI : 10.1109/ICDCS.1995.500024
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.11.9997

C. Delbé, Tolérance aux pannes pour objets actifs asynchrones - protocole, modèle et expérimentations, PhD thesis, Computer Science, 2007.

G. Stellner, CoCheck: Checkpointing and Process Migration for MPI, Proceedings of the International Parallel Processing Symposium, pp.526-531, 1996.

K. Bhatia, K. Marzullo, and L. Alvisi, Scalable Causal Message Logging for Wide-Area Environments, Concurrency and Computation: Practice and Experience, pp.873-889, 2003.

S. Monnet, C. Morin, and R. Badrinath, Hybrid checkpointing for parallel applications in cluster federations, IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2004), pp.773-782, 2004.
DOI : 10.1109/CCGrid.2004.1336712
URL : https://hal.archives-ouvertes.fr/inria-00000991

E. Meneses, C. L. Mendes, and L. V. Kale, Team-Based Message Logging: Preliminary Results, 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, 2010.
DOI : 10.1109/CCGRID.2010.110
URL : http://charm.cs.illinois.edu/newPapers/10-02/paper.pdf

J. Yang, K. F. Li, W. Li, and D. Zhang, Trading off logging overhead and coordinating overhead to achieve efficient rollback recovery, Concurrency and Computation: Practice and Experience, pp.819-853, 2009.
DOI : 10.1002/cpe.1364

T. Ropars and C. Morin, O2P : un protocole à enregistrement de messages extrêmement optimiste, Rencontres Francophones du Parallélisme (RenPar'18), 2008.

E. N. Elnozahy and W. Zwaenepoel, Manetho: Transparent rollback-recovery with low overhead, limited rollback and fast output commit, IEEE Transactions on Computers, vol.41, issue.5, 1992.

P. Lemarinier, A. Bouteiller, T. Herault, G. Krawezik, and F. Cappello, Impact of Event Logger on Causal Message Logging Protocols for Fault Tolerant MPI, Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS 05), 2005.

T. Ropars and C. Morin, Improving Message Logging Protocols Scalability through Distributed Event Logging, Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I, EuroPar'10, pp.511-522, 2010.
DOI : 10.1007/978-3-642-15277-1_49
URL : https://hal.archives-ouvertes.fr/inria-00526097

A. Bouteiller, F. Cappello, T. Herault, G. Krawezik, P. Lemarinier et al., MPICH-V2, Proceedings of the 2003 ACM/IEEE conference on Supercomputing, SC '03, p.25, 2003.
DOI : 10.1145/1048935.1050176

S. Jafar, Programmation des systèmes parallèles distribués : tolérance aux pannes, résilience et adaptabilité, PhD thesis, Computer Science: Systems and Software, 2006.

D. Briatico, A. Ciuffoletti, and L. Simoncini, A distributed domino-effect free recovery algorithm, IEEE International Symposium on Reliability, Distributed Software, and Databases, 1984.

R. H. B. Netzer and J. Xu, Necessary and sufficient conditions for consistent global snapshots, IEEE Transactions on Parallel and Distributed Systems, vol.6, issue.2, 1995.

N. M. Ndiaye, P. Sens, and O. Thiare, Performance comparison of hierarchical checkpoint protocols grid, International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI), vol.5, pp.46-53, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00737201

N. M. Ndiaye, P. Sens, and O. Thiare, Hierarchical composition of coordinated checkpoint with pessimistic message logging, 2012 2nd IEEE International Conference on Parallel, Distributed and Grid Computing, 2012.
DOI : 10.1109/PDGC.2012.6449916
URL : https://hal.archives-ouvertes.fr/hal-01282452