L. A. Barroso, J. Dean, and U. Hölzle, Web search for a planet: The Google cluster architecture, IEEE Micro, vol.23, issue.2, pp.22-28, 2003.

A. Simonet, G. Fedak, and M. Ripeanu, Active Data: A programming model to manage data life cycle across heterogeneous systems and infrastructures, Future Generation Computer Systems, vol.53, 2015.
DOI : 10.1016/j.future.2015.05.015

URL : https://hal.archives-ouvertes.fr/hal-01241491

G. Antoniu, J. Bigot, C. Blanchet, L. Bougé, F. Briant et al., Scalable data management for map-reduce-based data-intensive applications: a view for cloud and hybrid infrastructures, International Journal of Cloud Computing, vol.2, issue.2/3, pp.150-170, 2013.
DOI : 10.1504/IJCC.2013.055265

A. Simonet, K. Chard, G. Fedak, and I. Foster, Using Active Data to Provide Smart Data Surveillance to E-Science Users, 2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, 2015.
DOI : 10.1109/PDP.2015.76

URL : https://hal.archives-ouvertes.fr/hal-01256207

G. Antoniu, J. Bigot, C. Blanchet, L. Bougé, F. Briant et al., Towards Scalable Data Management for Map-Reduce-based Data-Intensive Applications on Cloud and Hybrid Infrastructures, 1st International IBM Cloud Academy Conference - ICA CON 2012, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00767029

A. Simonet, G. Fedak, and M. Ripeanu, Active data, Proceedings of the 8th Parallel Data Storage Workshop, PDSW '13, pp.39-44, 2013.
DOI : 10.1145/2538542.2538566

URL : https://hal.archives-ouvertes.fr/hal-00921080

M. Chen, S. Mao, and Y. Liu, Big Data: A Survey, Mobile Networks and Applications, vol.19, issue.2, pp.171-209, 2014.
DOI : 10.1007/s11036-013-0489-0

V. Mayer-Schönberger and K. Cukier, Big Data: A revolution that will transform how we live, work, and think, 2013.

D. Boyd and K. Crawford, Critical questions for big data, Information, Communication & Society, vol.15, issue.5, pp.662-679, 2012.
DOI : 10.1086/517842

L. Manovich, Trending: the promises and the challenges of big social data, chapter 27, pp.7-13

L. Atzori, A. Iera, and G. Morabito, The Internet of Things: A survey, Computer Networks, vol.54, issue.15, pp.2787-2805, 2010.
DOI : 10.1016/j.comnet.2010.05.010

D. Miorandi, S. Sicari, F. De Pellegrini, and I. Chlamtac, Internet of things: Vision, applications and research challenges, Ad Hoc Networks, vol.10, issue.7, pp.1497-1516, 2012.
DOI : 10.1016/j.adhoc.2012.02.016

W. Gropp, E. Lusk, and T. L. Sterling, Beowulf cluster computing with Linux, 2002.

C. Guo, G. Lu, D. Li, H. Wu, X. Zhang et al., BCube: A high performance, server-centric network architecture for modular data centers, Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication, SIGCOMM '09, pp.63-74, 2009.

A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, and S. Tuecke, The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets, Journal of Network and Computer Applications, vol.23, issue.3, pp.187-200, 2000.
DOI : 10.1006/jnca.2000.0110

S. Venugopal, R. Buyya, and K. Ramamohanarao, A taxonomy of Data Grids for distributed data sharing, management, and processing, ACM Computing Surveys, vol.38, issue.1, 2006.
DOI : 10.1145/1132952.1132955

P. Z. Kunszt and L. P. Guy, The Open Grid Services Architecture, and Data Grids, pp.385-407, 2003.

K. Ranganathan and I. Foster, Identifying Dynamic Replication Strategies for a High-Performance Data Grid, Grid Computing - GRID 2001, pp.75-86, 2001.
DOI : 10.1007/3-540-45644-9_8

I. Foster, C. Kesselman, and S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, International Journal of High Performance Computing Applications, vol.15, issue.3, pp.200-222, 2001.
DOI : 10.1177/109434200101500302

F. Berman, G. Fox, and A. J. G. Hey, Grid Computing: Making the Global Infrastructure a Reality, 2003.
DOI : 10.1002/0470867167

I. Foster and C. Kesselman, The Grid 2: Blueprint for a new computing infrastructure, 2003.

W. Hoschek, J. Jaen-Martinez, A. Samar, H. Stockinger, and K. Stockinger, Data Management in an International Data Grid Project, Grid Computing - GRID 2000, pp.77-90, 2000.
DOI : 10.1007/3-540-44444-0_8

D. Bosio, J. Casey, A. Frohner, L. Guy, P. Kunszt et al., Next-generation EU DataGrid data management services, ArXiv Physics e-prints, 2003.

I. Foster, Y. Zhao, I. Raicu, and S. Lu, Cloud Computing and Grid Computing 360-Degree Compared, 2008 Grid Computing Environments Workshop, pp.1-10, 2008.
DOI : 10.1109/GCE.2008.4738445

URL : http://arxiv.org/abs/0901.0131

B. P. Rimal, E. Choi, and I. Lumb, A taxonomy and survey of cloud computing systems, Proceedings of the 5th International Joint Conference on INC, IMS and IDC, NCM '09, pp.44-51, 2009.

T. Dillon, C. Wu, and E. Chang, Cloud Computing: Issues and Challenges, 2010 24th IEEE International Conference on Advanced Information Networking and Applications, pp.27-33, 2010.
DOI : 10.1109/AINA.2010.187

Y. Wei and M. B. Blake, Service-Oriented Computing and Cloud Computing: Challenges and Opportunities, IEEE Internet Computing, vol.14, issue.6, pp.72-75, 2010.
DOI : 10.1109/MIC.2010.147

M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz et al., A view of cloud computing, Communications of the ACM, vol.53, issue.4, pp.50-58, 2010.
DOI : 10.1145/1721654.1721672

D. P. Anderson, J. Cobb, E. Korpela, M. Lebofsky, and D. Werthimer, SETI@home: an experiment in public-resource computing, Communications of the ACM, vol.45, issue.11, pp.56-61, 2002.
DOI : 10.1145/581571.581573

D. P. Anderson, BOINC: a system for public-resource computing and storage, Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing, pp.4-10, 2004.

H. Abbes, C. Cerin, and M. Jemni, BonjourGrid: Orchestration of multi-instances of grid middlewares on institutional Desktop Grids, 2009 IEEE International Symposium on Parallel & Distributed Processing, pp.1-8, 2009.
DOI : 10.1109/IPDPS.2009.5161140

H. Lin, J. Archuleta, X. Ma, W. Feng, Z. Zhang et al., MOON, Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, HPDC '10, pp.95-106, 2010.
DOI : 10.1145/1851476.1851489

URL : https://hal.archives-ouvertes.fr/in2p3-00024076

S. Delamare, G. Fedak, D. Kondo, and O. Lodygensky, SpeQuloS, Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing, HPDC '12, pp.173-186, 2012.
DOI : 10.1145/2287076.2287106

URL : https://hal.archives-ouvertes.fr/hal-00757074

T. D. Thanh, S. Mohan, E. Choi, S. Kim, and P. Kim, A taxonomy and survey on distributed file systems, Proceedings of the 4th International Conference on Networked Computing and Advanced Information Management, NCM '08, pp.144-149, 2008.

K. McKusick and S. Quinlan, GFS, Communications of the ACM, vol.53, issue.3, pp.42-49, 2010.
DOI : 10.1145/1666420.1666439

S. A. Weil, S. A. Brandt, E. L. Miller, D. D. E. Long, and C. Maltzahn, Ceph: A scalable, high-performance distributed file system, Proceedings of the 7th Symposium on Operating Systems Design and Implementation, OSDI'06, pp.307-320, 2006.

S. Ghemawat, H. Gobioff, and S. Leung, The Google file system, Proceedings of the 19th ACM Symposium on Operating Systems Principles, SOSP'03, pp.29-43, 2003.

K. Shvachko, K. Hairong, S. Radia, and R. Chansler, The Hadoop Distributed File System, 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), pp.1-10, 2010.
DOI : 10.1109/MSST.2010.5496972

P. Schwan, Lustre: Building a File System for 1,000-node Clusters, Proceedings of the Linux Symposium, 2003.

D. Thain, C. Moretti, and J. Hemmes, Chirp: a practical global filesystem for cluster and Grid computing, Journal of Grid Computing, vol.7, issue.1, pp.51-72, 2009.
DOI : 10.1007/s10723-008-9100-5

B. Nicolae, G. Antoniu, L. Bougé, D. Moise, and A. Carpen-Amarie, BlobSeer: Next-generation data management for large scale infrastructures, Journal of Parallel and Distributed Computing, vol.71, issue.2, pp.169-184, 2011.
DOI : 10.1016/j.jpdc.2010.08.004

URL : https://hal.archives-ouvertes.fr/inria-00511414

S. Al-Kiswany, A. Gharaibeh, and M. Ripeanu, The case for a versatile storage system, ACM SIGOPS Operating Systems Review, vol.44, issue.1, pp.10-14, 2010.
DOI : 10.1145/1740390.1740394

F. Schmuck and R. Haskin, GPFS: A shared-disk file system for large computing clusters, Proceedings of the 1st USENIX Conference on File and Storage Technologies, FAST'02, 2002.

L. B. Costa, H. Yang, E. Vairavanathan, A. Barros, K. Maheshwari et al., The Case for Workflow-Aware Storage: An Opportunity Study, Journal of Grid Computing, vol.13, issue.1, pp.95-113, 2015.
DOI : 10.1007/s10723-014-9307-6

URL : https://hal.archives-ouvertes.fr/hal-01108605

L. B. Costa, S. Al-Kiswany, H. Yang, and M. Ripeanu, Supporting storage configuration for I/O intensive workflows, Proceedings of the 28th ACM International Conference on Supercomputing, ICS '14, pp.191-200, 2014.

J. Ousterhout, P. Agrawal, D. Erickson, C. Kozyrakis, J. Leverich et al., The case for RAMClouds, ACM SIGOPS Operating Systems Review, vol.43, issue.4, pp.92-105, 2010.
DOI : 10.1145/1713254.1713276

B. Fitzpatrick, Distributed caching with Memcached, Linux Journal, issue.124, p.5, 2004.

D. Dewitt and J. Gray, Parallel database systems: the future of high performance database systems, Communications of the ACM, vol.35, issue.6, pp.85-98, 1992.
DOI : 10.1145/129888.129894

D. Kossmann, The state of the art in distributed query processing, ACM Computing Surveys, vol.32, issue.4, pp.422-469, 2000.
DOI : 10.1145/371578.371598

J. Gray and A. Reuter, Transaction processing: concepts and techniques, 1993.

M. Wiesmann, F. Pedone, A. Schiper, B. Kemme, and G. Alonso, Understanding replication in databases and distributed systems, Proceedings 20th IEEE International Conference on Distributed Computing Systems, pp.464-474, 2000.
DOI : 10.1109/ICDCS.2000.840959

B. G. Tudorica and C. Bucur, A comparison between several NoSQL databases with comments and notes, 2011 RoEduNet International Conference 10th Edition: Networking in Education and Research, pp.1-5, 2011.
DOI : 10.1109/RoEduNet.2011.5993686

K. Chodorow, MongoDB: the definitive guide, 2013.

J. C. Anderson, J. Lehnardt, and N. Slater, CouchDB: the definitive guide, 2010.

F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach et al., Bigtable, ACM Transactions on Computer Systems, vol.26, issue.2, pp.4-5, 2008.
DOI : 10.1145/1365815.1365816

M. N. Vora, Hadoop-HBase for large-scale data, Proceedings of the 2011 International Conference on Computer Science and Network Technology, pp.601-605, 2011.

A. Rajasekar, R. Moore, C. Hou, C. A. Lee, R. Marciano et al., iRODS Primer: Integrated Rule-Oriented Data System, Synthesis Lectures on Information Concepts, Retrieval, and Services, 2010.
DOI : 10.2200/S00233ED1V01Y200912ICR012

B. Wei, G. Fedak, and F. Cappello, Collaborative data distribution with BitTorrent for computational desktop grids, The 4th International Symposium on Parallel and Distributed Computing, ISPDC 2005, pp.250-257, 2005.
URL : https://hal.archives-ouvertes.fr/inria-00001041

G. Fedak, H. He, and F. Cappello, BitDew: A programmable environment for large-scale data management and distribution, 2008 SC, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-12, 2008.
DOI : 10.1109/SC.2008.5213939

URL : https://hal.archives-ouvertes.fr/inria-00216126

T. Goodale, S. Jha, H. Kaiser, T. Kielmann, P. Kleijer et al., SAGA: A Simple API for Grid Applications. High-level application programming on the Grid, Computational Methods in Science and Technology, vol.12, issue.1, pp.7-20, 2006.
DOI : 10.12921/cmst.2006.12.01.07-20

T. Kosar and M. Livny, Stork: making data placement a first class citizen in the grid, 24th International Conference on Distributed Computing Systems, 2004. Proceedings., pp.342-349, 2004.
DOI : 10.1109/ICDCS.2004.1281599

J. Xu and R. Figueiredo, GatorShare, Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, HPDC '10, pp.776-786, 2010.
DOI : 10.1145/1851476.1851588

I. Foster, Globus Online: Accelerating and Democratizing Science through Cloud-Based Services, IEEE Internet Computing, vol.15, issue.3, pp.70-73, 2011.
DOI : 10.1109/MIC.2011.64

I. Mandrichenko, W. Allcock, and T. Perelmutov, GridFTP v2 protocol description.

K. Asanovic, R. Bodik, J. Demmel, T. Keaveny, K. Keutzer et al., A view of the parallel computing landscape, Communications of the ACM, vol.52, issue.10, pp.56-67, 2009.
DOI : 10.1145/1562764.1562783

H. Kasim, V. March, R. Zhang, and S. See, Survey on Parallel Programming Model, Network and Parallel Computing, pp.266-275, 2008.
DOI : 10.1049/ip-cds:20040434

J. Dean and S. Ghemawat, MapReduce, Communications of the ACM, vol.51, issue.1, pp.107-113, 2008.
DOI : 10.1145/1327452.1327492

K. Lee, Y. Lee, H. Choi, Y. D. Chung, and B. Moon, Parallel data processing with MapReduce, ACM SIGMOD Record, vol.40, issue.4, pp.11-20, 2012.
DOI : 10.1145/2094114.2094118

C. Doulkeridis and K. Nørvåg, A survey of large-scale analytical query processing in MapReduce, The VLDB Journal, vol.23, issue.3, pp.355-380, 2014.
DOI : 10.1007/s00778-013-0319-9

T. Condie, N. Conway, P. Alvaro, J. M. Hellerstein, K. Elmeleegy et al., MapReduce online, Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation, 2010.

J. Ekanayake, H. Li, B. Zhang, T. Gunarathne, S. Bae et al., Twister, Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, HPDC '10, pp.810-818, 2010.
DOI : 10.1145/1851476.1851593

Y. Bu, B. Howe, M. Balazinska, and M. D. Ernst, HaLoop, Proceedings of the VLDB Endowment, pp.285-296, 2010.
DOI : 10.14778/1920841.1920881

C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins, Pig Latin, Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, SIGMOD '08, pp.1099-1110, 2008.
DOI : 10.1145/1376616.1376726

R. Lee, T. Luo, Y. Huai, F. Wang, Y. He et al., YSmart: Yet Another SQL-to-MapReduce Translator, 2011 31st International Conference on Distributed Computing Systems, pp.25-36, 2011.
DOI : 10.1109/ICDCS.2011.26

A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka et al., Hive, Proceedings of the VLDB Endowment, pp.1626-1629, 2009.
DOI : 10.14778/1687553.1687609

J. Lin, MapReduce is good enough? If all you have is a hammer, throw away everything that's not a nail!, Big Data, vol.1, issue.1, pp.28-37, 2013.

A. Pavlo, E. Paulson, A. Rasin, D. J. Abadi, D. J. DeWitt et al., A comparison of approaches to large-scale data analysis, Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, SIGMOD '09, pp.165-178, 2009.

C. Moretti, J. Bulosan, D. Thain, and P. J. Flynn, All-Pairs: An abstraction for data-intensive cloud computing, Proceedings of the International Symposium on Parallel and Distributed Processing, pp.1-11, 2008.

M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, Spark: Cluster computing with working sets, Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, pp.10-10, 2010.

K. Taura, K. Kaneda, T. Endo, and A. Yonezawa, Phoenix: a parallel programming model for accommodating dynamically joining/leaving resources, Proceedings of the Ninth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '03, pp.216-229, 2003.

G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn et al., Pregel: A system for large-scale graph processing, Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, SIGMOD '10, pp.135-146, 2010.

Y. Low, J. E. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin et al., GraphLab: A new framework for parallel machine learning, Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence, 2010.

Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola et al., Distributed GraphLab, Proceedings of the VLDB Endowment, pp.716-727, 2012.
DOI : 10.14778/2212351.2212354

J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin, PowerGraph: Distributed graph-parallel computation on natural graphs, Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI'12, pp.17-30, 2012.

R. Stephens, A survey of stream processing, Acta Informatica, vol.34, issue.7, pp.491-541, 1997.
DOI : 10.1007/s002360050095

B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom, Models and issues in data stream systems, Proceedings of the Twenty-First ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS '02, pp.1-16, 2002.
DOI : 10.1145/543613.543615

D. J. Abadi, D. Carney, U. Çetintemel, M. Cherniack, C. Convey et al., Aurora: a new model and architecture for data stream management, The VLDB Journal, vol.12, issue.2, pp.120-139, 2003.
DOI : 10.1007/s00778-003-0095-z

S. Chandrasekaran, O. Cooper, A. Deshpande, M. J. Franklin, J. M. Hellerstein et al., TelegraphCQ, Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, SIGMOD '03, pp.668-668, 2003.
DOI : 10.1145/872757.872857

S. Madden and M. J. Franklin, Fjording the stream: an architecture for queries over streaming sensor data, Proceedings 18th International Conference on Data Engineering, pp.555-566, 2002.
DOI : 10.1109/ICDE.2002.994774

R. Motwani, J. Widom, A. Arasu, B. Babcock, S. Babu et al., Query processing, approximation, and resource management in a data stream management system, 2002.

J. Hwang, M. Balazinska, A. Rasin, U. Çetintemel, M. Stonebraker et al., High-availability algorithms for distributed stream processing, Proceedings of the 21st International Conference on Data Engineering, ICDE 2005, pp.779-790, 2005.

D. Logothetis, C. Olston, B. Reed, K. C. Webb, and K. Yocum, Stateful bulk processing for incremental analytics, Proceedings of the 1st ACM symposium on Cloud computing, SoCC '10, pp.51-62, 2010.
DOI : 10.1145/1807128.1807138

S. Burckhardt, D. Leijen, C. Sadowski, J. Yi, and T. Ball, Two for the price of one, ACM SIGPLAN Notices, vol.46, issue.10, pp.427-444, 2011.
DOI : 10.1145/2076021.2048101

D. Peng and F. Dabek, Large-scale incremental processing using distributed transactions and notifications, Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, OSDI'10, pp.1-15, 2010.

S. Venkataraman, I. Roy, A. AuYoung, and R. S. Schreiber, Using R for iterative and incremental processing, Proceedings of the 4th USENIX Conference on Hot Topics in Cloud Computing, HotCloud'12, pp.11-11, 2012.

R. Ihaka and R. Gentleman, R: A language for data analysis and graphics, Journal of Computational and Graphical Statistics, vol.5, issue.3, pp.299-314, 1996.

P. Bhatotia, A. Wieder, R. Rodrigues, U. A. Acar, and R. Pasquini, Incoop, Proceedings of the 2nd ACM Symposium on Cloud Computing, SOCC '11, pp.1-7, 2011.
DOI : 10.1145/2038916.2038923

P. Bhatotia, A. Wieder, İ. E. Akkuş, R. Rodrigues, and U. A. Acar, Large-scale incremental data processing with change propagation, Proceedings of the 3rd USENIX Conference on Hot Topics in Cloud Computing, pp.18-18, 2011.

D. P. Bertsekas and J. N. Tsitsiklis, Some aspects of parallel and distributed iterative algorithms - a survey, Automatica, vol.27, issue.1, pp.3-21, 1991.

J. Ekanayake, S. Pallickara, and G. Fox, MapReduce for Data Intensive Scientific Analyses, 2008 IEEE Fourth International Conference on eScience, pp.277-284, 2008.
DOI : 10.1109/eScience.2008.59

B. Li, E. Mazur, Y. Diao, A. McGregor, and P. Shenoy, A platform for scalable one-pass analytics using MapReduce, Proceedings of the 2011 International Conference on Management of Data, SIGMOD '11, pp.985-996, 2011.
DOI : 10.1145/1989323.1989426

M. Wilde, M. Hategan, J. M. Wozniak, B. Clifford, D. S. Katz et al., Swift: A language for distributed parallel scripting, Parallel Computing, vol.37, issue.9, pp.633-652, 2011.
DOI : 10.1016/j.parco.2011.05.005

C. C. Lian, F. Tang, P. Issac, and A. Krishnan, GEL: Grid execution language, Journal of Parallel and Distributed Computing, vol.65, issue.7, pp.857-869, 2005.
DOI : 10.1016/j.jpdc.2005.03.002

D. Thain, T. Tannenbaum, and M. Livny, Distributed computing in practice: The Condor experience, Concurrency and Computation: Practice and Experience, vol.17, issue.2-4, pp.323-356, 2005.

E. Deelman, K. Vahi, G. Juve, M. Rynge, S. Callaghan et al., Pegasus, a workflow management system for science automation, Future Generation Computer Systems, vol.46, pp.17-35, 2015.
DOI : 10.1016/j.future.2014.10.008

M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, Dryad: Distributed data-parallel programs from sequential building blocks, Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems, EuroSys '07, pp.59-72, 2007.

Y. L. Simmhan, B. Plale, and D. Gannon, A survey of data provenance techniques, p.20, 2005.

K.-K. Muniswamy-Reddy, P. Macko, and M. Seltzer, Provenance for the cloud, Proceedings of the 8th USENIX Conference on File and Storage Technologies, FAST'10, pp.15-29, 2010.

L. Moreau, P. Groth, S. Miles, J. Vazquez-salceda, J. Ibbotson et al., The provenance of electronic data, Communications of the ACM, vol.51, issue.4, pp.52-58, 2008.
DOI : 10.1145/1330311.1330323

L. Moreau, J. Freire, J. Futrelle, R. McGrath, J. Myers et al., The open provenance model, 2007.

L. Moreau, B. Plale, S. Miles, C. Goble, P. Missier et al., The open provenance model (v1.01), 2008.

L. Moreau, B. Clifford, J. Freire, Y. Gil, P. Groth et al., The open provenance model core specification (v1.1), Future Generation Computer Systems, vol.27, issue.6, pp.743-756, 2011.

L. Moreau, J. Freire, J. Futrelle, R. E. McGrath, J. Myers et al., The Open Provenance Model: An Overview, Provenance and Annotation of Data and Processes, pp.323-326, 2008.
DOI : 10.1007/978-3-540-89965-5_31

Y. Simmhan, P. Groth, and L. Moreau, Special Section: The third provenance challenge on using the open provenance model for interoperability, Future Generation Computer Systems, vol.27, issue.6, pp.737-742, 2011.
DOI : 10.1016/j.future.2010.11.020

P. Groth and L. Moreau, Representing distributed systems using the Open Provenance Model, Future Generation Computer Systems, vol.27, issue.6, pp.757-765, 2011.
DOI : 10.1016/j.future.2010.10.001

N. Kwasnikowska and J. Van den Bussche, Mapping the NRC Dataflow Model to the Open Provenance Model, Provenance and Annotation of Data and Processes, pp.3-16, 2008.
DOI : 10.1007/978-3-540-89965-5_3

Y. Simmhan and R. Barga, Analysis of approaches for supporting the Open Provenance Model: A case study of the Trident workflow workbench, Future Generation Computer Systems, vol.27, issue.6, pp.790-796, 2011.
DOI : 10.1016/j.future.2010.10.005

P. Buneman, A. Chapman, and J. Cheney, Provenance management in curated databases, Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, SIGMOD '06, pp.539-550, 2006.
DOI : 10.1145/1142473.1142534

R. D. Stevens, A. J. Robinson, and C. A. Goble, myGrid: personalised bioinformatics on the information grid, Bioinformatics, vol.19, issue.Suppl 1, pp.302-304, 2003.
DOI : 10.1093/bioinformatics/btg1041

P. Missier, S. S. Sahoo, J. Zhao, C. Goble, and A. Sheth, Janus: From Workflows to Semantic Provenance and Linked Open Data, Provenance and Annotation of Data and Processes, pp.129-141, 2010.
DOI : 10.1007/978-3-540-30475-3_8

S. Miles, P. Groth, M. Branco, and L. Moreau, The Requirements of Using Provenance in e-Science Experiments, Journal of Grid Computing, vol.5, issue.3, pp.1-25, 2007.
DOI : 10.1007/s10723-006-9055-3

P. Groth, M. Luck, and L. Moreau, A Protocol for Recording Provenance in Service-Oriented Grids, Principles of Distributed Systems, pp.124-139, 2005.
DOI : 10.1007/11516798_9

B. Cao, B. Plale, G. Subramanian, E. Robertson, and Y. Simmhan, Provenance Information Model of Karma Version 3, 2009 Congress on Services, I, pp.348-351, 2009.
DOI : 10.1109/SERVICES-I.2009.54

I. Foster, J. Vöckler, M. Wilde, and Y. Zhao, Chimera: a virtual data system for representing, querying, and automating data derivation, Proceedings 14th International Conference on Scientific and Statistical Database Management, pp.37-46, 2002.
DOI : 10.1109/SSDM.2002.1029704

U. Braun, S. Garfinkel, D. A. Holland, K.-K. Muniswamy-Reddy, and M. I. Seltzer, Issues in Automatic Provenance Collection, Provenance and Annotation of Data, pp.171-183, 2006.
DOI : 10.1007/11890850_18

K.-K. Muniswamy-Reddy, D. A. Holland, U. Braun, and M. Seltzer, Provenance-aware storage systems, Proceedings of the 2006 USENIX Annual Technical Conference, USENIX, pp.43-56, 2006.

A. P. Bhardwaj, S. Bhattacherjee, A. Chavan, A. Deshpande, A. J. Elmore et al., DataHub: Collaborative data science & dataset version management at scale, CoRR, 2014.

Y. Demchenko, P. Grosso, C. de Laat, and P. Membrey, Addressing big data issues in Scientific Data Infrastructure, 2013 International Conference on Collaboration Technologies and Systems (CTS), pp.48-55, 2013.
DOI : 10.1109/CTS.2013.6567203

T. Ho and D. Abramson, Active Data: Supporting the Grid Data Life Cycle, Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07), pp.39-46, 2007.
DOI : 10.1109/CCGRID.2007.16

L. Ramakrishnan, D. Ghoshal, V. Hendrix, E. Feller, P. Mantha et al., Storage and Data Life Cycle Management in Cloud Environments with FRIEDA, Cloud Computing for Data Intensive Applications, 2015.
DOI : 10.1007/978-1-4939-1905-5_15

J. L. Peterson, Petri Nets, ACM Computing Surveys, vol.9, issue.3, pp.223-252, 1977.
DOI : 10.1145/356698.356702

T. Murata, Petri nets: Properties, analysis and applications, Proceedings of the IEEE, vol.77, issue.4, pp.541-580, 1989.
DOI : 10.1109/5.24143

W. M. P. van der Aalst, The application of Petri nets to workflow management, Journal of Circuits, Systems and Computers, vol.08, issue.01, pp.21-66, 1998.
DOI : 10.1142/S0218126698000043

C. V. Ramamoorthy and G. S. Ho, Performance evaluation of asynchronous concurrent systems using Petri nets, IEEE Transactions on Software Engineering, vol.SE-6, issue.5, pp.440-449, 1980.

K. Jensen, Coloured Petri Nets, Petri Nets: Central Models and Their Properties, pp.248-299, 1987.
DOI : 10.1007/978-3-540-47919-2_10

P. T. Eugster, P. A. Felber, R. Guerraoui, and A.-M. Kermarrec, The many faces of publish/subscribe, ACM Computing Surveys, vol.35, issue.2, pp.114-131, 2003.

R. Love, Kernel korner: intro to inotify, Linux Journal, issue.139, p.8, 2005.

R. Bolze, F. Cappello, E. Caron, M. Daydé, F. Desprez et al., Grid'5000: A large scale highly reconfigurable experimental grid testbed, International Journal of High Performance Computing Applications, 2006.

B. Tang, M. Moca, S. Chevalier, H. He, and G. Fedak, Towards MapReduce for Desktop Grid Computing, 2010 International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp.193-200, 2010.
DOI : 10.1109/3PGCIC.2010.33

URL : https://hal.archives-ouvertes.fr/hal-00687553

M. Gavish and D. Donoho, A Universal Identifier for Computational Results, Proceedings of the International Conference on Computational Science, pp.637-647, 2011.
DOI : 10.1016/j.procs.2011.04.067