D. Laney, 3D Data Management: Controlling Data Volume, Velocity and Variety, META Group Research Note, vol.6, 2001.

J. Bigot, Du support générique d'opérateurs de composition dans les modèles de composants logiciels, application au calcul scientifique, 2010.

J. Berlinska and M. Drozdowski, Scheduling divisible MapReduce computations, Journal of Parallel and Distributed Computing, vol.71, issue.3, pp.450-459, 2010.
DOI : 10.1016/j.jpdc.2010.12.004

G. Antoniu, J. Bigot, C. Blanchet, L. Bougé, F. Briant et al., Scalable data management for map-reduce-based data-intensive applications: a view for cloud and hybrid infrastructures, International Journal of Cloud Computing, vol.2, issue.2/3, pp.150-170, 2013.
DOI : 10.1504/IJCC.2013.055265

G. Antoniu, J. Bigot, C. Blanchet, L. Bougé, F. Briant et al., Towards Scalable Data Management for Map-Reduce-based Data-Intensive Applications on Cloud and Hybrid Infrastructures, 1st International IBM Cloud Academy Conference - ICA CON 2012, Research Triangle Park, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00767029

S. Gault and C. Perez, Dynamic Scheduling of MapReduce Shuffle under Bandwidth Constraints, Europar 2014 Workshop Proceedings, 2014.
DOI : 10.1007/978-3-319-14325-5_11

URL : https://hal.archives-ouvertes.fr/hal-01254055

S. Gault, Ordonnancement Dynamique des Transferts dans MapReduce sous Contrainte de Bande Passante, ComPAS'13 / RenPar'21 - 21es Rencontres francophones du Parallélisme, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00820361

F. Desprez, S. Gault, and F. Suter, Scheduling/Data Management Heuristics
URL : https://hal.archives-ouvertes.fr/hal-00759546

S. Gault and F. Desprez, Dynamic Scheduling of MapReduce Shuffle under Bandwidth Constraints
DOI : 10.1007/978-3-319-14325-5_11

URL : https://hal.archives-ouvertes.fr/hal-01254055

J. von Neumann, First draft of a report on the EDVAC, IEEE Annals of the History of Computing, vol.15, issue.4, 1993.
DOI : 10.1109/85.238389

C. E. Leiserson, Fat-trees: Universal networks for hardware-efficient supercomputing, IEEE Transactions on Computers, vol.C-34, issue.10, pp.892-901, 1985.
DOI : 10.1109/TC.1985.6312192

D. P. Anderson, BOINC: A System for Public-Resource Computing and Storage, Fifth IEEE/ACM International Workshop on Grid Computing, pp.4-10, 2004.
DOI : 10.1109/GRID.2004.14

K. Shvachko, H. Kuang, S. Radia, and R. Chansler, The Hadoop Distributed File System, 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), pp.1-10, 2010.
DOI : 10.1109/MSST.2010.5496972

B. Nicolae, G. Antoniu, L. Bougé, and D. Moise, BlobSeer: Next-generation data management for large scale infrastructures, Journal of Parallel and Distributed Computing, vol.71, issue.2, pp.168-184, 2010.
DOI : 10.1016/j.jpdc.2010.08.004

URL : https://hal.archives-ouvertes.fr/inria-00511414

C. Szyperski, Component Software: Beyond Object-Oriented Programming, 2002.

J. Bigot, Z. Hou, C. Pérez, and V. Pichon, A low level component model easing performance portability of HPC applications, Computing, vol.4, issue.5, pp.1115-1130, 2014.
DOI : 10.1007/s00607-013-0368-3

URL : https://hal.archives-ouvertes.fr/hal-00911231

J. Bigot and C. Pérez, On High Performance Composition Operators in Component Models, High Performance Scientific Computing with Special Emphasis on Current Capabilities and Future Perspectives, Advances in Parallel Computing, pp.182-201, 2011.

Message Passing Interface Forum, MPI: A Message-Passing Interface Standard, 1994.

C. L. Lawson, R. J. Hanson, D. R. Kincaid, and F. T. Krogh, Basic Linear Algebra Subprograms for Fortran Usage, ACM Transactions on Mathematical Software, vol.5, issue.3, pp.308-323, 1979.
DOI : 10.1145/355841.355847

J. Planas, R. M. Badia, E. Ayguadé, and J. Labarta, Hierarchical Task-Based Programming with StarSs, International Journal of High Performance Computing Applications, vol.23, issue.3, pp.284-299, 2009.

R. Dolbeau, S. Bihan, and F. Bodin, HMPP™: A Hybrid Multi-core Parallel Programming Environment, First Workshop on General Purpose Processing on Graphics Processing Units, 2007.

OpenACC Working Group, The OpenACC Application Programming Interface, 2011.

C. Morin, R. Lottiaux, G. Vallée, P. Gallard, G. Utard et al., Kerrighed: A Single System Image Cluster Operating System for High Performance Computing, Euro-Par 2003 Parallel Processing, pp.1291-1294, 2003.
DOI : 10.1007/978-3-540-45209-6_175

URL : https://hal.archives-ouvertes.fr/hal-01271227

A. Barak, S. Guday, and R. G. Wheeler, The MOSIX Distributed Operating System: Load Balancing for UNIX, 1993.
DOI : 10.1007/3-540-56663-5

G. Staples, TORQUE Resource Manager, Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, SC '06, 2006.

C. Morin, XtreemOS: A Grid Operating System Making your Computer Ready for Participating in Virtual Organizations, 10th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing (ISORC'07), pp.393-402, 2007.
URL : https://hal.archives-ouvertes.fr/hal-01271216

P. Mell and T. Grance, The NIST Definition of Cloud Computing, 2011.
DOI : 10.6028/NIST.SP.800-145

F. Cappello, F. Desprez, M. Dayde, E. Jeannot, Y. Jegou et al., Grid'5000: A Large Scale, Reconfigurable, Controlable and Monitorable Grid Platform, 6th IEEE/ACM International Workshop on Grid Computing - GRID 2005, 2005.
URL : https://hal.archives-ouvertes.fr/inria-00000284

S. Ibrahim, H. Jin, L. Lu, B. He, G. Antoniu et al., Maestro: Replica-Aware Map Scheduling for MapReduce, 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), pp.2012-2024, 2012.
DOI : 10.1109/CCGrid.2012.122

URL : https://hal.archives-ouvertes.fr/hal-00670813

M. Zaharia, D. Borthakur, J. S. Sarma, K. Elmeleegy, S. Shenker et al., Job Scheduling for Multi-User MapReduce Clusters, Technical Report UCB/EECS-2009-55, University of California, Berkeley, 2009.

M. Zaharia and D. Borthakur, Delay scheduling, Proceedings of the 5th European conference on Computer systems, EuroSys '10, ISBN 978-1-60558-577-2, 2010.
DOI : 10.1145/1755913.1755940

S. Ibrahim, H. Jin, L. Lu, S. Wu, B. He et al., LEEN: Locality/Fairness-Aware Key Partitioning for MapReduce in the Cloud, 2010 IEEE Second International Conference on Cloud Computing Technology and Science, pp.17-24, 2010.
DOI : 10.1109/CloudCom.2010.25

J. Jin, J. Luo, A. Song, F. Dong, and R. Xiong, BAR: An Efficient Data Locality Driven Task Scheduling Algorithm for Cloud Computing, 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp.295-304, 2011.
DOI : 10.1109/CCGrid.2011.55

C. L. Abad, Y. Lu, and R. H. Campbell, DARE: Adaptive Data Replication for Efficient Cluster Scheduling, 2011 IEEE International Conference on Cluster Computing, pp.159-168, 2011.
DOI : 10.1109/CLUSTER.2011.26

M. Isard, V. Prabhakaran, J. Currey, U. Wieder, K. Talwar et al., Quincy, Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, SOSP '09, pp.261-276, 2009.
DOI : 10.1145/1629575.1629601

R. E. Ladner and M. J. Fischer, Parallel Prefix Computation, Journal of the ACM, vol.27, issue.4, pp.831-838, 1980.

R. Rivest, The MD5 Message-Digest Algorithm, Request for Comments 1321, 1992.
DOI : 10.17487/rfc1321

National Institute of Standards and Technology, Secure Hash Standard, FIPS 180-1, U.S. Department of Commerce, p.21, 1995.

M. Zaharia, A. Konwinski, A. D. Joseph, R. H. Katz, and I. Stoica, Improving MapReduce Performance in Heterogeneous Environments, 2008.

R. Gandhi and A. Sabne, Finding Stragglers in Hadoop, 2011.

Y. Kwon, M. Balazinska, B. Howe, and J. Rolia, SkewTune, Proceedings of the 2012 international conference on Management of Data, SIGMOD '12, pp.25-36, 2012.
DOI : 10.1145/2213836.2213840

Y. Su, P. Chen, J. Chang, and C. Shieh, Variable-sized map and locality-aware reduce on public-resource grids, Future Generation Computer Systems, vol.27, issue.6, pp.27-33, 2011.
DOI : 10.1016/j.future.2010.09.001

Q. Chen, M. Guo, Q. Deng, L. Zheng, S. Guo et al., HAT: history-based auto-tuning MapReduce in heterogeneous environments, The Journal of Supercomputing, vol.39, issue.1, pp.1038-1054, 2013.
DOI : 10.1007/s11227-011-0682-5

S. Seo, I. Jang, K. Woo, I. Kim, J. Kim et al., HPMR: Prefetching and pre-shuffling in shared MapReduce computation environment, 2009 IEEE International Conference on Cluster Computing and Workshops, 2009.
DOI : 10.1109/CLUSTR.2009.5289171

J. Ekanayake, H. Li, B. Zhang, T. Gunarathne, S. Bae et al., Twister, Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, HPDC '10, pp.810-818, 2010.
DOI : 10.1145/1851476.1851593

C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis, Evaluating MapReduce for Multi-core and Multiprocessor Systems, 2007 IEEE 13th International Symposium on High Performance Computer Architecture, pp.13-24, 2007.
DOI : 10.1109/HPCA.2007.346181

J. Talbot, R. M. Yoo, and C. Kozyrakis, Phoenix++, Proceedings of the second international workshop on MapReduce and its applications, MapReduce '11, ISBN 978-1-4503-0700-0, 2011.
DOI : 10.1145/1996092.1996095

Y. Wang, X. Que, W. Yu, D. Goldenberg, and D. Sehgal, Hadoop acceleration through network levitated merge, Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC '11, p.1, 2011.
DOI : 10.1145/2063384.2063461

X. Lu, N. S. Islam, M. Wasi-ur-rahman, J. Jose, H. Subramoni et al., High-Performance Design of Hadoop RPC with RDMA over InfiniBand, 2013 42nd International Conference on Parallel Processing, pp.641-650, 2013.
DOI : 10.1109/ICPP.2013.78

N. S. Islam, M. W. Rahman, J. Jose, R. Rajachandrasekar, H. Wang et al., High performance RDMA-based design of HDFS over InfiniBand, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-12, 2012.
DOI : 10.1109/SC.2012.65

W. Fang, B. He, Q. Luo, and N. K. Govindaraju, Mars: Accelerating MapReduce with Graphics Processors, IEEE Transactions on Parallel and Distributed Systems, pp.608-620, 2010.

G. Fedak, H. He, and F. Cappello, BitDew: A programmable environment for large-scale data management and distribution, 2008 SC, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-12, 2008.
DOI : 10.1109/SC.2008.5213939

URL : https://hal.archives-ouvertes.fr/inria-00216126

B. Tang, M. Moca, S. Chevalier, H. He, and G. Fedak, Towards MapReduce for Desktop Grid Computing, 2010 International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp.193-200, 2010.
DOI : 10.1109/3PGCIC.2010.33

URL : https://hal.archives-ouvertes.fr/hal-00687553

B. Nicolae, D. Moise, and G. Antoniu, BlobSeer: Bringing high throughput under heavy concurrency to Hadoop Map-Reduce applications, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010.
DOI : 10.1109/IPDPS.2010.5470433

URL : https://hal.archives-ouvertes.fr/inria-00456801

Y. Bu, B. Howe, M. Balazinska, and M. D. Ernst, HaLoop, Proceedings of the VLDB Endowment, vol.3, issue.1-2, pp.285-296, 2010.
DOI : 10.14778/1920841.1920881

Y. Zhang, Q. Gao, L. Gao, and C. Wang, iMapReduce: A Distributed Computing Framework for Iterative Computation, Journal of Grid Computing, vol.10, issue.4, pp.47-68, 2012.
DOI : 10.1007/s10723-012-9204-9

T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 2001.

E. Elnikety, T. Elsayed, and H. E. Ramadan, iHadoop: Asynchronous Iterations for MapReduce, 2011 IEEE Third International Conference on Cloud Computing Technology and Science, pp.81-90, 2011.
DOI : 10.1109/CloudCom.2011.21

T. Condie, N. Conway, P. Alvaro, J. M. Hellerstein, K. Elmeleegy et al., MapReduce Online, Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation, USENIX Association, p.21, 2010.

H. Lin, X. Ma, J. Archuleta, W. Feng, M. Gardner et al., MOON, Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, HPDC '10, pp.95-106, 2010.
DOI : 10.1145/1851476.1851489

URL : https://hal.archives-ouvertes.fr/in2p3-00024076

C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins, Pig latin, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, SIGMOD '08, pp.1099-1110, 2008.
DOI : 10.1145/1376616.1376726

A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka et al., Hive - A Warehousing Solution Over a Map-Reduce Framework, Proceedings of the VLDB Endowment, pp.1626-1629, 2009.

J. Rumbaugh, I. Jacobson, and G. Booch, The Unified Modeling Language Reference Manual (2nd Edition), Pearson Higher Education, ISBN 0321245628, 2004.

L. A. Barroso, J. Clidaras, and U. Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, 2013.

A. Greenberg, S. Kandula, D. A. Maltz, J. R. Hamilton, C. Kim et al., VL2: A Scalable and Flexible Data Center Network, Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication, pp.51-62, 2009.

M. Al-Fares, A. Loukissas, and A. Vahdat, A Scalable, Commodity Data Center Network Architecture, Proceedings of the ACM SIGCOMM 2008 Conference on Data Communication, pp.63-74, 2008.

L. A. Steffenel, Modeling Network Contention Effects on All-to-All Operations, IEEE International Conference on Cluster Computing, 2006.

H. Ballani, P. Costa, T. Karagiannis, and A. Rowstron, Towards Predictable Datacenter Networks, Proceedings of the ACM SIGCOMM 2011 Conference, p.242, 2011.

N. Grozev and R. Buyya, Performance Modelling and Simulation of Three-Tier Applications in Cloud and Multi-Cloud Environments, The Computer Journal, vol.58, issue.1, 2013.
DOI : 10.1093/comjnl/bxt107

H. Herodotou, Hadoop Performance Models, pp.1-16

G. Wang, A. R. Butt, P. Pandey, and K. Gupta, Using realistic simulation for performance analysis of mapreduce setups, Proceedings of the 1st ACM workshop on Large-Scale system and application performance, LSAP '09, 2009.
DOI : 10.1145/1552272.1552278

H. Herodotou, H. Lim, G. Luo, N. Borisov, L. Dong et al., Starfish: A Self-tuning System for Big Data Analytics, Conference on Innovative Data Systems Research, pp.261-272, 2011.

H. Herodotou and S. Babu, Profiling, What-if Analysis, and Cost-based Optimization of MapReduce Programs, Proceedings of the VLDB Endowment, 2011.

A. Verma, L. Cherkasova, and R. Campbell, ARIA, Proceedings of the 8th ACM international conference on Autonomic computing, ICAC '11, pp.235-244, 2011.
DOI : 10.1145/1998582.1998637

G. Lee, Resource Allocation and Scheduling in Heterogeneous Cloud Environments

A. Wieder, P. Bhatotia, A. Post, and R. Rodrigues, Brief announcement, Proceeding of the 29th ACM SIGACT-SIGOPS symposium on Principles of distributed computing, PODC '10, 2010.
DOI : 10.1145/1835698.1835795

A. Ganapathi, Y. Chen, A. Fox, R. Katz, and D. Patterson, Statistics-driven workload modeling for the Cloud, 2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010), pp.87-92, 2010.
DOI : 10.1109/ICDEW.2010.5452742

A. Ganapathi, H. Kuno, U. Dayal, J. L. Wiener, A. Fox et al., Predicting Multiple Metrics for Queries: Better Decisions Enabled by Machine Learning, 2009 IEEE 25th International Conference on Data Engineering, pp.592-603, 2009.
DOI : 10.1109/ICDE.2009.130

F. R. Bach and M. I. Jordan, Kernel Independent Component Analysis, The Journal of Machine Learning Research, vol.3, pp.1-48, 2003.

K. V. Vishwanath and N. Nagappan, Characterizing Cloud Computing Hardware Reliability, Proceedings of the 1st ACM Symposium on Cloud Computing, pp.193-204, 2010.

D. Kim, F. Machida, and K. S. Trivedi, Availability Modeling and Analysis of a Virtualized System, 2009 15th IEEE Pacific Rim International Symposium on Dependable Computing, pp.365-371, 2009.
DOI : 10.1109/PRDC.2009.64

H. Qian, D. Medhi, and K. Trivedi, A Hierarchical Model to Evaluate Quality of Experience of Online Services hosted by Cloud Computing, International Symposium on Integrated Network Management, pp.105-112, 2011.

A. Undheim, A. Chilwan, and P. Heegaard, Differentiated Availability in Cloud Computing SLAs, 2011 IEEE/ACM 12th International Conference on Grid Computing, pp.129-136, 2011.
DOI : 10.1109/Grid.2011.25

Q. Chen, D. Zhang, M. Guo, Q. Deng, and S. Guo, SAMR: A Self-adaptive MapReduce Scheduling Algorithm in Heterogeneous Environment, 2010 10th IEEE International Conference on Computer and Information Technology, pp.2736-2743, 2010.
DOI : 10.1109/CIT.2010.458

B. Veeravalli, D. Ghose, V. Mani, and T. Robertazzi, Scheduling Divisible Loads in Parallel and Distributed Systems, p.292, 1996.

M. Frank and P. Martini, Practical experiences with a transport layer extension for end-to-end bandwidth regulation, Proceedings of 22nd Annual Conference on Local Computer Networks, pp.337-346, 1997.
DOI : 10.1109/LCN.1997.631003

I. F. Akyildiz, J. Liebeherr, and D. Sarkar, Bandwidth Regulation of Real-Time Traffic Classes in Internetworks, Computer Networks and ISDN Systems, pp.855-872, 1996.

M. Frank and P. Martini, Performance analysis of an end-to-end bandwidth regulation scheme, Proceedings of the Sixth International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pp.133-138, 1998.
DOI : 10.1109/MASCOT.1998.693686

Microsoft Blog Editor, Windows Azure General Availability, 2010.
URL : http://blogs.microsoft.com/blog

Hadoop Wiki, PoweredBy.
URL : https://wiki.apache.org

Apache Software Foundation, Apache Hadoop NextGen MapReduce (YARN).
URL : http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site

O. O'Malley and A. Murthy, Hadoop Sorts a Petabyte in 16.25 Hours and a Terabyte in 62 Seconds.
URL : https://developer

O. O'Malley and A. Murthy, Winning a 60 Second Dash with a Yellow Elephant.
URL : https

O. O'Malley and A. Murthy, Scaling Hadoop to 4000 Nodes at Yahoo!
URL : https

A. Murthy, The Next Generation of Apache Hadoop MapReduce.
URL : https://developer.yahoo.com/blogs/hadoop/next-generation-apache-hadoop-mapreduce-3061

Yahoo! Inc.
URL : https

Microsoft, HDInsight | Hadoop.
URL : http://azure.microsoft.com/en-us/services