H. Abbasi, M. Wolf, G. Eisenhauer, S. Klasky, K. Schwan et al., DataStager, Proceedings of the 18th ACM international symposium on High performance distributed computing, HPDC '09, pp.39-48, 2009.
DOI : 10.1145/1551609.1551618

S. Agarwal, B. Mozafari, A. Panda, H. Milner, S. Madden et al., BlinkDB, Proceedings of the 8th ACM European Conference on Computer Systems, EuroSys '13, pp.29-42, 2013.
DOI : 10.1145/2465351.2465355

F. Ahmad, S. T. Chakradhar, A. Raghunathan, and T. Vijaykumar, ShuffleWatcher: shuffle-aware scheduling in multi-tenant MapReduce clusters, Annual Technical Conference, pp.1-12

F. Ahmad, S. Lee, M. Thottethodi, and T. Vijaykumar, Puma: Purdue MapReduce benchmarks suite, 2012.

P. Ali, K. Carns, D. Iskra, S. Kimpe, R. Lang et al., Scalable I/O forwarding framework for high-performance computing systems, 2009 IEEE International Conference on Cluster Computing and Workshops, pp.1-10, 2009.
DOI : 10.1109/CLUSTR.2009.5289188
URL : http://www.cse.ohio-state.edu/~alin/papers/iofsl-cluster09.pdf

G. Ananthanarayanan, S. Agarwal, S. Kandula, A. Greenberg, I. Stoica et al., Scarlett, Proceedings of the sixth conference on Computer systems, EuroSys '11, pp.287-300, 2011.
DOI : 10.1145/1966445.1966472

G. Ananthanarayanan, C. Douglas, R. Ramakrishnan, S. Rao, and I. Stoica, True elasticity in multi-tenant data-intensive compute clusters, Proceedings of the Third ACM Symposium on Cloud Computing, SoCC '12, pp.1-7, 2012.
DOI : 10.1145/2391229.2391253

M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz et al., A view of cloud computing, Communications of the ACM, vol.53, issue.4, pp.50-58, 2010.
DOI : 10.1145/1721654.1721672

M. Armbrust, R. S. Xin, C. Lian, Y. Huai, D. Liu et al., Spark SQL, Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pp.1383-1394, 2015.
DOI : 10.1145/2723372.2742797

G. Aupy, A. Gainaru, and V. Le Fèvre, Periodic I/O scheduling for supercomputers, 2017.
DOI : 10.1007/978-3-319-72971-8_3
URL : https://hal.archives-ouvertes.fr/hal-01654645

L. A. Barroso, J. Clidaras, and U. Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Synthesis Lectures on Computer Architecture, pp.1-154, 2013.
DOI : 10.2200/S00193ED1V01Y200905CAC006

A. Batsakis, R. Burns, A. Kanevsky, J. Lentini, and T. Talpey, CA-NFS, ACM Transactions on Storage, vol.5, issue.4, pp.1-24, 2009.
DOI : 10.1145/1629080.1629085

Apache Beam, Available: https://beam.apache.org, 2016.

J. Bent, S. Faibish, J. Ahrens, G. Grider, J. Patchett et al., Jitter-free co-processing on a prototype exascale storage stack, 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST), 2012.
DOI : 10.1109/MSST.2012.6232382
URL : http://storageconference.org/2012/Papers/18.Short.2.JitterFree.pdf

A. Bhatele, K. Mohror, S. H. Langer, and K. E. Isaacs, There goes the neighborhood, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on, SC '13, pp.1-12, 2013.
DOI : 10.1145/2503210.2503247

Blue Waters project, National Center for Supercomputing Applications, Available: http://www.ncsa.illinois, 2010.

R. Bolze, F. Cappello, E. Caron, M. Daydé, F. Desprez et al., Grid'5000: A Large Scale And Highly Reconfigurable Experimental Grid Testbed, The International Journal of High Performance Computing Applications, vol.2, issue.2, pp.481-494, 2006.
DOI : 10.1145/1060289.1060313
URL : https://hal.archives-ouvertes.fr/hal-00684943

Y. Bu, B. Howe, M. Balazinska, and M. D. Ernst, HaLoop, Proceedings of the VLDB Endowment, vol.3, issue.1-2, pp.285-296, 2010.
DOI : 10.14778/1920841.1920881

P. Buneman and W. Tan, Provenance in databases, Proceedings of the 2007 ACM SIGMOD international conference on Management of data, SIGMOD '07, pp.1171-1173, 2007.
DOI : 10.1145/1247480.1247646

R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I. Brandic, Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility, Future Generation Computer Systems, vol.25, issue.6, pp.599-616, 2009.
DOI : 10.1016/j.future.2008.12.001
URL : http://www.gridbus.org/reports/CloudITPlatforms2008.pdf

P. Carbone, A. Katsifodimos, S. Ewen, V. Markl, S. Haridi et al., Apache Flink: stream and batch processing in a single engine, Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, vol.36, issue.4, pp.28-38, 2015.

P. Carns, R. Latham, R. Ross, K. Iskra, S. Lang et al., 24/7 Characterization of petascale I/O workloads, 2009 IEEE International Conference on Cluster Computing and Workshops, pp.1-10, 2009.
DOI : 10.1109/CLUSTR.2009.5289150

N. Chaimov, A. Malony, S. Canon, C. Iancu, K. Z. Ibrahim et al., Scaling Spark on HPC Systems, Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC '16, pp.97-110, 2016.
DOI : 10.1145/2886107.2886110

Y. Chen, S. Alspaugh, and R. Katz, Interactive analytical processing in big data systems, Proceedings of the VLDB Endowment, vol.5, issue.12, pp.1802-1813, 2012.
DOI : 10.14778/2367502.2367519

N. Cheriere, P. Donat-Bouillud, S. Ibrahim, and M. Simonin, On the usability of shortest remaining time first policy in shared Hadoop clusters, Proceedings of the 31st Annual ACM Symposium on Applied Computing, SAC '16, pp.426-431, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01239341

B. Cho, M. Rahman, T. Chajed, I. Gupta, C. Abad et al., Natjam, Proceedings of the 4th annual Symposium on Cloud Computing, SOCC '13, pp.1-17, 2013.
DOI : 10.1145/2523616.2523624

J. Dean, Large-scale distributed systems at Google: current systems and future directions, International Workshop on Large Scale Distributed Systems and Middleware, Tutorial, 2009.

J. Dean and S. Ghemawat, MapReduce, Communications of the ACM, vol.51, issue.1, pp.107-113, 2008.
DOI : 10.1145/1327452.1327492

C. Di Martino, Z. Kalbarczyk, R. K. Iyer, F. Baccanico et al., Lessons learned from the analysis of system failures at petascale: the case of BlueWaters, International Conference on Dependable Systems and Networks, pp.610-621, 2014.

F. Dinu and T. Ng, Understanding the effects and implications of compute node related failures in hadoop, Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing, HPDC '12, pp.187-198, 2012.
DOI : 10.1145/2287076.2287108

F. Dinu and T. E. Ng, RCMP: Enabling Efficient Recomputation Based Failure Resilience for Big Data Analytics, 2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp.962-971, 2014.
DOI : 10.1109/IPDPS.2014.102
URL : http://www.cs.rice.edu/%7Efd2/pdf/IPDPS14.pdf

J. Dongarra and M. A. Heroux, Toward a new metric for ranking high performance computing systems, Sandia Report SAND2013-4744, Tech. Rep., 2013.

M. Dorier, G. Antoniu, F. Cappello, M. Snir, and L. Orf, Damaris: How to Efficiently Leverage Multicore Parallelism to Achieve Scalable, Jitter-free I/O, 2012 IEEE International Conference on Cluster Computing, p.155, 2012.
DOI : 10.1109/CLUSTER.2012.26
URL : https://hal.archives-ouvertes.fr/hal-00715252

M. Dorier, G. Antoniu, F. Cappello, M. Snir, R. Sisneros et al., Damaris, ACM Transactions on Parallel Computing, vol.3, issue.3, pp.1-43, 2016.
DOI : 10.1145/2110205.2110210
URL : https://hal.archives-ouvertes.fr/inria-00614597

M. Dorier, G. Antoniu, R. Ross, D. Kimpe, and S. Ibrahim, CALCioM: Mitigating I/O Interference in HPC Systems through Cross-Application Coordination, 2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp.155-164, 2014.
DOI : 10.1109/IPDPS.2014.27
URL : https://hal.archives-ouvertes.fr/hal-00916091

M. Dorier, S. Ibrahim, G. Antoniu, and R. Ross, Omnisc'IO: A Grammar-Based Approach to Spatial and Temporal I/O Patterns Prediction, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis, pp.623-634, 2014.
DOI : 10.1109/SC.2014.56
URL : https://hal.archives-ouvertes.fr/hal-01025670

J. Ekanayake, S. Pallickara, and G. Fox, MapReduce for Data Intensive Scientific Analyses, 2008 IEEE Fourth International Conference on eScience, pp.277-284, 2008.
DOI : 10.1109/eScience.2008.59
URL : http://grids.ucs.indiana.edu/ptliupages/publications/eScience-submission_Jaliya_final.pdf

Z. Fadika, E. Dede, M. Govindaraju, and L. Ramakrishnan, MARIANE: MApReduce Implementation Adapted for HPC Environments, 2011 IEEE/ACM 12th International Conference on Grid Computing, pp.82-89, 2011.
DOI : 10.1109/Grid.2011.20
URL : http://www.cs.binghamton.edu/~mgovinda/papers/MARIANE-GRID-2011-cameraready.pdf

J. Flich, G. Agosta, P. Ampletzer, D. A. Alonso, C. Brandolese et al., Enabling HPC for QoS-sensitive Applications: The MANGO Approach, Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp.702-707, 2016.
DOI : 10.3850/9783981537079_1019

G. Fox, J. Qiu, S. Jha, S. Ekanayake, and S. Kamburugamuve, Big Data, Simulations and HPC Convergence, Workshop on Big Data Benchmarks, pp.3-17, 2015.
DOI : 10.1109/SC.2012.55

A. Gainaru, G. Aupy, A. Benoit, F. Cappello, Y. Robert et al., Scheduling the I/O of HPC Applications Under Congestion, 2015 IEEE International Parallel and Distributed Processing Symposium, pp.1013-1022, 2015.
DOI : 10.1109/IPDPS.2015.116
URL : https://hal.archives-ouvertes.fr/hal-01251938

S. K. Garg, C. S. Yeo, A. Anandasivam, and R. Buyya, Environment-conscious scheduling of HPC applications on distributed Cloud-oriented data centers, Journal of Parallel and Distributed Computing, vol.71, issue.6, pp.732-749, 2011.
DOI : 10.1016/j.jpdc.2010.04.004

A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker et al., Dominant resource fairness: fair allocation of multiple resource types, International Symposium on Networked Systems Design and Implementation, USENIX, pp.323-336, 2011.

A. Ghodsi, M. Zaharia, S. Shenker, and I. Stoica, Choosy, Proceedings of the 8th ACM European Conference on Computer Systems, EuroSys '13, pp.365-378, 2013.
DOI : 10.1145/2465351.2465387

Apache Giraph, Available: https://giraph.apache.org, 2013.

Google Cloud Platform, Available: https://cloud.google.com/compute, 2017.

GraphLab, Available: https://turi, 2013.

Y. Guo, W. Bland, P. Balaji, and X. Zhou, Fault tolerant MapReduce-MPI for HPC clusters, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on, SC '15, pp.1-12, 2015.
DOI : 10.1007/978-3-642-25821-3_9
URL : http://dl.acm.org/ft_gateway.cfm?id=2807617&type=pdf

B. He, W. Fang, Q. Luo, N. K. Govindaraju, and T. Wang, Mars, Proceedings of the 17th international conference on Parallel architectures and compilation techniques, PACT '08, pp.260-269, 2008.
DOI : 10.1145/1454115.1454152

I. El-Helw, R. Hofman, and H. E. Bal, Scaling MapReduce Vertically and Horizontally, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis, pp.525-535, 2014.
DOI : 10.1109/SC.2014.48

C. Hsu, K. D. Slagter, and Y. Chung, Locality and loading aware virtual machine mapping techniques for optimizing communications in MapReduce applications, Future Generation Computer Systems, vol.53, pp.43-54, 2015.
DOI : 10.1016/j.future.2015.04.006

D. Huang, X. Shi, S. Ibrahim, L. Lu, H. Liu et al., MR-scope, Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, HPDC '10, pp.849-855, 2010.
DOI : 10.1145/1851476.1851598

J. Huang, X. Ouyang, J. Jose, M. Wasi-ur-Rahman, H. Wang et al., High-Performance Design of HBase with RDMA over InfiniBand, 2012 IEEE 26th International Parallel and Distributed Processing Symposium, p.774, 2012.
DOI : 10.1109/IPDPS.2012.74

S. Ibrahim, Performance-aware scheduling for data-intensive cloud computing, 2011.

S. Ibrahim, H. Jin, L. Lu, B. He, G. Antoniu et al., Maestro: Replica-Aware Map Scheduling for MapReduce, 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), pp.435-442, 2012.
DOI : 10.1109/CCGrid.2012.122
URL : https://hal.archives-ouvertes.fr/hal-00670813

S. Ibrahim, H. Jin, L. Lu, S. Wu, B. He et al., LEEN: Locality/Fairness-Aware Key Partitioning for MapReduce in the Cloud, 2010 IEEE Second International Conference on Cloud Computing Technology and Science, pp.17-24, 2010.
DOI : 10.1109/CloudCom.2010.25

S. Ibrahim, T. Phan, A. Carpen-Amarie, H. Chihoub, D. Moise et al., Governing energy consumption in Hadoop through CPU frequency scaling: An analysis, Future Generation Computer Systems, vol.54, pp.219-232, 2016.
DOI : 10.1016/j.future.2015.01.005
URL : https://hal.archives-ouvertes.fr/hal-01166252

S. Ibrahim, T. A. Phuong, and G. Antoniu, An Eye on the Elephant in the Wild: A Performance Evaluation of Hadoop's Schedulers Under Failures, International Workshop on Adaptive Resource Management and Scheduling for Cloud Computing, pp.141-157, 2015.
DOI : 10.1007/978-3-319-28448-4_11

T. Ilsche, J. Schuchart, J. Cope, D. Kimpe, T. Jones et al., Enabling event tracing at leadership-class scale through I/O forwarding middleware, Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing, HPDC '12, 2012.
DOI : 10.1145/2287076.2287085

M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, Dryad: distributed data-parallel programs from sequential building blocks, Special Interest Group on Operating Systems Review, ACM, pp.59-72, 2007.

M. Isard, V. Prabhakaran, J. Currey, U. Wieder, K. Talwar et al., Quincy, Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, SOSP '09, pp.261-276, 2009.
DOI : 10.1145/1629575.1629601

N. S. Islam, M. W. Rahman, J. Jose, R. Rajachandrasekar, H. Wang et al., High performance RDMA-based design of HDFS over InfiniBand, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, p.35, 2012.
DOI : 10.1109/SC.2012.65

N. S. Islam, D. Shankar, X. Lu, M. Wasi-ur-Rahman, and D. K. Panda, Accelerating I/O Performance of Big Data Analytics on HPC Clusters through RDMA-Based Key-Value Store, 2015 44th International Conference on Parallel Processing, pp.280-289, 2015.
DOI : 10.1109/ICPP.2015.79

N. S. Islam, M. Wasi-ur-Rahman, X. Lu, and D. K. Panda, High Performance Design for HDFS with Byte-Addressability of NVM and RDMA, Proceedings of the 2016 International Conference on Supercomputing, ICS '16, pp.1-14, 2016.
DOI : 10.1145/2063384.2063436

W. Jiang, V. T. Ravi, and G. Agrawal, A MapReduce system with an alternate API for multi-core environments, International Conference on Cluster, Cloud and Grid Computing, pp.84-93, 2010.

H. Jin, S. Ibrahim, T. Bell, W. Gao, D. Huang et al., Cloud Types and Services, Handbook of Cloud Computing, pp.335-355, 2010.
DOI : 10.1007/978-1-4419-6524-0_14

A. Jokanovic, J. C. Sancho, G. Rodriguez, A. Lucero, C. Minkenberg et al., Quiet Neighborhoods: Key to Protect Job Performance Predictability, 2015 IEEE International Parallel and Distributed Processing Symposium, pp.449-459, 2015.
DOI : 10.1109/IPDPS.2015.87

Apache Kafka, Available: https://kafka.apache.org, 2011.

K. Kc and K. Anyanwu, Scheduling Hadoop Jobs to Meet Deadlines, 2010 IEEE Second International Conference on Cloud Computing Technology and Science, pp.388-392, 2010.
DOI : 10.1109/CloudCom.2010.97

A. Kougkas, M. Dorier, R. Latham, R. Ross, and X. Sun, Leveraging burst buffer coordination to prevent I/O interference, 2016 IEEE 12th International Conference on e-Science (e-Science), pp.371-380, 2016.
DOI : 10.1109/eScience.2016.7870922

S. Kulkarni, N. Bhagat, M. Fu, V. Kedigehalli, C. Kellogg et al., Twitter Heron, Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pp.239-250, 2015.
DOI : 10.1145/2588555.2595641

C. Kuo, A. Shah, A. Nomura, S. Matsuoka, and F. Wolf, How file access patterns influence interference among cluster applications, 2014 IEEE International Conference on Cluster Computing (CLUSTER), pp.185-193, 2014.
DOI : 10.1109/CLUSTER.2014.6968743

A. Lebre, G. Huard, Y. Denneulin, and P. Sowa, I/O Scheduling Service for Multi-Application Clusters, 2006 IEEE International Conference on Cluster Computing, pp.1-10, 2006.
DOI : 10.1109/CLUSTR.2006.311854
URL : https://hal.archives-ouvertes.fr/hal-00486899

H. Li, A. Ghodsi, M. Zaharia, S. Shenker, and I. Stoica, Tachyon, Proceedings of the ACM Symposium on Cloud Computing, SOCC '14, pp.1-15, 2014.
DOI : 10.1145/2517349.2522737

Z. Li and H. Shen, Designing a Hybrid Scale-Up/Out Hadoop Architecture Based on Performance Measurements for High Application Performance, 2015 44th International Conference on Parallel Processing, pp.21-30, 2015.
DOI : 10.1109/ICPP.2015.11

T. Lippert, D. Mallmann, and M. Riedel, Scientific Big Data analytics by HPC, John von Neumann Institute for Computing Symposium, 2016.

H. Liu and B. He, Reciprocal Resource Fairness: Towards Cooperative Multiple-Resource Fair Sharing in IaaS Clouds, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis, pp.970-981, 2014.
DOI : 10.1109/SC.2014.84

N. Liu, J. Cope, P. Carns, C. Carothers, R. Ross et al., On the role of burst buffers in leadership-class storage systems, 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST), 2012.
DOI : 10.1109/MSST.2012.6232369

J. Lofstead, F. Zheng, Q. Liu, S. Klasky, R. Oldfield et al., Managing Variability in the IO Performance of Petascale Storage Systems, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-12, 2010.
DOI : 10.1109/SC.2010.32

I. Lopez, IDC talks convergence in high performance data analysis, Available: https://www.datanami.com, 2013.

X. Lu, N. S. Islam, M. Wasi-ur-Rahman, J. Jose, H. Subramoni et al., High-Performance Design of Hadoop RPC with RDMA over InfiniBand, 2013 42nd International Conference on Parallel Processing, pp.641-650, 2013.
DOI : 10.1109/ICPP.2013.78

X. Lu, F. Liang, B. Wang, L. Zha, and Z. Xu, DataMPI: Extending MPI to Hadoop-Like Big Data Computing, 2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp.829-838, 2014.
DOI : 10.1109/IPDPS.2014.90

D. Luan, S. Huang, and G. Gong, Using Lustre with Apache Hadoop, Sun Microsystems Inc, Tech. Rep, 2009.

Y. Mao, R. Morris, and M. F. Kaashoek, Optimizing MapReduce for multicore architectures, 2010.

A. Marathe, R. Harris, D. Lowenthal, B. R. de Supinski, B. Rountree et al., Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2, International Symposium on High-performance Parallel and Distributed computing, pp.279-290, 2014.
DOI : 10.1109/TPDS.2015.2508457

P. Mell and T. Grance, The NIST definition of cloud computing, Recommendations of the national institute of standards and technology, National Institute of Standards and Technology, 2011.

A. Moody, G. Bronevetsky, K. Mohror, and B. R. de Supinski, Design, modeling, and evaluation of a scalable multi-level checkpointing system, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-11, 2010.

B. Nicolae, J. Bresnahan, K. Keahey, and G. Antoniu, Going back and forth, Proceedings of the 20th international symposium on High performance distributed computing, HPDC '11, pp.147-158, 2011.
DOI : 10.1145/1996130.1996152
URL : https://hal.archives-ouvertes.fr/inria-00570682

Z. Niu, S. Tang, and B. He, Gemini: An Adaptive Performance-Fairness Scheduler for Data-Intensive Cluster Computing, 2015 IEEE 7th International Conference on Cloud Computing Technology and Science (CloudCom), pp.66-73, 2015.
DOI : 10.1109/CloudCom.2015.52

M. Pastorelli, M. Dell'Amico, and P. Michiardi, OS-Assisted Task Preemption for Hadoop, 2014 IEEE 34th International Conference on Distributed Computing Systems Workshops, pp.94-99, 2014.
DOI : 10.1109/ICDCSW.2014.24
URL : http://arxiv.org/pdf/1402.2107.pdf

A. Phanishayee, E. Krevat, V. Vasudevan, D. G. Andersen, G. R. Ganger et al., Measurement and analysis of TCP throughput collapse in cluster-based storage systems, Conference on File and Storage Technologies, USENIX, pp.1-14, 2008.

J. Polo, D. Carrera, Y. Becerra, M. Steinder, and I. Whalley, Performance-driven task co-scheduling for MapReduce environments, 2010 IEEE Network Operations and Management Symposium, NOMS 2010, pp.373-380, 2010.
DOI : 10.1109/NOMS.2010.5488494

L. Popa, G. Kumar, M. Chowdhury, A. Krishnamurthy, S. Ratnasamy et al., FairCloud: sharing the network in cloud computing, International Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, p.187, 2012.

R. Power and J. Li, Piccolo: building fast, distributed programs with partitioned tables, International Symposium on Operating Systems Design and Implementation, USENIX, pp.1-14, 2010.

Y. Qian, E. Barton, T. Wang, N. Puntambekar, and A. Dilger, A novel network request scheduler for a large scale storage system, Computer Science - Research and Development, vol.7, issue.10, pp.143-148, 2009.
DOI : 10.1007/s00450-009-0073-9
URL : http://wiki.lustre.org/images/2/22/A_Novel_Network_Request_Scheduler_for_a_Large_Scale_Storage_System.pdf

J. Quiané-Ruiz, C. Pinkel, J. Schad, and J. Dittrich, RAFTing MapReduce: Fast recovery on the RAFT, 2011 IEEE 27th International Conference on Data Engineering, pp.589-600, 2011.
DOI : 10.1109/ICDE.2011.5767877

K. Ren, Y. Kwon, M. Balazinska, and B. Howe, Hadoop's adolescence, Proceedings of the VLDB Endowment, vol.6, issue.10, pp.853-864, 2013.
DOI : 10.14778/2536206.2536213

A. Rosà, L. Y. Chen, R. Birke, and W. Binder, Demystifying Casualties of Evictions in Big Data Priority Scheduling, ACM SIGMETRICS Performance Evaluation Review, vol.42, issue.4, pp.12-21, 2015.
DOI : 10.1145/2371536.2371562

R. B. Ross and R. Thakur, PVFS: a parallel file system for Linux clusters, Annual Linux Showcase and Conference, pp.391-430, 2000.

K. Salem and H. Garcia-Molina, Checkpointing memory-resident databases, Proceedings of the Fifth International Conference on Data Engineering, pp.452-462, 1989.
DOI : 10.1109/ICDE.1989.47249

Apache Samza, Available: https://samza.apache.org, 2014.

T. Sandholm and K. Lai, Dynamic Proportional Share Scheduling in Hadoop, Job Scheduling Strategies for Parallel Processing, pp.110-131, 2010.
DOI : 10.1109/HPCA.2007.346181

K. Sato, K. Mohror, A. Moody, T. Gamblin, B. R. de Supinski et al., A User-Level InfiniBand-Based File System and Checkpoint Strategy for Burst Buffers, 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp.21-30, 2014.
DOI : 10.1109/CCGrid.2014.24

J. Schad, J. Dittrich, and J. Quiané-Ruiz, Runtime measurements in the cloud, Proceedings of the VLDB Endowment, vol.3, issue.1-2, pp.460-471, 2010.
DOI : 10.14778/1920841.1920902

B. Schroeder and G. A. Gibson, Understanding failures in petascale computers, Journal of Physics: Conference Series, vol.78, issue.1, pp.12-22, 2007.
DOI : 10.1088/1742-6596/78/1/012022
URL : http://iopscience.iop.org/article/10.1088/1742-6596/78/1/012022/pdf

P. Schwan, Lustre: building a file system for 1000-node clusters, Annual Linux Symposium, pp.380-386, 2003.

S. Sehrish, G. Mackey, J. Wang, and J. Bent, MRAP, Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, HPDC '10, pp.107-118, 2010.
DOI : 10.1145/1851476.1851490

H. Shan and J. Shalf, Using IOR to analyze the I/O performance for HPC platforms, Tech. Rep, 2007.

E. Smirni and D. A. Reed, Lessons from characterizing the input/output behavior of parallel scientific applications, Performance Evaluation, vol.33, issue.1, pp.27-44, 1998.
DOI : 10.1016/S0166-5316(98)00009-1

H. Song, Y. Yin, X. Sun, R. Thakur, and S. Lang, Server-side I/O coordination for parallel file systems, Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on, SC '11, pp.1-11, 2011.
DOI : 10.1145/2063384.2063407
URL : http://www.mcs.anl.gov/%7Ethakur/papers/sc11-io.pdf

Apache Spark primer, Databricks, Available: http://go.databricks.com/hubfs/pdfs, 2017.

Apache Storm, Available: https://storm.apache.org, 2012.

S. Tang, B. Lee, B. He, and H. Liu, Long-term resource fairness, Proceedings of the 28th ACM international conference on Supercomputing, ICS '14, pp.251-260, 2014.
DOI : 10.1145/2597652.2597672

Y. Tanimura, R. Filgueira, I. Kojima, and M. Atkinson, Reservation-based I/O performance guarantee for MPI-IO applications using shared storage systems, International Conference for High Performance Computing, Networking, Storage and Analysis, p.1382, 2012.
DOI : 10.1109/sc.companion.2012.204

W. Tantisiriroj, S. W. Son, S. Patil, S. J. Lang, G. Gibson et al., On the duality of data-intensive file system design, Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on, SC '11, pp.1-12, 2011.
DOI : 10.1145/2063384.2063474

S. Thapaliya, P. Bangalore, J. Lofstead, K. Mohror, and A. Moody, Managing I/O Interference in a Shared Burst Buffer System, 2016 45th International Conference on Parallel Processing (ICPP), pp.416-425, 2016.
DOI : 10.1109/ICPP.2016.54

A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka et al., Hive, Proceedings of the VLDB Endowment, vol.2, issue.2, pp.1626-1629, 2009.
DOI : 10.14778/1687553.1687609

R. Tous, A. Gounaris, C. Tripiana, J. Torres, S. Girona et al., Spark deployment and performance evaluation on the MareNostrum supercomputer, 2015 IEEE International Conference on Big Data (Big Data), pp.299-306, 2015.
DOI : 10.1109/BigData.2015.7363768

R. Tudoran, A. Costan, G. Antoniu, and H. Soncu, TomusBlobs: Towards Communication-Efficient Storage for MapReduce Applications in Azure, 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), pp.427-434, 2012.
DOI : 10.1109/CCGrid.2012.104
URL : https://hal.archives-ouvertes.fr/hal-00670725

V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar et al., Apache Hadoop YARN, Proceedings of the 4th annual Symposium on Cloud Computing, SOCC '13, pp.1-16, 2013.
DOI : 10.1145/2523616.2523633

C. Vecchiola, R. N. Calheiros, D. Karunamoorthy, and R. Buyya, Deadline-driven provisioning of resources for scientific applications in hybrid clouds with Aneka, Future Generation Computer Systems, vol.28, issue.1, pp.58-65, 2012.
DOI : 10.1016/j.future.2011.05.008

S. Venkataraman, A. Panda, G. Ananthanarayanan, M. J. Franklin, and I. Stoica, The power of choice in data-aware cluster scheduling, International Conference on Operating Systems Design and Implementation, USENIX Association, pp.301-316, 2014.

K. V. Vishwanath and N. Nagappan, Characterizing cloud computing hardware reliability, Proceedings of the 1st ACM symposium on Cloud computing, SoCC '10, pp.193-204, 2010.
DOI : 10.1145/1807128.1807161
URL : http://research.microsoft.com/pubs/120439/socc088-vishwanath.pdf

V. Vishwanath, M. Hereld, K. Iskra, D. Kimpe, V. Morozov et al., Accelerating I/O forwarding in IBM Blue Gene/P systems, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-10, 2010.

T. Wang, S. Oral, M. Pritchard, B. Wang, and W. Yu, TRIO: Burst Buffer Based I/O Orchestration, 2015 IEEE International Conference on Cluster Computing, pp.194-203, 2015.
DOI : 10.1109/CLUSTER.2015.38

T. Wang, S. Oral, Y. Wang, B. Settlemyer, S. Atchley et al., BurstMem: A high-performance burst buffer system for scientific applications, 2014 IEEE International Conference on Big Data (Big Data), pp.71-79, 2014.
DOI : 10.1109/BigData.2014.7004215

Y. Wang, R. Goldstone, W. Yu, and T. Wang, Characterization and Optimization of Memory-Resident MapReduce on HPC Systems, 2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp.799-808, 2014.
DOI : 10.1109/IPDPS.2014.87

Y. Wang, J. Tan, W. Yu, X. Meng, and L. Zhang, Preemptive reducetask scheduling for fair and fast job completion, International Conference on Autonomic Computing, pp.279-289, 2013.

Y. Wang, G. Agrawal, T. Bicer, and W. Jiang, Smart, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on, SC '15, pp.1-12, 2015.
DOI : 10.1109/SC.Companion.2012.114
URL : https://hal.archives-ouvertes.fr/hal-01562745

R. S. Xin, J. E. Gonzalez, M. J. Franklin, and I. Stoica, GraphX, First International Workshop on Graph Data Management Experiences and Systems, GRADES '13, pp.1-6, 2013.
DOI : 10.1145/2484425.2484427

P. Xuan, J. Denton, P. K. Srimani, R. Ge, and F. Luo, Big data analytics on traditional HPC infrastructure using two-level storage, Proceedings of the 2015 International Workshop on Data-Intensive Scalable Computing Systems, DISCS '15, pp.1-8, 2015.

O. Yildiz, M. Dorier, S. Ibrahim, R. Ross, and G. Antoniu, On the Root Causes of Cross-Application I/O Interference in HPC Storage Systems, 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp.750-759, 2016.
DOI : 10.1109/IPDPS.2016.50
URL : https://hal.archives-ouvertes.fr/hal-01270630

O. Yildiz, S. Ibrahim, and G. Antoniu, Enabling fast failure recovery in shared Hadoop clusters: Towards failure-aware scheduling, Future Generation Computer Systems, vol.74, pp.208-219, 2016.
DOI : 10.1016/j.future.2016.02.015
URL : https://hal.archives-ouvertes.fr/hal-01338336

O. Yildiz, S. Ibrahim, T. A. Phuong, and G. Antoniu, Chronos: Failure-aware scheduling in shared Hadoop clusters, 2015 IEEE International Conference on Big Data (Big Data), pp.313-318, 2015.
DOI : 10.1109/BigData.2015.7363770
URL : https://hal.archives-ouvertes.fr/hal-01203001

O. Yildiz, A. C. Zhou, and S. Ibrahim, Eley: On the Effectiveness of Burst Buffers for Big Data Processing in HPC Systems, 2017 IEEE International Conference on Cluster Computing (CLUSTER), p.87, 2017.
DOI : 10.1109/CLUSTER.2017.73
URL : https://hal.archives-ouvertes.fr/hal-01570737

R. M. Yoo, A. Romano, and C. Kozyrakis, Phoenix rebirth: Scalable MapReduce on a large-scale shared-memory system, 2009 IEEE International Symposium on Workload Characterization (IISWC), pp.198-207, 2009.
DOI : 10.1109/IISWC.2009.5306783
URL : http://csl.stanford.edu/%7Echristos/publications/2009.scalable_phoenix.iiswc.pdf

Z. Yu, M. Li, X. Yang, H. Zhao, and X. Li, Taming Non-local Stragglers Using Efficient Prefetching in MapReduce, 2015 IEEE International Conference on Cluster Computing, pp.52-61, 2015.
DOI : 10.1109/CLUSTER.2015.16

M. Zaharia, D. Borthakur, J. S. Sarma, K. Elmeleegy, S. Shenker et al., Delay scheduling, Proceedings of the 5th European conference on Computer systems, EuroSys '10, pp.265-278, 2010.
DOI : 10.1145/1755913.1755940

M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma et al., Resilient Distributed Datasets, International Conference on Networked Systems Design and Implementation, USENIX, 2012.

M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, Spark: cluster computing with working sets, International Workshop on Hot Topics in Cloud Computing, USENIX, pp.1-7, 2010.

F. Z. Boito, R. Kassick, P. O. Navaux, and Y. Denneulin, AGIOS: application-guided I/O scheduling for parallel file systems, International Conference on Parallel and Distributed Systems, pp.43-50, 2013.
DOI : 10.1109/icpads.2013.19

X. Zhang, K. Davis, and S. Jiang, IOrchestrator: Improving the Performance of Multi-node I/O Systems via Inter-Server Coordination, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-11, 2010.
DOI : 10.1109/SC.2010.30

F. Zheng, H. Abbasi, C. Docan, J. Lofstead, Q. Liu et al., PreDatA – preparatory data analytics on peta-scale machines, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp.1-12, 2010.
DOI : 10.1109/IPDPS.2010.5470454

Z. Zhou, X. Yang, D. Zhao, P. Rich, W. Tang et al., I/O-Aware Batch Scheduling for Petascale Computing Systems, 2015 IEEE International Conference on Cluster Computing, pp.254-263, 2015.
DOI : 10.1109/CLUSTER.2015.45

O. Yildiz, S. Ibrahim, and G. Antoniu, Enabling Fast Failure Recovery in Shared Hadoop Clusters: Towards Failure-Aware Scheduling, Journal of the Future Generation Computer Systems, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01338336

M. Dorier, O. Yildiz, S. Ibrahim, A. Orgerie, and G. Antoniu, On the Energy Footprint of I/O Management in Exascale HPC Systems, Journal of the Future Generation Computer Systems (FGCS), 2016.

M. Dorier, G. Antoniu, F. Cappello, M. Snir, R. Sisneros et al., Damaris: Addressing Performance Variability in Data Management for Post-Petascale Simulations, ACM Transactions on Parallel Computing, 2016.

O. Yildiz, A. C. Zhou, and S. Ibrahim, Eley: On the Effectiveness of Burst Buffers for Big Data Processing in HPC Systems, Proceedings of the 2017 IEEE International Conference on Cluster Computing (CLUSTER '17), Hawaii, 2017.

O. Yildiz, M. Dorier, S. Ibrahim, R. Ross, and G. Antoniu, On the Root Causes of Cross-Application I/O Interference in HPC Storage Systems, Proceedings of the 2016 IEEE International Parallel & Distributed Processing Symposium (IPDPS '16), 2016.

O. Yildiz, S. Ibrahim, T. A. Phuong, and G. Antoniu, Chronos: Failure-Aware Scheduling in Shared Hadoop Clusters, Proceedings of the 2015 IEEE International Conference on Big Data (BigData '15), 2015.

O. Yildiz, M. Dorier, S. Ibrahim et al., A Performance and Energy Analysis of I/O Management Approaches for Exascale Systems, Proceedings of the 2014 Data-Intensive Distributed Computing (DIDC '14) workshop, held in conjunction with the 23rd International ACM Symposium on High Performance Parallel and Distributed Computing (HPDC '14), 2014.