D. J. Abadi, S. R. Madden, and N. Hachem, Column-stores vs. row-stores: how different are they really?, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pp.967-980, 2008.

A. Abouzeid, K. Bajda-Pawlikowski, D. J. Abadi, A. Rasin, A. Silberschatz et al., HadoopDB: an architectural hybrid of MapReduce and DBMS technologies for analytical workloads, Proceedings of the VLDB Endowment, pp.922-933, 2009.

S. Agrawal, S. Chaudhuri, and V. R. Narasayya, Automated selection of materialized views and indexes in SQL databases, Proceedings of the 26th International Conference on Very Large Data Bases, pp.496-505, 2000.

S. Agrawal, V. R. Narasayya, and B. Yang, Integrating vertical and horizontal partitioning into automated physical database design, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, SIGMOD '04, pp.359-370, 2004.
DOI : 10.1145/1007568.1007609

R. K. Ahuja, Ö. Ergun, J. B. Orlin, and A. P. Punnen, A survey of very large-scale neighborhood search techniques, Discrete Applied Mathematics, vol.123, issue.1-3, pp.75-102, 2002.
DOI : 10.1016/S0166-218X(01)00338-9

A. Ailamaki, V. Kantere, and D. Dash, Managing scientific data, Communications of the ACM, vol.53, issue.6, pp.68-78, 2010.
DOI : 10.1145/1743546.1743568

E. Anderson and J. Tucek, Efficiency matters!, ACM SIGOPS Operating Systems Review, vol.44, issue.1, pp.40-45, 2010.
DOI : 10.1145/1740390.1740400

P. A. Bernstein, V. Hadzilacos, and N. Goodman, Concurrency control and recovery in database systems, 1987.

E. Boman, K. Devine, L. A. Fisk, R. Heaphy, B. Hendrickson et al., Zoltan 3.0: parallel partitioning, load-balancing, and data management services; user's guide, 2007.

P. Boncz, M. Zukowski, and N. Nes, MonetDB/X100: hyper-pipelining query execution, Second Biennial Conference on Innovative Data Systems Research, pp.225-237, 2005.

Y. Bu, B. Howe, M. Balazinska, and M. Ernst, The HaLoop approach to large-scale iterative data analysis, The VLDB Journal, vol.21, issue.2, pp.169-190, 2012.
DOI : 10.1007/s00778-012-0269-7

M. Burrows, The Chubby lock service for loosely-coupled distributed systems, Proceedings of the 7th symposium on Operating systems design and implementation, pp.335-350, 2006.

N. Capit and J. Emeras, OAR documentation: user guide, 2012.

F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach et al., Bigtable: a distributed storage system for structured data, ACM Transactions on Computer Systems, vol.26, issue.2, pp.1-26, 2008.
DOI : 10.1145/1365815.1365816

S. Chaudhuri and V. Narasayya, Self-tuning database systems: a decade of progress, Proceedings of the 33rd international conference on Very large data bases, pp.3-14, 2007.

S. Chaudhuri and V. R. Narasayya, AutoAdmin "what-if" index analysis utility, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, pp.367-378, 1998.

G. Chockler, I. Keidar, and R. Vitenberg, Group communication specifications: a comprehensive study, ACM Computing Surveys, vol.33, issue.4, pp.427-469, 2001.
DOI : 10.1145/503112.503113

B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon et al., PNUTS: Yahoo!'s hosted data serving platform, Proceedings of the VLDB Endowment, pp.1277-1288, 2008.

G. P. Copeland and S. N. Khoshafian, A decomposition storage model, Proceedings of the 1985 ACM SIGMOD international conference on Management of data, pp.268-279, 1985.

G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman et al., Dynamo: Amazon's highly available key-value store, Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles, pp.205-220, 2007.

K. D. Devine, E. G. Boman, R. T. Heaphy, B. A. Hendrickson, J. D. Teresco et al., New challenges in dynamic load balancing, Applied Numerical Mathematics, vol.52, issue.2-3, pp.133-152, 2005.
DOI : 10.1016/j.apnum.2004.08.028

D. J. DeWitt, S. Ghandeharizadeh, D. A. Schneider, A. Bricker, H. I. Hsiao et al., The Gamma database machine project, IEEE Transactions on Knowledge and Data Engineering, vol.2, issue.1, pp.44-62, 1990.
DOI : 10.1109/69.50905

D. J. Dewitt and J. Gray, Parallel database systems: the future of high performance database systems, Communications of the ACM, vol.35, issue.6, pp.85-98, 1992.
DOI : 10.1145/129888.129894

D. J. Dewitt, M. Smith, and H. Boral, A single-user performance evaluation of the Teradata database machine, High Performance Transaction Systems, pp.243-276, 1989.
DOI : 10.1007/3-540-51085-0_50

J. Dittrich, J. A. Ruiz, A. Jindal, Y. Kargin, V. Setty et al., Hadoop++: making a yellow elephant run like a cheetah (without it even noticing), Proceedings of the VLDB Endowment, pp.515-529, 2010.

C. M. Fiduccia and R. M. Mattheyses, A linear-time heuristic for improving network partitions, Proceedings of the 19th Design Automation Conference, pp.175-181, 1982.

I. Foster, What is the Grid? A three point checklist, 2002.

I. Foster, C. Kesselman, and S. Tuecke, The anatomy of the Grid: enabling scalable virtual organizations, International Journal of Supercomputer Applications, vol.15, 2001.

F. Pellegrini, PT-Scotch and libPTScotch 6.0 User's Guide, 2012.

E. Friedman, P. Pawlowski, and J. Cieslewicz, SQL/MapReduce: a practical approach to self-describing, polymorphic, and parallelizable user-defined functions, Proceedings of the VLDB Endowment, pp.1402-1413, 2009.

S. Ghandeharizadeh and D. J. Dewitt, Hybrid-range partitioning strategy: a new declustering strategy for multiprocessor database machines, Proceedings of the sixteenth international conference on Very large databases, pp.481-492, 1990.

B. Gufler, N. Augsten, A. Reiser, and A. Kemper, Load Balancing in MapReduce Based on Scalable Cardinality Estimates, 2012 IEEE 28th International Conference on Data Engineering, pp.522-533, 2012.
DOI : 10.1109/ICDE.2012.58

B. Gufler, N. Augsten, A. Reiser, and A. Kemper, Handling data skew in MapReduce, Proceedings of the 1st International Conference on Cloud Computing and Services, pp.574-583, 2011.

M. Hammoud, M. S. Rehman, and M. F. Sakr, Center-of-Gravity Reduce Task Scheduling to Lower MapReduce Network Traffic, 2012 IEEE Fifth International Conference on Cloud Computing, pp.49-58, 2012.
DOI : 10.1109/CLOUD.2012.92

Y. He, R. Lee, Y. Huai, Z. Shao, N. Jain et al., RCFile: A fast and space-efficient data placement structure in MapReduce-based warehouse systems, 2011 IEEE 27th International Conference on Data Engineering, pp.1199-1208, 2011.
DOI : 10.1109/ICDE.2011.5767933

B. Hendrickson and T. G. Kolda, Graph partitioning models for parallel computing, Parallel Computing, vol.26, issue.12, pp.1519-1534, 2000.
DOI : 10.1016/S0167-8191(00)00048-X

B. Hendrickson and R. Leland, A multilevel algorithm for partitioning graphs, Proceedings of the 1995 ACM/IEEE conference on Supercomputing (CDROM), Supercomputing '95, 1995.
DOI : 10.1145/224170.224228

M. Herlihy and J. Wing, Linearizability: a correctness condition for concurrent objects, ACM Transactions on Programming Languages and Systems, vol.12, issue.3, pp.463-492, 1990.
DOI : 10.1145/78969.78972

P. Hunt, M. Konar, F. Junqueira, and B. Reed, ZooKeeper: wait-free coordination for internet-scale systems, Proceedings of the 2010 USENIX conference on USENIX annual technical conference, p.11, 2010.

S. Ibrahim, H. Jin, L. Lu, S. Wu, B. He et al., LEEN: Locality/Fairness-Aware Key Partitioning for MapReduce in the Cloud, 2010 IEEE Second International Conference on Cloud Computing Technology and Science, pp.17-24, 2010.
DOI : 10.1109/CloudCom.2010.25

E. Jeanvoine, L. Sarzyniec, and L. Nussbaum, Kadeploy3: efficient and scalable operating system provisioning, pp.38-44, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00909111

D. Jiang, B. C. Ooi, L. Shi, and S. Wu, The performance of MapReduce, Proceedings of the VLDB Endowment, pp.472-483, 2010.
DOI : 10.14778/1920841.1920903

A. Jindal, J. A. Ruiz, and J. Dittrich, Trojan data layouts, Proceedings of the 2nd ACM Symposium on Cloud Computing, SOCC '11, 2011.
DOI : 10.1145/2038916.2038937

R. Kallman, H. Kimura, J. Natkins, A. Pavlo, A. Rasin et al., H-Store: a high-performance, distributed main memory transaction processing system, Proceedings of the VLDB Endowment, pp.1496-1499, 2008.
URL : https://hal.archives-ouvertes.fr/hal-01278543

D. Karger, E. Lehman, T. Leighton, R. Panigrahy, M. Levine et al., Consistent hashing and random trees, Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, STOC '97, pp.654-663, 1997.
DOI : 10.1145/258533.258660

G. Karypis, METIS: a software package for partitioning unstructured graphs, partitioning meshes, and computing fill-reducing orderings of sparse matrices. Version 5.1.0, 2013.

B. W. Kernighan and S. Lin, An Efficient Heuristic Procedure for Partitioning Graphs, The Bell System Technical Journal, pp.291-307, 1970.
DOI : 10.1002/j.1538-7305.1970.tb01770.x

W. Kohler and K. Steiglitz, Characterization and Theoretical Comparison of Branch-and-Bound Algorithms for Permutation Problems, Journal of the ACM, vol.21, issue.1, pp.140-156, 1974.
DOI : 10.1145/321796.321808

M. Koyutürk and C. Aykanat, Iterative-improvement-based declustering heuristics for multi-disk databases, Information Systems, vol.30, issue.1, pp.47-70, 2005.
DOI : 10.1016/j.is.2003.08.003

A. Lakshman and P. Malik, Cassandra: a decentralized structured storage system, ACM SIGOPS Operating Systems Review, vol.44, issue.2, pp.35-40, 2010.
DOI : 10.1145/1773912.1773922

C. A. Lee, A perspective on scientific cloud computing, Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, HPDC '10, pp.451-459, 2010.
DOI : 10.1145/1851476.1851542

Y. Li and M. Mascagni, Improving performance via computational replication on a large-scale computational grid, Proceedings of the 3rd International Symposium on Cluster Computing and the Grid, 2003.

J. Lin, The curse of Zipf and limits to parallelization: a look at the stragglers problem in MapReduce, Proceedings of the 7th Workshop on Large-Scale Distributed Systems for Information Retrieval, 2009.

D. R. Liu and S. Shekhar, Partitioning similarity graphs: A framework for declustering problems, Information Systems, vol.21, issue.6, pp.475-496, 1996.
DOI : 10.1016/0306-4379(96)00024-5

R. V. Nehme and N. Bruno, Automated partitioning design in parallel database systems, Proceedings of the 2011 international conference on Management of data, SIGMOD '11, pp.1137-1148, 2011.
DOI : 10.1145/1989323.1989444

D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman et al., The Eucalyptus Open-Source Cloud-Computing System, 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, pp.124-131, 2009.
DOI : 10.1109/CCGRID.2009.93

C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins, Pig Latin: a not-so-foreign language for data processing, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, SIGMOD '08, pp.1099-1110, 2008.
DOI : 10.1145/1376616.1376726

M. T. Özsu and P. Valduriez, Principles of Distributed Database Systems, 3rd ed., 2011.

B. Palanisamy, A. Singh, L. Liu, and B. Jain, Purlieus: locality-aware resource allocation for MapReduce in a cloud, Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, 2011.

S. Papadomanolakis and A. Ailamaki, AutoPart: automating schema design for large scientific databases using data partitioning, Proceedings of the 16th International Conference on Scientific and Statistical Database Management, pp.383-392, 2004.
DOI : 10.1109/SSDM.2004.1311234

A. Pavlo, C. Curino, and S. B. Zdonik, Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems, Proceedings of the 2012 international conference on Management of Data, SIGMOD '12, pp.61-72, 2012.
DOI : 10.1145/2213836.2213844

A. Pavlo, E. Paulson, A. Rasin, D. J. Abadi, D. J. Dewitt et al., A comparison of approaches to large-scale data analysis, Proceedings of the 35th SIGMOD international conference on Management of data, SIGMOD '09, pp.165-178, 2009.
DOI : 10.1145/1559845.1559865

R. Pike, S. Dorward, R. Griesemer, and S. Quinlan, Interpreting the Data: Parallel Analysis with Sawzall, Scientific Programming, pp.277-298, 2005.
DOI : 10.1155/2005/962135

S. Ramakrishnan, G. Swart, and A. Urmanov, Balancing reducer skew in MapReduce workloads using progressive sampling, Proceedings of the Third ACM Symposium on Cloud Computing, SoCC '12, 2012.
DOI : 10.1145/2391229.2391245

K. Ranganathan, A. Iamnitchi, and I. Foster, Improving Data Availability through Dynamic Model-Driven Replication in Large Peer-to-Peer Communities, 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID'02), p.376, 2002.
DOI : 10.1109/CCGRID.2002.1017164

J. Rao, C. Zhang, N. Megiddo, and G. M. Lohman, Automating physical database design in a parallel database, Proceedings of the 2002 ACM SIGMOD international conference on Management of data, SIGMOD '02, pp.558-569, 2002.
DOI : 10.1145/564691.564757

L. F. Sarmenta, Sabotage-tolerance mechanisms for volunteer computing systems, Proceedings of the First IEEE/ACM International Symposium on Cluster Computing and the Grid, pp.337-346, 2001.

P. Schwan, Lustre: building a file system for 1,000-node clusters, Linux Symposium, p.380, 2003.

S. Seo, I. Jang, K. Woo, I. Kim, J. Kim et al., HPMR: Prefetching and pre-shuffling in shared MapReduce computation environment, 2009 IEEE International Conference on Cluster Computing and Workshops, pp.1-8, 2009.
DOI : 10.1109/CLUSTR.2009.5289171

D. Silva, W. Cirne, and F. Brasileiro, Trading Cycles for Information: Using Replication to Schedule Bag-of-Tasks Applications on Computational Grids, Euro-Par 2003 Parallel Processing, pp.169-180, 2003.
DOI : 10.1007/978-3-540-45209-6_26

M. Stonebraker, D. Abadi, D. Dewitt, S. Madden, E. Paulson et al., MapReduce and parallel DBMSs: friends or foes?, Communications of the ACM, vol.53, issue.1, pp.64-71, 2010.
DOI : 10.1145/1629175.1629197

X. Su and G. Swart, Oracle in-database hadoop, Proceedings of the 2012 international conference on Management of Data, SIGMOD '12, pp.779-790, 2012.
DOI : 10.1145/2213836.2213955

Ü. V. Çatalyürek and C. Aykanat, PaToH: partitioning tool for hypergraphs, 2011.

G. Valentin, M. Zuliani, D. Zilio, G. Lohman, and A. Skelley, DB2 advisor: an optimizer smart enough to recommend its own indexes, Proceedings of 16th International Conference on Data Engineering, 2000.
DOI : 10.1109/ICDE.2000.839397

R. Vernica, A. Balmin, K. S. Beyer, and V. Ercegovac, Adaptive MapReduce using situation-aware mappers, Proceedings of the 15th International Conference on Extending Database Technology, EDBT '12, pp.420-431, 2012.
DOI : 10.1145/2247596.2247646

W. Vogels, Eventually consistent, Communications of the ACM, vol.52, issue.1, pp.40-44, 2009.
DOI : 10.1145/1435417.1435432

C. Walton, A. Dale, and R. Jenevein, A taxonomy and performance model of data skew effects in parallel joins, Proceedings of the 17th International Conference on Very Large Data Bases, pp.537-548, 1991.

G. Wang, A. R. Butt, P. Pandey, and K. Gupta, A simulation approach to evaluating design decisions in MapReduce setups, 17th Annual Meeting of the IEEE/ACM International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems, pp.1-11, 2009.

Y. Xu, P. Zou, W. Qu, Z. Li, K. Li et al., Sampling-Based Partitioning in MapReduce for Skewed Data, 2012 Seventh ChinaGrid Annual Conference, pp.1-8, 2012.
DOI : 10.1109/ChinaGrid.2012.18

J. Zhou, N. Bruno, M. Wu, P. Larson, R. Chaiken et al., SCOPE: parallel databases meet MapReduce, The VLDB Journal, vol.21, issue.5, pp.611-636, 2012.
DOI : 10.1007/s00778-012-0280-z