J. L. Abellán, J. Fernández, and M. E. Acacio, Glocks: efficient support for highly-contended locks in many-core CMPs, Proceedings of the 2011 IEEE International Parallel and Distributed Processing Symposium, IPDPS '11, pp.893-905, 2011.

A. Agarwal and M. Cherian, Adaptive backoff synchronization techniques, Proceedings of the 16th Annual International Symposium on Computer Architecture, ISCA '89, pp.396-406, 1989.

H. Akkan, M. Lang, and L. Ionkov, HPC runtime support for fast and power efficient locking and synchronization, Proceedings of the 2013 IEEE International Conference on Cluster Computing, CLUSTER '13, pp.1-7, 2013.

T. E. Anderson, The performance of spin lock alternatives for shared-memory multiprocessors, IEEE Transactions on Parallel and Distributed Systems, vol.1, issue.1, pp.6-16, 1990.
DOI : 10.1109/71.80120

G. R. Andrews, Concurrent Programming: Principles and Practice, 1991.

M. Auslander, D. Edelsohn, O. Krieger, B. Rosenburg, and R. Wisniewski, Enhancement to the MCS lock for increased functionality and improved programmability. U.S. patent application 10, p.745, 2003.

D. F. Bacon, R. Konuru, C. Murthy, and M. Serrano, Thin locks: featherweight synchronization for Java, Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation, PLDI '98, pp.258-268, 1998.

A. Baumann, P. Barham, P. Dagand, T. Harris, R. Isaacs et al., The multikernel, Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, SOSP '09, pp.29-44, 2009.
DOI : 10.1145/1629575.1629579

B. N. Bershad, T. E. Anderson, E. D. Lazowska, and H. M. Levy, Lightweight remote procedure call, ACM Transactions on Computer Systems, vol.8, issue.1, pp.37-55, 1990.
DOI : 10.1145/77648.77650
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.361.2695

B. N. Bershad, T. E. Anderson, E. D. Lazowska, and H. M. Levy, URPC: a toolkit for prototyping remote procedure calls, The Computer Journal, vol.39, issue.6, pp.525-540, 1996.

L. Boguslavsky, K. Harzallah, A. Kreinen, K. Sevcik, and A. Vainshtein, Optimal Strategies for Spinning and Blocking, Journal of Parallel and Distributed Computing, vol.21, issue.2, pp.246-254, 1994.
DOI : 10.1006/jpdc.1994.1056

S. Boyd-wickizer, H. Chen, R. Chen, Y. Mao, F. Kaashoek et al., Corey: an operating system for many cores, Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, OSDI '08, pp.43-57, 2008.

S. Boyd-wickizer, A. T. Clements, Y. Mao, A. Pesterev, M. F. Kaashoek et al., An analysis of Linux scalability to many cores, Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation, OSDI '10, 2010.

S. Boyd-wickizer, M. F. Kaashoek, R. Morris, and N. Zeldovich, Non-scalable locks are dangerous, Proceedings of the 13th Ottawa Linux Symposium, OLS '13, 2012.

B. B. Brandenburg, Improved analysis and evaluation of real-time semaphore protocols for P-FP scheduling, 2013 IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS), pp.141-152, 2013.
DOI : 10.1109/RTAS.2013.6531087

A. Brodsky, F. Ellen, and P. Woelfel, Fully-adaptive algorithms for long-lived renaming, Proceedings of the 20th International Conference on Distributed Computing, DISC '06, pp.413-427, 2006.
DOI : 10.1007/s00446-011-0137-5
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.84.6011

S. Browne, J. Dongarra, N. Garner, G. Ho, and P. Mucci, A Portable Programming Interface for Performance Evaluation on Modern Processors, International Journal of High Performance Computing Applications, vol.14, issue.3, pp.189-204, 2000.
DOI : 10.1177/109434200001400303

A. Burns and A. J. Wellings, Locking policies for multiprocessor Ada, ACM SIGAda Ada Letters, vol.33, issue.2, pp.59-65, 2013.
DOI : 10.1145/2552999.2553006

A. Burns and A. J. Wellings, A Schedulability Compatible Multiprocessor Resource Sharing Protocol -- MrsP, 2013 25th Euromicro Conference on Real-Time Systems, pp.282-291, 2013.
DOI : 10.1109/ECRTS.2013.37

I. Calciu, D. Dice, T. Harris, M. Herlihy, A. Kogan et al., Message Passing or Shared Memory: Evaluating the Delegation Abstraction for Multicores, Proceedings of the 17th International Conference on Principles of Distributed Systems, OPODIS '13, pp.83-97, 2013.
DOI : 10.1007/978-3-319-03850-6_7

J. S. Chase, H. M. Levy, M. J. Feeley, and E. D. Lazowska, Sharing and protection in a single-address-space operating system, ACM Transactions on Computer Systems, vol.12, issue.4, pp.271-307, 1994.
DOI : 10.1145/195792.195795

J. Cleary, O. Callanan, M. Purcell, and D. Gregg, Fast asymmetric thread synchronization, ACM Transactions on Architecture and Code Optimization, vol.9, issue.4, pp.1-2722, 2013.
DOI : 10.1145/2400682.2400686

P. Conway, N. Kalyanasundharam, G. Donley, K. Lepak, and B. Hughes, Cache Hierarchy and Memory Subsystem of the AMD Opteron Processor, IEEE Micro, vol.30, issue.2, pp.16-29, 2010.
DOI : 10.1109/MM.2010.31

T. S. Craig, Building FIFO and priority-queueing spin locks from atomic swap, 2003.

Danga Interactive, Memcached: distributed memory object caching system

M. Dashti, A. Fedorova, J. Funston, F. Gaud, R. Lachaize et al., Traffic management: a holistic approach to memory placement on NUMA systems, Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '13, pp.381-394, 2013.

F. David, G. Thomas, J. Lawall, and G. Muller, Continuously measuring critical section pressure with the free lunch profiler, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00957154

T. David, R. Guerraoui, and V. Trigonakis, Everything you always wanted to know about synchronization but were afraid to ask, Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, SOSP '13, pp.33-48, 2013.
DOI : 10.1145/2517349.2522714

J. Dean and S. Ghemawat, MapReduce, Communications of the ACM, vol.51, issue.1, pp.107-113, 2008.
DOI : 10.1145/1327452.1327492

D. Dice, Polite busy-waiting with WRPAUSE on SPARC, 2012.

D. Dice, V. J. Marathe, and N. Shavit, Flat-combining NUMA locks, Proceedings of the 23rd ACM symposium on Parallelism in algorithms and architectures, SPAA '11, pp.65-74, 2011.
DOI : 10.1145/1989493.1989502

D. Dice, V. J. Marathe, and N. Shavit, Lock cohorting: a general technique for designing NUMA locks, Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '12, pp.247-256, 2012.

E. W. Dijkstra, Cooperating Sequential Processes, 1965.
DOI : 10.1007/978-1-4757-3472-0_2

G. Drescher, T. Hönig, S. Maier, B. Oechslein, and W. Schröder-preikschat, A Scalability-Aware Kernel Executive for Many-Core Operating Systems, Proceedings of the 1st Workshop on Runtime and Operating Systems for the Many-core Era, WROSME '13, pp.1-10, 2013.
DOI : 10.1007/978-3-642-54420-0_80

J. Eastep, D. Wingate, M. D. Santambrogio, and A. Agarwal, Smartlocks, Proceeding of the 7th international conference on Autonomic computing, ICAC '10, pp.215-224, 2010.
DOI : 10.1145/1809049.1809079

P. Fatourou and N. D. Kallimanis, Sim: a highly-efficient wait-free universal construction

P. Fatourou and N. D. Kallimanis, Revisiting the combining synchronization technique, Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '12, pp.257-266

F. Fich, D. Hendler, and N. Shavit, On the inherent weakness of conditional synchronization primitives, Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing, PODC '04, pp.80-87, 2004.
DOI : 10.1145/1011767.1011780

B. Fitzpatrick, Distributed caching with memcached, Linux Journal, issue.124, p.5, 2004.

B. Ford and J. Lepreau, Evolving Mach 3.0 to a migrating thread model, Proceedings of the USENIX Winter 1994 Technical Conference, WTEC'94, pp.9-9, 1994.

M. Fowler, Refactoring: Improving the Design of Existing Code, 1999.
DOI : 10.1007/3-540-45672-4_31

L. Gidra, G. Thomas, J. Sopena, and M. Shapiro, A study of the scalability of stop-the-world garbage collectors on multicores, Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '13, pp.229-240, 2013.

T. Harris and K. Fraser, Language support for lightweight transactions, ACM SIGPLAN Notices, vol.38, issue.11, pp.388-402, 2003.
DOI : 10.1145/949343.949340

T. Harris, M. Herlihy, Y. Lev, Y. Liu, V. Luchangco et al., Towards whatever-scale abstractions for data-driven parallelism, Proceedings of the 1st International Workshop on Rack Scale Computing, p.14, 2014.

A. Hassan, R. Palmieri, and B. Ravindran, Remote Invalidation: Optimizing the Critical Path of Memory Transactions, 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014.
DOI : 10.1109/IPDPS.2014.30

B. He, W. N. Scherer III, and M. L. Scott, Time-published queue-based spin locks
DOI : 10.1007/11602569_6
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.113.8532

B. He, W. N. Scherer III, and M. L. Scott, Preemption Adaptivity in Time-Published Queue-Based Spin Locks, Proceedings of the 11th International Conference on High Performance Computing, HiPC'05, pp.7-18, 2005.
DOI : 10.1007/11602569_6

D. Hendler, I. Incze, N. Shavit, and M. Tzafrir, Flat combining and the synchronization-parallelism tradeoff, Proceedings of the Twenty-second Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA '10, pp.355-364, 2010.

M. Herlihy, V. Luchangco, and M. Moir, Obstruction-free synchronization: double-ended queues as an example, 23rd International Conference on Distributed Computing Systems, 2003. Proceedings., pp.522-529, 2003.
DOI : 10.1109/ICDCS.2003.1203503

M. Herlihy, V. Luchangco, M. Moir, and W. N. Scherer III, Software transactional memory for dynamic-sized data structures, Proceedings of the twenty-second annual symposium on Principles of distributed computing, PODC '03, pp.92-101, 2003.
DOI : 10.1145/872035.872048

M. Herlihy and J. E. Moss, Transactional memory: architectural support for lock-free data structures, Proceedings of the 20th Annual International Symposium on Computer Architecture, ISCA '93, pp.289-300, 1993.

M. Herlihy and N. Shavit, The art of multiprocessor programming, Proceedings of the twenty-fifth annual ACM symposium on Principles of distributed computing, PODC '06, 2008.
DOI : 10.1145/1146381.1146382

M. P. Herlihy, Impossibility and universality results for wait-free synchronization, Proceedings of the seventh annual ACM Symposium on Principles of distributed computing, PODC '88, pp.276-290, 1988.
DOI : 10.1145/62546.62593
URL : http://repository.cmu.edu/cgi/viewcontent.cgi?article=2796&context=compsci

C. A. Hoare, Monitors: an operating system structuring concept, Communications of the ACM, vol.17, issue.10, pp.549-557, 1974.
DOI : 10.1145/355620.361161
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.6394

F. R. Johnson, R. Stoica, A. Ailamaki, and T. C. Mowry, Decoupling contention management from scheduling, Proceedings of the Fifteenth Edition of ASPLOS on Architectural Support for Programming Languages and Operating Systems, ASPLOS XV, pp.117-128, 2010.
DOI : 10.1145/1735970.1736035
URL : http://infoscience.epfl.ch/record/142307

H. Kang and J. L. Wong, To hardware prefetch or not to prefetch?: a virtualized environment study and core binding approach, Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '13, pp.357-368, 2013.

T. Knight, An architecture for mostly functional languages, Proceedings of the 1986 ACM conference on LISP and functional programming, LFP '86, pp.105-112, 1986.
DOI : 10.1145/319838.319854

A. Kogan and E. Petrank, Wait-free queues with multiple enqueuers and dequeuers, Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming, pp.223-234, 2011.
DOI : 10.1145/2038037.1941585
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.222.6484

A. Kogan and E. Petrank, A methodology for creating fast wait-free data structures, Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '12, pp.141-150

D. Koufaty, D. Reddy, and S. Hahn, Bias scheduling in heterogeneous multi-core architectures, Proceedings of the 5th European conference on Computer systems, EuroSys '10, pp.125-138, 2010.
DOI : 10.1145/1755913.1755928

R. Lachaize, B. Lepers, and V. Quéma, Memprof: a memory profiler for NUMA multicore systems, Proceedings of the 2012 USENIX Conference on Annual Technical Conference, USENIX ATC '12, pp.5-5
URL : https://hal.archives-ouvertes.fr/hal-00945731

A. L. Leiner, System Specifications for the DYSEAC, Journal of the ACM, vol.1, issue.2, pp.57-81, 1954.
DOI : 10.1145/320772.320773

S. T. Leutenegger and D. Dias, A modeling study of the TPC-C benchmark, Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, SIGMOD '93, pp.22-31, 1993.

T. Liu and E. D. Berger, Sheriff: precise detection and automatic mitigation of false sharing, Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications, pp.3-18, 2011.

J. Lozi, Le Remote Core Lock (RCL) : une nouvelle technique de verrouillage pour les architectures multi-coeur, 2011.
URL : https://hal.archives-ouvertes.fr/hal-01302676

J. Lozi, PHP bug report #62064, 2012.

J. Lozi, F. David, G. Thomas, J. Lawall, and G. Muller, Remote Core Locking: migrating critical-section execution to improve the performance of multithreaded applications, Proceedings of the 2012 USENIX Annual Technical Conference, USENIX ATC '12, pp.65-76, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00991709

J. Lozi, G. Thomas, J. L. Lawall, and G. Muller, Efficient locking for multicore architectures, 2011.

V. Luchangco, D. Nussbaum, and N. Shavit, A Hierarchical CLH Queue Lock, Proceedings of the 12th International Conference on Parallel Processing, Euro-Par'06, pp.801-810, 2006.
DOI : 10.1007/11823285_84

P. Magnusson, A. Landin, and E. Hagersten, Queue locks on cache coherent multiprocessors, Proceedings of 8th International Parallel Processing Symposium, pp.165-171, 1994.
DOI : 10.1109/IPPS.1994.288305

J. M. Mellor-crummey and M. L. Scott, Algorithms for scalable synchronization on shared-memory multiprocessors, ACM Transactions on Computer Systems, vol.9, issue.1, pp.21-65, 1991.
DOI : 10.1145/103727.103729

J. M. Mellor-crummey and M. L. Scott, Synchronization without contention, Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS IV, pp.269-278, 1991.
DOI : 10.1145/106972.106999
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.17.2001

M. M. Michael and M. L. Scott, Simple, fast, and practical non-blocking and blocking concurrent queue algorithms, Proceedings of the fifteenth annual ACM symposium on Principles of distributed computing, PODC '96, pp.267-275, 1996.
DOI : 10.1145/248052.248106
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.3574

E. B. Nightingale, O. Hodson, R. Mcilroy, C. Hawblitzel, and G. Hunt, Helios, Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, SOSP '09, pp.221-234, 2009.
DOI : 10.1145/1629575.1629597

J. K. Ousterhout, Scheduling techniques for concurrent systems, Proceedings of the 3rd International Conference on Distributed Computing Systems, ICDCS'82, pp.22-30, 1982.

Y. Oyama, K. Taura, and A. Yonezawa, Executing parallel programs with synchronization bottlenecks efficiently, Proceedings of the International Workshop on Parallel and Distributed Computing for Symbolic and Irregular Applications, PDSIA'99

Y. Padioleau, J. Lawall, R. R. Hansen, and G. Muller, Documenting and automating collateral evolutions in Linux device drivers, Proceedings of the 3rd European Conference on Computer Systems 2008, Eurosys '08, pp.247-260, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00123142

L. Papadopoulos, I. Walulya, P. Tsigas, D. Soudris, and B. Barry, Evaluation of message passing synchronization algorithms in embedded systems, 2014 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS XIV), p.14, 2014.
DOI : 10.1109/SAMOS.2014.6893222

M. S. Papamarcos and J. H. Patel, A low-overhead coherence solution for multiprocessors with private cache memories, Proceedings of the 11th Annual International Symposium on Computer Architecture, ISCA '84, pp.348-354, 1984.

D. A. Patterson and J. L. Hennessy, Computer Organization and Design: the Hardware/Software Interface, 2007.

A. Pesterev, N. Zeldovich, and R. T. Morris, Locating cache performance bottlenecks using data profiling, Proceedings of the 5th European conference on Computer systems, EuroSys '10, pp.335-348, 2010.
DOI : 10.1145/1755913.1755947
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.163.9819

D. Petrović, T. Ropars, and A. Schiper, Leveraging hardware message passing for efficient thread synchronization, Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '14, pp.143-154, 2014.

K. K. Pusukuri, R. Gupta, and L. N. Bhuyan, Lock contention aware thread migrations, Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '14, pp.369-370, 2014.
DOI : 10.1145/2692916.2555273

Z. Radovic and E. Hagersten, Hierarchical backoff locks for nonuniform communication architectures, Proceedings of the 9th International Symposium on High-Performance Computer Architecture, HPCA '03, pp.241-253, 2003.

K. S. Ramesh, Design and development of MINIX distributed operating system, Proceedings of the 1988 ACM sixteenth annual conference on Computer science, CSC '88, pp.685-685, 1988.
DOI : 10.1145/322609.323152

C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis, Evaluating MapReduce for Multi-core and Multiprocessor Systems, 2007 IEEE 13th International Symposium on High Performance Computer Architecture, pp.13-24, 2007.
DOI : 10.1109/HPCA.2007.346181

B. R. Rau and J. A. Fisher, Instruction-level parallel processing: History, overview, and perspective, The Journal of Supercomputing, vol.34, issue.1, pp.9-50, 1993.
DOI : 10.1007/BF01205181
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.141.2892

S. Saha and J. Lozi, EHCtor: detecting resource-release omission faults in error-handling code for systems software, CFSE '9, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01302679

S. Saha, J. Lozi, G. Thomas, J. L. Lawall, and G. Muller, HECTOR, Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics, BCB '15, pp.1-12, 2013.
DOI : 10.1145/2808719.2808725
URL : https://hal.archives-ouvertes.fr/hal-00918079

M. L. Scott and W. N. Scherer, Scalable queue-based spin locks with timeout, Proceedings of the Eighth ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming, PPoPP '01, pp.44-52, 2001.
DOI : 10.1145/379539.379566
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.551.8520

C. Sharp and G. Morgan, Hugh: A Semantically Aware Universal Construction for Transactional Memory Systems, Proceedings of the 19th International Conference on Parallel Processing, Euro-Par '13, pp.470-481, 2013.
DOI : 10.1007/978-3-642-40047-6_48

N. Shavit and D. Touitou, Software transactional memory, Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing, PODC '95, pp.204-213, 1995.

J. P. Singh, W. Weber, and A. Gupta, SPLASH, ACM SIGARCH Computer Architecture News, vol.20, issue.1, pp.5-44, 1992.
DOI : 10.1145/130823.130824

S. Sridharan, B. Keck, R. Murphy, S. Chandra, and P. Kogge, Thread migration to improve synchronization performance, Proceedings of the 2nd Workshop on Operating System Interference in High Performance Applications, OSIHPA '06, 2006.

Stanford University, The Phoenix system for MapReduce programming

M. A. Suleman, O. Mutlu, M. K. Qureshi, and Y. N. Patt, Accelerating critical section execution with asymmetric multi-core architectures, Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XIV, pp.253-264, 2009.

J. Talbot, R. M. Yoo, and C. Kozyrakis, Phoenix++, Proceedings of the second international workshop on MapReduce and its applications, MapReduce '11, pp.9-16, 2011.
DOI : 10.1145/1996092.1996095

A. S. Tanenbaum, Distributed operating systems anno 1992. what have we learned so far? Distributed Systems Engineering, pp.3-10, 1993.
DOI : 10.1088/0967-1846/1/1/001

The PHP Group, PHP: hypertext preprocessor

D. M. Tullsen, S. J. Eggers, and H. M. Levy, Simultaneous multithreading: maximizing on-chip parallelism, Proceedings of the 22nd Annual International Symposium on Computer Architecture, ISCA '95, pp.392-403, 1995.
DOI : 10.1109/isca.1995.524578
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.129.1383

D. Vyukov, Combiner/aggregator synchronization primitive. https://software.intel.com/en-us/blogscombineraggregator-synchronization-primitive, 2013.

J. Wamhoff, S. Diestelhorst, C. Fetzer, P. Marlier, P. Felber et al., Selective core boosting: the return of the turbo button, 2013.

S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta, The SPLASH-2 programs: characterization and methodological considerations, Proceedings of the 22nd Annual International Symposium on Computer Architecture, ISCA '95, pp.24-36, 1995.

W. Xiong, S. Park, J. Zhang, Y. Zhou, and Z. Ma, Ad hoc synchronization considered harmful, Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, OSDI '10, pp.1-8, 2010.

R. M. Yoo, A. Romano, and C. Kozyrakis, Phoenix rebirth: Scalable MapReduce on a large-scale shared-memory system, 2009 IEEE International Symposium on Workload Characterization (IISWC), pp.198-207, 2009.
DOI : 10.1109/IISWC.2009.5306783

K. Yotov, K. Pingali, and P. Stodghill, Automatic measurement of memory hierarchy parameters, Proceedings of the 2005 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS '05, pp.181-192, 2005.

J. Zhou and B. Demsky, Memory management for many-core processors with software configurable locality policies, Proceedings of the 2012 International Symposium on Memory Management, ISMM '12, pp.3-14
