K. Agrawal, Y. He, and C. E. Leiserson, Adaptive work stealing with parallelism feedback, PPoPP '07 : Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming, pp.112-120, 2007.

P. An, A. Jula, S. Rus, S. Saunders, T. Smith et al., STAPL: An Adaptive, Generic Parallel C++ Library, Languages and Compilers for Parallel Computing, pp.195-210, 2003.
DOI : 10.1007/3-540-35767-X_13

N. S. Arora, R. D. Blumofe, and C. G. Plaxton, Thread scheduling for multiprogrammed multiprocessors, Theory Comput. Syst, vol.34, issue.2, pp.115-144, 2001.

M. H. Austern, R. A. Towle, and A. A. Stepanov, Range partition adaptors: a mechanism for parallelizing STL, SIGAPP Appl. Comput. Rev, vol.4, issue.1, pp.5-6, 1996.

D. A. Bader, V. Kanade, and K. Madduri, SWARM: A Parallel Programming Framework for Multicore Processors, 2007 IEEE International Parallel and Distributed Processing Symposium, pp.1-8, 2007.
DOI : 10.1109/IPDPS.2007.370681

D. Baertschiger, Multi-processing template library, 2006.

O. Beaumont, E. M. Daoudi, N. Maillard, P. Manneback, and J. Roch, Tradeoff to minimize extra-computations and stopping criterion tests for parallel iterative schemes, 3rd International Workshop on Parallel Matrix Algorithms and Applications (PMAA04), CIRM, 2004.
URL : https://hal.archives-ouvertes.fr/hal-00777293

M. A. Bender and M. O. Rabin, Online scheduling of parallel programs on heterogeneous systems with applications to Cilk, Theory Comput. Syst, vol.35, issue.3, pp.289-304, 2002.

P. Berenbrink, T. Friedetzky, and L. A. Goldberg, The Natural Work-Stealing Algorithm is Stable, SIAM Journal on Computing, vol.32, issue.5, pp.1260-1279, 2003.
DOI : 10.1137/S0097539701399551

J. Bernard, J. Roch, S. De Paoli, and M. Santana, Adaptive Encoding of Multimedia Streams on MPSoC, ICCS'06 International Conference on Computational Science (4), workshop Real-Time Systems and Adaptive Applications, pp.999-1006, 2006.
DOI : 10.1007/11758549_133

J. Bernard, J. Roch, and D. Traore, Processor-Oblivious Parallel Stream Computations, 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2008), pp.72-76, 2008.
DOI : 10.1109/PDP.2008.57
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.562.8973

R. S. Bird, An Introduction to the Theory of Lists, Proceedings of the NATO Advanced Study Institute on Logic of programming and calculi of discrete design, pp.5-42, 1987.
DOI : 10.1007/978-3-642-87374-4_1

H. Bischof, S. Gorlatch, and E. Kitzelmann, Cost optimality and predictability of parallel programming with skeletons, Parallel Processing Letters, pp.575-587, 2003.

H. Bischof, S. Gorlatch, and R. Leshchinskiy, Generic Parallel Programming Using C++ Templates and Skeletons, Springer-Verlag LNCS 3016 Domain-Specific Program Generation, pp.107-126, 2004.
DOI : 10.1007/978-3-540-25935-0_7

G. E. Blelloch, Scans as primitive parallel operations, IEEE Transactions on Computers, vol.38, issue.11, pp.1526-1538, 1989.
DOI : 10.1109/12.42122

G. E. Blelloch, C. E. Leiserson, B. M. Maggs, C. G. Plaxton, S. J. Smith et al., An Experimental Analysis of Parallel Sorting Algorithms, Theory of Computing Systems, vol.31, pp.135-167, 1998.
DOI : 10.1007/s002240000083

G. E. Blelloch, Prefix sums and their applications, 1990.

R. D. Blumofe and C. E. Leiserson, Space-Efficient Scheduling of Multithreaded Computations, SIAM Journal on Computing, vol.27, issue.1, pp.202-229, 1998.
DOI : 10.1137/S0097539793259471

C. Cérin, J. Dubacq, and J. Roch, Methods for partitioning data and to improve parallel execution time for sorting on heterogeneous clusters, LNCS 3947 Springer-Verlag International conference on Grid and Pervasive Computing, pp.175-186, 2006.

C. Cérin, H. Fkaier, and M. Jemni, Accessing hardware performance counters in order to measure the influence of cache on the performance of integer sorting, Proceedings International Parallel and Distributed Processing Symposium, pp.274-275, 2003.
DOI : 10.1109/IPDPS.2003.1213491

V.-D. Cung, V. Danjean, J.-G. Dumas, T. Gautier, G. Huard et al., Adaptive and hybrid algorithms: classification and illustration on triangular system solving, Transgressive Computing TC'2006, pp.131-148, 2006.

V.-D. Cung, J.-G. Dumas, T. Gautier, G. Huard, B. Raffin et al., Adaptive algorithms: theory and application, SIAM PP'06 SIAM Parallel Processing 2006, Mini-Symposium MS1: Adaptive algorithms for scientific computing, pp.49-50, 2006.

V. Danjean, R. Gillard, S. Guelton, J. Roch, and T. Roche, Adaptive loops with KAAPI on multicore and grid, Proceedings of the 2007 international workshop on Parallel symbolic computation, PASCO '07, 2007.
DOI : 10.1145/1278177.1278185

E. Daoudi, T. Gautier, A. Kerfali, R. Revire, and J. Roch, Algorithmes parallèles à grain adaptatif et applications, Techniques et sciences informatiques, vol.24, issue.5, pp.1-20, 2005.
DOI : 10.3166/tsi.24.505-524

G. Dimitrakopoulos, High-speed parallel-prefix VLSI Ling adders, IEEE Transactions on Computers, vol.54, issue.2, pp.225-231, 2005.
DOI : 10.1109/TC.2005.26

J. Dongarra and V. Eijkhout, Self-Adapting Numerical Software for Next Generation Applications, International Journal of High Performance Computing Applications, vol.17, issue.2, 2002.
DOI : 10.1177/1094342003017002002

M. Doreille, F. Galilée, and J. Roch, Construction dynamique du graphe de flot de données en Athapascan, RenPar'9, 1997.

J. Dumas, C. Pernet, and J. Roch, Adaptive triangular system solving, Dagstuhl Seminar Proceedings: Challenges in Symbolic Computation Software, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00104042

A. C. Dusseau, D. E. Culler, K. E. Schauser, and R. P. Martin, Fast parallel sorting under LogP: experience with the CM-5, IEEE Transactions on Parallel and Distributed Systems, vol.7, issue.8, pp.791-805, 1996.
DOI : 10.1109/71.532111

O. Egecioglu and C. K. Koc, Parallel prefix computation with few processors, Computers & Mathematics with Applications, vol.24, issue.4, pp.77-84, 1992.
DOI : 10.1016/0898-1221(92)90009-7

F. E. Fich, New bounds for parallel prefix circuits, Proceedings of the fifteenth annual ACM symposium on Theory of computing, STOC '83, pp.100-109, 1983.
DOI : 10.1145/800061.808738

M. J. Flynn, Some Computer Organizations and Their Effectiveness, IEEE Transactions on Computers, vol.21, issue.9, pp.948-960, 1972.
DOI : 10.1109/TC.1972.5009071

M. Frigo, C. E. Leiserson, and K. H. Randall, The implementation of the Cilk-5 multithreaded language, SIGPLAN Conf. PLDI, pp.212-223, 1998.

M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran, Cache-oblivious algorithms, Proceedings of the 40th IEEE Symposium on Foundations of Computer Science (FOCS 99), pp.285-297, 1999.

F. Galilée, J. Roch, G. Cavalheiro, and M. Doreille, Athapascan-1: On-line building data flow graph in a parallel language, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192), pp.88-95, 1998.
DOI : 10.1109/PACT.1998.727176

T. Gautier, J. Roch, and F. Wagner, Fine Grain Distributed Implementation of a Dataflow Language with Provable Performances, PAPP 2007 4th Int. Workshop on Practical Aspects of High-Level Parallel Programming, China, 2007.
DOI : 10.1007/978-3-540-72586-2_87

T. Gautier, X. Besseron, and L. Pigeon, KAAPI, Proceedings of the 2007 international workshop on Parallel symbolic computation, PASCO '07, pp.15-23, 2007.
DOI : 10.1145/1278177.1278182
URL : https://hal.archives-ouvertes.fr/hal-00647474

S. Gorlatch, Systematic efficient parallelization of scan and other list homomorphisms, Proceedings of the European Conference on Parallel Processing, Euro-Par'96, pp.401-408, 1996.
DOI : 10.1007/BFb0024729

A. Grama, G. Karypis, V. Kumar, and A. Gupta, Introduction to Parallel Computing, 2002.

P. Heidelberger, A. Norton, and J. T. Robinson, Parallel Quicksort using fetch-and-add, IEEE Transactions on Computers, vol.39, issue.1, pp.133-138, 1990.
DOI : 10.1109/12.46289

D. R. Helman and J. Jájá, Prefix computations on symmetric multiprocessors, Journal of Parallel and Distributed Computing, vol.61, issue.2, pp.265-278, 2001.

Z. Hu, H. Iwasaki, and M. Takeichi, An Accumulative Parallel Skeleton for All, European Symposium on Programming, pp.83-97, 2002.
DOI : 10.1007/3-540-45927-8_7

B. Huang and M. A. Langston, Practical in-place merging, Communications of the ACM, vol.31, issue.3, pp.348-352, 1988.
DOI : 10.1145/42392.42403

S. Jafar, T. Gautier, A. W. Krings, and J. Roch, A Checkpoint/Recovery Model for Heterogeneous Dataflow Computations Using Work-Stealing, 2005.
DOI : 10.1007/11549468_74
URL : https://hal.archives-ouvertes.fr/hal-00685314

S. Jafar, A. W. Krings, T. Gautier, and J. Roch, Theft-Induced Checkpointing for Reconfigurable Dataflow Applications, 2005 IEEE International Conference on Electro Information Technology, 2005.
DOI : 10.1109/EIT.2005.1626998
URL : https://hal.archives-ouvertes.fr/hal-00683887

S. Jafar, L. Pigeon, T. Gautier, and J. Roch, Self-Adaptation of Parallel Applications in Heterogeneous and Dynamic Architectures, 2006 2nd International Conference on Information & Communication Technologies, pp.3347-3352, 2006.
DOI : 10.1109/ICTTA.2006.1684954

J. Jájá, An introduction to parallel algorithms, 1992.

M. Jeon and D. Kim, Parallelizing Merge Sort onto Distributed Memory Parallel Computers, ISHPC '02 : Proceedings of the 4th International Symposium on High Performance Computing, pp.25-34, 2002.
DOI : 10.1007/3-540-47847-7_5
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.546.3471

E. Johnson and D. Gannon, Programming with the HPC++ parallel standard template library, 1997.

E. Johnson and D. Gannon, HPC++, Proceedings of the 11th international conference on Supercomputing, ICS '97, pp.124-131, 1997.
DOI : 10.1145/263580.263614

J. Buisson, Un modèle pour l'adaptation dynamique des programmes parallèles, RenPar'16/CFSE'4/SympAAA'2005/Journées Composants, 2005.

L. Kalampoukas, D. Nikolos, C. Efstathiou, H. T. Vergos, and J. Kalamatianos, High-speed parallel-prefix modulo 2^n-1 adders, IEEE Transactions on Computers, vol.49, issue.7, pp.673-680, 2000.
DOI : 10.1109/12.863036

C. P. Kruskal, L. Rudolph, and M. Snir, A complexity theory of efficient parallel algorithms, Theoretical Computer Science, vol.71, issue.1, pp.95-132, 1990.
DOI : 10.1016/0304-3975(90)90192-K

B. C. Kuszmaul, A segmented parallel-prefix VLSI circuit with small delays for small segments, SPAA '05 : Proceedings of the seventeenth annual ACM symposium on Parallelism in algorithms and architectures, pp.213-213, 2005.

R. E. Ladner and M. J. Fischer, Parallel prefix computation, J. ACM, vol.27, issue.4, pp.831-838, 1980.

A. LaMarca and R. E. Ladner, The Influence of Caches on the Performance of Sorting, SODA : ACM-SIAM Symposium on Discrete Algorithms (A Conference on Theoretical and Experimental Analysis of Discrete Algorithms), 1997.
DOI : 10.1006/jagm.1998.0985

Y. Lin and J. Hsiao, A new approach to constructing optimal parallel prefix circuits with small depth, Journal of Parallel and Distributed Computing, vol.64, issue.1, pp.97-107, 2004.
DOI : 10.1016/j.jpdc.2003.09.004

Y. Lin and C. Su, Faster optimal parallel prefix circuits: New algorithmic construction, Journal of Parallel and Distributed Computing, vol.65, issue.12, pp.1585-1595, 2005.
DOI : 10.1016/j.jpdc.2005.05.017

J. Liu, F. Lee, and K. Qian, A parallel prefix convex hull algorithm using MasPar, PDPTA '02 : Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, pp.1089-1095, 2002.

F. Mueller, A library implementation of POSIX threads under UNIX, Proceedings of the USENIX Conference, pp.29-41, 1993.

D. R. Musser, Introspective sorting and selection algorithms, Software: Practice and Experience, pp.983-993, 1997.

D. R. Musser, G. J. Derge, and A. Saini, STL tutorial and reference guide, second edition, 2001.

Y. Ngoko, Les poly-algorithmes pour une programmation efficace des problèmes numériques : exemple du produit des matrices, 2005.

V. Y. Pan and F. P. Preparata, Work-preserving speed-up of parallel matrix computations, SIAM J. Comput, vol.24, issue.3, pp.811-821, 1995.

S. Perarnau, Mesure, caractérisation et injection de charge à l'usage des machines parallèles, 2008.

J. Reinders, Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor Parallelism, 2007.

R. Revire, Ordonnancement de graphe dynamique de tâches sur architecture de grande taille. Régulation par dégénération séquentielle et distribuée, 2004.

J. R. Rice, The Algorithm Selection Problem, Advances in Computers, vol.15, pp.65-118, 1976.
DOI : 10.1016/S0065-2458(08)60520-3

J. Roch, Complexité parallèle et algorithmique PRAM, Algorithmes Parallèles : Analyse et Conception, pp.105-126, 1994.

J. Roch, Complexité parallèle, 1995. Available at http://wwwid.imag.fr/Laboratoire/Membres/Roch_Jean-Louis/perso_html/polycops/polycomplexite-par.pdf

J. Roch, Parallel efficient algorithms and their programming, 1997.

J. Roch, Ordonnancement de programmes parallèles sur grappes : théorie versus pratique, Actes du Congrès International ALA 2001, pp.131-144, 2001.

J. Roch and D. Traoré, Un algorithme adaptatif optimal pour le calcul parallèle des préfixes, CARI'2006, INRIA, editor, 2006.

J. Roch, D. Traoré, and J. Bernard, On-Line Adaptive Parallel Prefix Computation, pp.843-850, 2006.
DOI : 10.1007/11823285_88
URL : https://hal.archives-ouvertes.fr/hal-00689026

J. Roch and G. Villard, Parallel computer algebra (tutorial), Proceedings of the 1997 international symposium on Symbolic and algebraic computation, ISSAC '97, 1997.
DOI : 10.1145/258726.276957

P. Sanders and S. Winkel, Super Scalar Sample Sort, 12th Annual European Symposium on Algorithms, pp.14-17, 2004.
DOI : 10.1007/978-3-540-30140-0_69

P. Sanders and J. Träff, Parallel Prefix (Scan) Algorithms for MPI, Recent Advances in Parallel Virtual Machine and Message Passing Interface, pp.49-57, 2006.
DOI : 10.1007/11846802_15

H. Shan, J. P. Singh, L. Oliker, and R. Biswas, A Comparison of Three Programming Models for Adaptive Applications on the Origin2000, Supercomputing '00 : Proceedings of the 2000 ACM/IEEE conference on Supercomputing (CDROM), p.11, 2000.
DOI : 10.1006/jpdc.2001.1777

J. Singler and B. Kosnik, The GNU libstdc++ parallel mode, Proceedings of the 1st international workshop on Multicore software engineering, IWMSE '08, pp.15-22, 2008.
DOI : 10.1145/1370082.1370089

J. Singler, P. Sanders, and F. Putze, MCSTL: The Multi-core Standard Template Library, Springer-Verlag LNCS 4641, 2007.
DOI : 10.1007/978-3-540-74466-5_72

M. Snir, Depth-size trade-offs for parallel prefix computation, Journal of Algorithms, vol.7, issue.2, pp.185-201, 1986.
DOI : 10.1016/0196-6774(86)90003-9

L. Soares, C. Ménier, B. Raffin, and J. Roch, Work stealing for time-constrained octree exploration: Application to real-time 3D modeling, EGPGV, 2007.

B. Stroustrup, The C++ Programming Language, 2000.

N. Thomas, G. Tanase, O. Tkachyshyn, J. Perdue, N. M. Amato et al., A framework for adaptive algorithm selection in STAPL, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '05, pp.277-288, 2005.
DOI : 10.1145/1065944.1065981

D. Traoré, J. Roch, and C. Cérin, Algorithmes adaptatifs de tri parallèle, RenPar'18 / SympA, 2008.

D. Traoré, J. Roch, N. Maillard, T. Gautier, and J. Bernard, Deque-Free Work-Optimal Parallel STL Algorithms, 2008.
DOI : 10.1007/978-3-540-85451-7_95

P. Tsigas and Y. Zhang, A simple, fast parallel implementation of Quicksort and its performance evaluation on SUN Enterprise 10000, Eleventh Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2003. Proceedings., pp.372-381, 2003.
DOI : 10.1109/EMPDP.2003.1183613

H. T. Vergos, D. Nikolos, and C. Efstathiou, High speed parallel-prefix modulo 2^n+1 adders for diminished-one operands, Proceedings of the 15th IEEE Symposium on Computer Arithmetic (ARITH '01), pp.211-217, 2001.

D. W. Walker and J. J. Dongarra, MPI: a standard message passing interface, pp.56-68, 1996.

H. Wang, A. Nicolau, and K. S. Siu, The strict time lower bound and optimal schedules for parallel prefix with resource constraints, IEEE Transactions on Computers, vol.45, issue.11, pp.1257-1271, 1996.

R. C. Whaley, A. Petitet, and J. J. Dongarra, Automated empirical optimizations of software and the ATLAS project, Parallel Computing, vol.27, issue.1-2, pp.3-35, 2001.

H. Yu and L. Rauchwerger, An Adaptive Algorithm Selection Framework for Reduction Parallelization, IEEE Transactions on Parallel and Distributed Systems, vol.17, issue.10, pp.1084-1096, 2006.
DOI : 10.1109/TPDS.2006.131

H. Zhu, C. Cheng, and R. Graham, On the construction of zero-deficiency parallel prefix circuits with minimum depth, ACM Transactions on Design Automation of Electronic Systems, vol.11, issue.2, pp.387-409, 2006.
DOI : 10.1145/1142155.1142162