Adaptive work stealing with parallelism feedback, PPoPP '07 : Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming, pp.112-120, 2007. ,
STAPL: An Adaptive, Generic Parallel C++ Library, Languages and Compilers for Parallel Computing, pp.195-210, 2003. ,
DOI : 10.1007/3-540-35767-X_13
Thread scheduling for multiprogrammed multiprocessors, Theory Comput. Syst, vol.34, issue.2, pp.115-144, 2001. ,
Range partition adaptors : a mechanism for parallelizing stl, SIGAPP Appl. Comput. Rev, vol.4, issue.1, pp.5-6, 1996. ,
SWARM: A Parallel Programming Framework for Multicore Processors, 2007 IEEE International Parallel and Distributed Processing Symposium, pp.1-8, 2007. ,
DOI : 10.1109/IPDPS.2007.370681
Multi-processing template library, 2006. ,
Tradeoff to minimize extra-computations and stopping criterion tests for parallel iterative schemes, 3rd International Workshop on Parallel Matrix Algorithms and Applications (PMAA04), CIRM, 2004. ,
URL : https://hal.archives-ouvertes.fr/hal-00777293
Online scheduling of parallel programs on heterogeneous systems with applications to Cilk, Theory Comput. Syst, vol.35, issue.3, pp.289-304, 2002. ,
The Natural Work-Stealing Algorithm is Stable, SIAM Journal on Computing, vol.32, issue.5, pp.1260-1279, 2003. ,
DOI : 10.1137/S0097539701399551
Adaptive Encoding of Multimedia Streams on MPSoC, ICCS'06 International Conference on Computational Science (4), workshop Real-Time Systems and Adaptive Applications, pp.999-1006, 2006. ,
DOI : 10.1007/11758549_133
Processor-Oblivious Parallel Stream Computations, 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2008), pp.72-76, 2008. ,
DOI : 10.1109/PDP.2008.57
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.562.8973
An Introduction to the Theory of Lists, Proceedings of the NATO Advanced Study Institute on Logic of programming and calculi of discrete design, pp.5-42, 1987. ,
DOI : 10.1007/978-3-642-87374-4_1
Cost optimality and predictability of parallel programming with skeletons, Parallel Processing Letters, pp.575-587, 2003. ,
Generic Parallel Programming Using C++ Templates and Skeletons, Springer-Verlag LNCS 3016 Domain-Specific Program Generation, pp.107-126, 2004. ,
DOI : 10.1007/978-3-540-25935-0_7
Scans as primitive parallel operations, IEEE Transactions on Computers, vol.38, issue.11, pp.1526-1538, 1989. ,
DOI : 10.1109/12.42122
An Experimental Analysis of Parallel Sorting Algorithms, Theory of Computing Systems, vol.31, pp.135-167, 1998. ,
DOI : 10.1007/s002240000083
Prefix sums and their applications, 1990. ,
Space-Efficient Scheduling of Multithreaded Computations, SIAM Journal on Computing, vol.27, issue.1, pp.202-229, 1998. ,
DOI : 10.1137/S0097539793259471
Methods for partitioning data and to improve parallel execution time for sorting on heterogeneous clusters, LNCS 3947 Springer-Verlag International conference on Grid and Pervasive Computing, pp.175-186, 2006. ,
Accessing hardware performance counters in order to measure the influence of cache on the performance of integer sorting, Proceedings International Parallel and Distributed Processing Symposium, pp.274-275, 2003. ,
DOI : 10.1109/IPDPS.2003.1213491
Adaptive and hybrid algorithms : classification and illustration on triangular system solving, Transgressive Computing TC'2006, pp.131-148, 2006. ,
Adaptive algorithms : theory and application, SIAM PP'06 SIAM Parallel Processing 2006, Mini-Symposium MS1 : Adaptive algorithms for scientific computing, pp.49-50, 2006. ,
Adaptive loops with kaapi on multicore and grid, Proceedings of the 2007 international workshop on Parallel symbolic computation, PASCO '07, 2007. ,
DOI : 10.1145/1278177.1278185
Algorithmes parallèles à grain adaptatif et applications, Techniques et sciences informatiques, vol.24, issue.5, pp.1-20, 2005. ,
DOI : 10.3166/tsi.24.505-524
High-speed parallel-prefix VLSI Ling adders, IEEE Transactions on Computers, vol.54, issue.2, pp.225-231, 2005. ,
DOI : 10.1109/TC.2005.26
Self-Adapting Numerical Software for Next Generation Applications, International Journal of High Performance Computing Applications, vol.17, issue.2, 2002. ,
DOI : 10.1177/1094342003017002002
Construction dynamique du graphe de flot de données en Athapascan, RenPar'9, 1997. ,
Adaptive triangular system solving, Dagstuhl Seminar Proceedings - Challenges in Symbolic Computation Software, 2006. ,
URL : https://hal.archives-ouvertes.fr/hal-00104042
Fast parallel sorting under LogP: experience with the CM-5, IEEE Transactions on Parallel and Distributed Systems, vol.7, issue.8, pp.791-805, 1996. ,
DOI : 10.1109/71.532111
Parallel prefix computation with few processors, Computers & Mathematics with Applications, vol.24, issue.4, pp.77-84, 1992. ,
DOI : 10.1016/0898-1221(92)90009-7
New bounds for parallel prefix circuits, Proceedings of the fifteenth annual ACM symposium on Theory of computing, STOC '83, pp.100-109, 1983. ,
DOI : 10.1145/800061.808738
Some Computer Organizations and Their Effectiveness, IEEE Transactions on Computers, vol.21, issue.9, pp.948-960, 1972. ,
DOI : 10.1109/TC.1972.5009071
The implementation of the Cilk-5 multithreaded language, SIGPLAN Conf. PLDI, pp.212-223, 1998. ,
Cache-oblivious algorithms, Proceedings of the 40th IEEE Symposium on Foundations of Computer Science (FOCS 99), pp.285-297, 1999. ,
Athapascan-1: On-line building data flow graph in a parallel language, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192), pp.88-95, 1998. ,
DOI : 10.1109/PACT.1998.727176
Fine Grain Distributed Implementation of a Dataflow Language with Provable Performances, PAPP 2007 4th Int. Workshop on Practical Aspects of High-Level Parallel Programming, China, 2007. ,
DOI : 10.1007/978-3-540-72586-2_87
KAAPI, Proceedings of the 2007 international workshop on Parallel symbolic computation, PASCO '07, pp.15-23, 2007. ,
DOI : 10.1145/1278177.1278182
URL : https://hal.archives-ouvertes.fr/hal-00647474
Systematic efficient parallelization of scan and other list homomorphisms, Proceedings of the European Conference on Parallel Processing, Euro-Par'96, pp.401-408 ,
DOI : 10.1007/BFb0024729
Introduction to Parallel Computing, 2002. ,
Parallel Quicksort using fetch-and-add, IEEE Transactions on Computers, vol.39, issue.1, pp.133-138, 1990. ,
DOI : 10.1109/12.46289
Prefix computations on symmetric multiprocessors, Journal of Parallel and Distributed Computing, vol.61, issue.2, pp.265-278, 2001. ,
An Accumulative Parallel Skeleton for All, European Symposium on Programming, pp.83-97, 2002. ,
DOI : 10.1007/3-540-45927-8_7
Practical in-place merging, Communications of the ACM, vol.31, issue.3, pp.348-352, 1988. ,
DOI : 10.1145/42392.42403
A Checkpoint/Recovery Model for Heterogeneous Dataflow Computations Using Work-Stealing, 2005. ,
DOI : 10.1007/11549468_74
URL : https://hal.archives-ouvertes.fr/hal-00685314
Theft-Induced Checkpointing for Reconfigurable Dataflow Applications, 2005 IEEE International Conference on Electro Information Technology, 2005. ,
DOI : 10.1109/EIT.2005.1626998
URL : https://hal.archives-ouvertes.fr/hal-00683887
Self-Adaptation of Parallel Applications in Heterogeneous and Dynamic Architectures, 2006 2nd International Conference on Information & Communication Technologies, pp.3347-3352, 2006. ,
DOI : 10.1109/ICTTA.2006.1684954
An introduction to parallel algorithms, 1992. ,
Parallelizing Merge Sort onto Distributed Memory Parallel Computers, ISHPC '02 : Proceedings of the 4th International Symposium on High Performance Computing, pp.25-34, 2002. ,
DOI : 10.1007/3-540-47847-7_5
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.546.3471
Programming with the HPC++ parallel standard template library, 1997. ,
HPC++, Proceedings of the 11th international conference on Supercomputing, ICS '97, pp.124-131, 1997. ,
DOI : 10.1145/263580.263614
Un modèle pour l'adaptation dynamique des programmes parallèles, RenPar'16/CFSE'4/SympAAA'2005/Journées Composants, 2005. ,
High-speed parallel-prefix modulo 2^n-1 adders, IEEE Transactions on Computers, vol.49, issue.7, pp.673-680, 2000. ,
DOI : 10.1109/12.863036
A complexity theory of efficient parallel algorithms, Theoretical Computer Science, vol.71, issue.1, pp.95-132, 1990. ,
DOI : 10.1016/0304-3975(90)90192-K
A segmented parallel-prefix VLSI circuit with small delays for small segments, SPAA '05 : Proceedings of the seventeenth annual ACM symposium on Parallelism in algorithms and architectures, pp.213-213, 2005. ,
Parallel prefix computation, J. ACM, vol.27, issue.4, pp.831-838, 1980. ,
The Influence of Caches on the Performance of Sorting, SODA : ACM-SIAM Symposium on Discrete Algorithms (A Conference on Theoretical and Experimental Analysis of Discrete Algorithms), 1997. ,
DOI : 10.1006/jagm.1998.0985
A new approach to constructing optimal parallel prefix circuits with small depth, Journal of Parallel and Distributed Computing, vol.64, issue.1, pp.97-107, 2004. ,
DOI : 10.1016/j.jpdc.2003.09.004
Faster optimal parallel prefix circuits: New algorithmic construction, Journal of Parallel and Distributed Computing, vol.65, issue.12, pp.1585-1595, 2005. ,
DOI : 10.1016/j.jpdc.2005.05.017
A parallel prefix convex hull algorithm using MasPar, PDPTA '02 : Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, pp.1089-1095, 2002. ,
A library implementation of posix threads under unix, Proceedings of the USENIX Conference, pp.29-41, 1993. ,
Introspective sorting and selection algorithms, Software - Practice and Experience, pp.983-993, 1997. ,
STL tutorial and reference guide, second edition, 2001. ,
Les poly-algorithmes pour une programmation efficace des problèmes numériques : exemple du produit des matrices, 2005. ,
Work-preserving speed-up of parallel matrix computations, SIAM J. Comput, vol.24, issue.3, pp.811-821, 1995. ,
caractérisation et injection de charge à l'usage des machines parallèles, 2008. ,
Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor Parallelism, 2007. ,
Ordonnancement de graphe dynamique de tâches sur architecture de grande taille. Régulation par dégénération séquentielle et distribuée, 2004. ,
The Algorithm Selection Problem, Advances in Computers, vol.15, pp.65-118, 1976. ,
DOI : 10.1016/S0065-2458(08)60520-3
Complexité parallèle et algorithmique PRAM, Algorithmes Parallèles : Analyse et Conception, pp.105-126, 1994. ,
Complexité parallèle, 1995. ,
URL : http://wwwid.imag.fr/Laboratoire/Membres/Roch_Jean-Louis/perso_html/polycops/polycomplexite-par.pdf
Jean-Louis Roch, Parallel efficient algorithms and their programming, 1997. ,
Ordonnancement de programmes parallèles sur grappes : théorie versus pratique, Actes du Congrès International ALA 2001, pp.131-144, 2001. ,
Un algorithme adaptatif optimal pour le calcul parallèle des préfixes, INRIA, editor, 2006. ,
On-Line Adaptive Parallel Prefix Computation, pp.843-850, 2006. ,
DOI : 10.1007/11823285_88
URL : https://hal.archives-ouvertes.fr/hal-00689026
Parallel computer algebra (tutorial), Proceedings of the 1997 international symposium on Symbolic and algebraic computation, ISSAC '97, 1997. ,
DOI : 10.1145/258726.276957
Super Scalar Sample Sort, 12th Annual European Symposium on Algorithms, pp.14-17, 2004. ,
DOI : 10.1007/978-3-540-30140-0_69
Parallel Prefix (Scan) Algorithms for MPI, Recent Advances in Parallel Virtual Machine and Message Passing Interface, pp.49-57, 2006. ,
DOI : 10.1007/11846802_15
A Comparison of Three Programming Models for Adaptive Applications on the Origin2000, Supercomputing '00 : Proceedings of the 2000 ACM/IEEE conference on Supercomputing (CDROM), p.11, 2000. ,
DOI : 10.1006/jpdc.2001.1777
The GNU libstdc++ parallel mode, Proceedings of the 1st international workshop on Multicore software engineering, IWMSE '08, pp.15-22, 2008. ,
DOI : 10.1145/1370082.1370089
MCSTL: The Multi-core Standard Template Library, Springer-Verlag LNCS 4641, 2007. ,
DOI : 10.1007/978-3-540-74466-5_72
Depth-size trade-offs for parallel prefix computation, Journal of Algorithms, vol.7, issue.2, pp.185-201, 1986. ,
DOI : 10.1016/0196-6774(86)90003-9
Work stealing for time-constrained octree exploration : Application to real-time 3D modeling, EGPGV, 2007. ,
The C++ Programming Language, 2000. ,
A framework for adaptive algorithm selection in STAPL, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '05, pp.277-288, 2005. ,
DOI : 10.1145/1065944.1065981
Algorithmes adaptatifs de tri parallèle, RenPar'18 / SympA, 2008. ,
Deque-Free Work-Optimal Parallel STL Algorithms, 2008. ,
DOI : 10.1007/978-3-540-85451-7_95
A simple, fast parallel implementation of Quicksort and its performance evaluation on SUN Enterprise 10000, Eleventh Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2003. Proceedings., pp.372-381, 2003. ,
DOI : 10.1109/EMPDP.2003.1183613
High speed parallel-prefix modulo 2^n+1 adders for diminished-one operands, Proceedings of the 15th IEEE Symposium on Computer Arithmetic (ARITH '01), pp.211-217, 2001. ,
MPI : a standard message passing interface, pp.56-68, 1996. ,
The strict time lower bound and optimal schedules for parallel prefix with resource constraints, IEEE Transactions on Computers, vol.45, issue.11, pp.1257-1271, 1996. ,
Automated empirical optimizations of software and the ATLAS project, Parallel Computing, vol.27, issue.1-2, pp.3-35, 2001. ,
An Adaptive Algorithm Selection Framework for Reduction Parallelization, IEEE Transactions on Parallel and Distributed Systems, vol.17, issue.10, pp.1084-1096, 2006. ,
DOI : 10.1109/TPDS.2006.131
On the construction of zero-deficiency parallel prefix circuits with minimum depth, ACM Transactions on Design Automation of Electronic Systems, vol.11, issue.2, pp.387-409, 2006. ,
DOI : 10.1145/1142155.1142162