III.2 CUDA language specifics and integration into the parser
III.5.1 Optimization levers
IV.1 The problem: managing a multi-GPU compute node
IV.2 Control of ...
IV.2.1 Description of the target problem
IV.2.4 The two implementations
GPU - 2 processes: speedup of 1.51; OpenMP+cudaHostAlloc, 4 GPU - 4 threads
Extending OpenMP to Survive the Heterogeneous Multi-Core Era, International Journal of Parallel Programming, vol.38, issue.5-6, pp.440-459, 2010. ,
DOI : 10.1007/s10766-010-0135-4
PIPS : a Workbench for Program Parallelization and Optimization, European Parallel Tool Meeting (EPTM), 1996. ,
The Polyhedral Model Is More Widely Applicable Than You Think, Compiler Construction, pp.283-303, 2010. ,
DOI : 10.1007/978-3-642-11970-5_16
URL : https://hal.archives-ouvertes.fr/inria-00551087
OpenMP for accelerators, OpenMP in the Petascale Era, pp.108-121, 2011. ,
Programming languages for distributed computing systems, ACM Computing Surveys, vol.21, issue.3, pp.261-322, 1989. ,
DOI : 10.1145/72551.72552
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.145.7873
Using OpenMP : Portable Shared Memory Parallel Programming, 2007. ,
CSX700 Floating Point Processor Datasheet, 2008. ,
Enabling Low-Overhead Hybrid MPI/OpenMP Parallelism with MPC, in Mitsuhisa Sato et al. (eds.), Beyond Loop Level Parallelism in OpenMP: Accelerators, Tasking and More, Lecture Notes in Computer Science, vol.6132, pp.1-14, 2010. ,
DOI : 10.1007/978-3-642-13217-9_1
HMPP : A Hybrid Multi-core Parallel Programming Environment, Proceedings of GPGPU, First Workshop on General Purpose Processing on Graphics Processing Units, 2007. ,
LINPACK Benchmark, pp.803-820, 2003. ,
DOI : 10.1007/978-0-387-09766-4_155
A portable C compiler for OpenMP V.2.0, EWOMP 2003, 2003. ,
LINPACK : Users' Guide. Number 8, Society for Industrial Mathematics, 1979. ,
A survey of parallel computer architectures, Computer, vol.23, issue.2, pp.5-16, 1990. ,
The Green500 List: Encouraging Sustainable Supercomputing, Computer, vol.40, issue.12, pp.50-55, 2007. ,
DOI : 10.1109/MC.2007.445
Very high-speed computing systems, Proceedings of the IEEE, vol.54, issue.12, pp.1901-1909, 1966. ,
Some computer organizations and their effectiveness, IEEE Transactions on Computers, vol.C-21, issue.9, pp.948-960, 1972. ,
Parallel architectures, ACM Computing Surveys, vol.28, issue.1, pp.67-70, 1996. ,
The Green500 List: Year one, 2009 IEEE International Symposium on Parallel & Distributed Processing, pp.1-7, 2009. ,
DOI : 10.1109/IPDPS.2009.5160978
Overview and C++ AMP approach, 2011. ,
Completing an MIMD multiprocessor taxonomy, ACM SIGARCH Computer Architecture News, vol.16, issue.3, pp.44-47, 1988. ,
A free OpenMP compiler and run-time library infrastructure for research on shared memory parallel computing, Proceedings of the 16th IASTED International Conference on Parallel and Distributed Computing and Systems, pp.354-361, 2004. ,
Introduction to the Cell multiprocessor, IBM Journal of Research and Development, vol.49, issue.4/5, pp.589-604, 2005. ,
Programming Massively Parallel Processors - A Hands-on Approach ,
Chris Lattner and Vikram Adve. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation, Proceedings of the 2004 International Symposium on Code Generation and Optimization (CGO'04), 2004. ,
OpenMP to GPGPU : a compiler framework for automatic translation and optimization, PPoPP '09 : Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, pp.101-110, 2009. ,
A ROSE-Based OpenMP 3.0 Research Compiler Supporting Multiple Runtime Libraries, in Mitsuhisa Sato et al. (eds.), Beyond Loop Level Parallelism in OpenMP: Accelerators, Tasking and More, Lecture Notes in Computer Science, vol.6132, pp.15-28, 2010. ,
DOI : 10.1007/978-3-642-13217-9_2
Real-time robot motion planning using rasterizing computer graphics hardware, 1990. ,
Intel® many integrated core (MIC) architecture, Proceedings of the 2012 workshop on High-Performance Computing for Astronomy Date, Astro-HPC '12, pp.5-6, 2012. ,
DOI : 10.1145/2286976.2286979
Cramming More Components Onto Integrated Circuits, Electronics, 1965; reprinted in Proceedings of the IEEE, vol.86, issue.1, 1998. ,
DOI : 10.1109/JPROC.1998.658762
Progress in digital integrated electronics, Electron Devices Meeting, pp.11-13, 1975. ,
Introducing the Graph 500, 2010. ,
The GPU Computing Era, IEEE Micro, vol.30, issue.2, pp.56-69, 2010. ,
DOI : 10.1109/MM.2010.41
Source-to-Source Code Translator: OpenMP C to CUDA, 2011 IEEE International Conference on High Performance Computing and Communications, pp.512-519, 2011. ,
DOI : 10.1109/HPCC.2011.73
MultiGPU computing using MPI or OpenMP, Proceedings of the 2010 IEEE 6th International Conference on Intelligent Computer Communication and Processing, pp.347-354, 2010. ,
DOI : 10.1109/ICCP.2010.5606414
CUBLAS Library Version 5.0. NVIDIA Corporation ,
CUDA Best Practices Guide Version 5.0. NVIDIA Corporation ,
CUDA Programming Guide Version 5.0. NVIDIA Corporation ,
Fermi Architecture Whitepaper. NVIDIA Corporation ,
Combined Iterative and Model-driven Optimization in an Automatic Parallelization Framework, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, 2010. ,
DOI : 10.1109/SC.2010.14
URL : https://hal.archives-ouvertes.fr/inria-00551067
PGI Fortran & C Accelerator Programming Model, 2008. ,
Parallel Programming in C with MPI and OpenMP, 2003. ,
HMPP Workbench -Build manycore applications, GPU Accelerators and Hybrid Computing Workshop, 2009. ,
Larrabee : a many-core x86 architecture for visual computing ,
OpenMP compiler for a software distributed shared memory system SCASH, 2000. ,
Auto-tuning of level 1 and level 2 BLAS for GPUs, Concurrency and Computation : Practice and Experience, 2012. ,
Models and languages for parallel computation, ACM Computing Surveys, vol.30, issue.2, pp.123-169, 1998. ,
DOI : 10.1145/280277.280278
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.28.1801
Better performance at lower occupancy, Proceedings of the GPU Technology Conference, 2010. ,
Compilers and More : A GPU and Accelerator Programming Model, 2008. ,
The duality of memory and communication in the implementation of a multiprocessor operating system, ACM SIGOPS Operating Systems Review, vol.21, issue.5, pp.63-76, 1987. ,