PFC: A program to convert Fortran to parallel form, 1982. ,
Automatic translation of FORTRAN programs to vector form, ACM Transactions on Programming Languages and Systems, vol.9, issue.4, pp.491-542, 1987. ,
DOI : 10.1145/29873.29875
Unified form language, ACM Transactions on Mathematical Software, vol.40, issue.2, p.9, 2014. ,
DOI : 10.1145/2566630
The landscape of parallel computing research: A view from Berkeley, 2006. ,
The potential of synergistic static, dynamic and speculative loop nest optimizations for automatic parallelization, Workshop on Parallel Execution of Sequential Programs on Multicore Architectures (PESPMA'10), 2010. ,
URL : https://hal.archives-ouvertes.fr/inria-00494305
PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming, 2015 International Conference on Parallel Architecture and Compilation (PACT), 2015. ,
DOI : 10.1109/PACT.2015.17
URL : https://hal.archives-ouvertes.fr/hal-01257236
PENCIL Language Specification, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01154812
Data dependence in ordinary programs, 1976. ,
Dependence Analysis for Supercomputing, 1988. ,
DOI : 10.1007/978-1-4684-6894-6
Automatic C-to-CUDA Code Generation for Affine Programs, Proceedings of the 19th Joint European Conference on Theory and Practice of Software, International Conference on Compiler Construction, pp.244-263, 2010. ,
DOI : 10.1007/978-3-642-11970-5_14
Code generation in the polyhedral model is easier than you think, Proceedings of the 13th International Conference on Parallel Architecture and Compilation Techniques (PACT 2004), pp.7-16, 2004. ,
DOI : 10.1109/PACT.2004.1342537
URL : https://hal.archives-ouvertes.fr/hal-00017260
Improving Data Locality by Chunking, Proceedings of the 12th International Conference on Compiler Construction, CC'03, pp.320-334, 2003. ,
DOI : 10.1007/3-540-36579-6_23
URL : https://hal.archives-ouvertes.fr/inria-00001055
VOBLA: A vehicle for optimized basic linear algebra, LCTES, pp.115-124, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01508181
The Polyhedral Model Is More Widely Applicable Than You Think, Proceedings of the 19th Joint European Conference on Theory and Practice of Software, International Conference on Compiler Construction, pp.283-303, 2010. ,
DOI : 10.1007/978-3-642-11970-5_16
URL : https://hal.archives-ouvertes.fr/inria-00551087
Analysis of programs for parallel processing, IEEE Transactions on Electronic Computers, vol.15, issue.5, pp.757-763, 1966. ,
Polaris: The next generation in parallelizing compilers, Proceedings of the Seventh Workshop on Languages and Compilers for Parallel Computing, pp.141-154, 1994. ,
Compiling affine loop nests for distributed-memory parallel architectures, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '13, pp.1-12, 2013. ,
DOI : 10.1145/2503210.2503289
A practical automatic polyhedral parallelizer and locality optimizer, Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation, pp.101-113, 2008. ,
A model for fusion and code motion in an automatic parallelizing compiler, Proceedings of the 19th international conference on Parallel architectures and compilation techniques, PACT '10, pp.343-352, 2010. ,
DOI : 10.1145/1854273.1854317
Register Allocation: What Does the NP-Completeness Proof of Chaitin et al. Really Prove? Or Revisiting Register Allocation: Why and How, LCPC'06, 2006. ,
DOI : 10.1007/978-3-540-72521-3_21
Improving register allocation for subscripted variables, Symp. on Programming Language Design and Implementation (PLDI'90), 1990. ,
A domain-specific approach to heterogeneous parallelism, PPoPP, pp.35-46, 2011. ,
Register allocation via coloring, Computer Languages, vol.6, issue.1, pp.47-57, 1981. ,
DOI : 10.1016/0096-0551(81)90048-5
Rodinia: A benchmark suite for heterogeneous computing, 2009 IEEE International Symposium on Workload Characterization (IISWC), pp.44-54, 2009. ,
DOI : 10.1109/IISWC.2009.5306797
Diderot: A parallel DSL for image analysis and visualization, PLDI, pp.111-120, 2012. ,
OpenCL math library, 2013. URL https ,
Processor virtualization and split compilation for heterogeneous multicore embedded systems, Proceedings of the 47th Design Automation Conference, DAC '10, 2010. ,
DOI : 10.1145/1837274.1837303
URL : https://hal.archives-ouvertes.fr/inria-00472274
Cray standard c/c++ reference manual, 2012. ,
The Scalable Heterogeneous Computing (SHOC) benchmark suite, Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, GPGPU '10, pp.63-74, 2010. ,
DOI : 10.1145/1735688.1735702
New Complexity Results on Array Contraction and Related Problems, Journal of VLSI signal processing systems for signal, image and video technology, vol.24, issue.3/4, pp.35-55, 2005. ,
DOI : 10.1007/s11265-005-4937-3
Scheduling and Automatic Parallelization, 2000. ,
DOI : 10.1007/978-1-4612-1362-8
URL : https://hal.archives-ouvertes.fr/hal-00856645
Experience in the automatic parallelization of four Perfect-Benchmark programs, 1992. ,
DOI : 10.1007/BFb0038658
Array expansion, Proceedings of the 2nd international conference on Supercomputing, pp.429-441, 1988. ,
DOI : 10.1145/2591635.2667159
URL : https://hal.archives-ouvertes.fr/hal-01099746
Dataflow analysis of array and scalar references, International Journal of Parallel Programming, vol.24, issue.4, pp.23-53, 1991. ,
DOI : 10.1007/BF01407931
Some efficient solutions to the affine scheduling problem. I. One-dimensional time, International Journal of Parallel Programming, vol.40, issue.6, pp.313-347, 1992. ,
DOI : 10.1007/BF01407835
Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time, International Journal of Parallel Programming, vol.2, issue.4, pp.389-420, 1992. ,
DOI : 10.1007/BF01379404
Scalable and Structured Scheduling, International Journal of Parallel Programming, vol.28, issue.6, pp.459-487, 2006. ,
DOI : 10.1007/s10766-006-0011-4
Generation of synchronous code for automatic parallelization of while loops, EURO-PAR'95 Parallel Processing, pp.313-326, 1995. ,
DOI : 10.1007/BFb0020474
Polly - performing polyhedral optimizations on a low-level intermediate representation, Parallel Processing Letters. ,
Hybrid Hexagonal/Classical Tiling for GPUs, Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO '14, pp.66-66, 2014. ,
DOI : 10.1145/2581122.2544160
URL : https://hal.archives-ouvertes.fr/hal-00911177
Polyhedral AST Generation Is More Than Scanning Polyhedra, ACM Transactions on Programming Languages and Systems, vol.37, issue.4, 2015. ,
DOI : 10.1145/2743016
URL : https://hal.archives-ouvertes.fr/hal-01257239
On privatization of variables for data-parallel execution, Proceedings 11th International Parallel Processing Symposium, pp.533-541, 1997. ,
DOI : 10.1109/IPPS.1997.580952
Register Allocation for Programs in SSA-Form, CC'06, pp.247-262, 2006. ,
DOI : 10.1007/11688839_20
SPEC CPU2000: measuring CPU performance in the New Millennium, Computer, vol.33, issue.7, pp.28-35, 2000. ,
DOI : 10.1109/2.869367
High-performance code generation for stencil computations on GPU architectures, Proceedings of the 26th ACM international conference on Supercomputing, ICS '12, pp.311-320, 2012. ,
DOI : 10.1145/2304576.2304619
Supernode partitioning, Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '88, pp.319-328, 1988. ,
DOI : 10.1145/73560.73588
Dynamic and Speculative Polyhedral Parallelization Using Compiler-Generated Skeletons, International Journal of Parallel Programming, vol.30, issue.3, pp.529-545, 2014. ,
DOI : 10.1007/s10766-013-0259-4
URL : https://hal.archives-ouvertes.fr/hal-00825738
Code generation for multiple mappings, Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation, p.332, 1995. ,
DOI : 10.1109/FMPC.1995.380437
Optimizing compilers for modern architectures: a dependence-based approach, 2002. ,
OpenCL 1.2 specification, 2011. ,
Optimal code motion: theory and practice, ACM Transactions on Programming Languages and Systems, vol.16, issue.4, pp.1117-1155, 1994. ,
DOI : 10.1145/183432.183443
Undecidability of static analysis, ACM Letters on Programming Languages and Systems, vol.1, issue.4, pp.323-337, 1992. ,
DOI : 10.1145/161494.161501
LLVM: A compilation framework for lifelong program analysis & transformation, International Symposium on Code Generation and Optimization (CGO 2004), pp.75-88, 2004. ,
DOI : 10.1109/CGO.2004.1281665
The OpenCL specification 2.0, 2015. ,
Automatic storage management for parallel programs, Parallel Computing, vol.24, issue.3-4, pp.649-671, 1998. ,
DOI : 10.1016/S0167-8191(98)00029-5
An industrial perspective: A pragmatic high end signal processing design environment at Thales, SAMOS, pp.52-57, 2003. ,
Array privatization for parallel execution of loops, Proceedings of the 6th international conference on Supercomputing, pp.313-322, 1992. ,
Maximizing parallelism and minimizing synchronization with affine partitions, Parallel Computing, vol.24, issue.3-4, p.445, 1998. ,
DOI : 10.1016/S0167-8191(98)00021-0
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.7731
An affine partitioning algorithm to maximize parallelism and minimize communication, Proceedings of the 13th international conference on Supercomputing, ICS '99, pp.228-237, 1999. ,
DOI : 10.1145/305138.305197
High performance Fortran, Parallel & Distributed Technology: Systems & Applications, pp.25-42, 1993. ,
Oolala: An object oriented analysis and design of numerical linear algebra, OOPSLA, pp.229-252, 2000. ,
Array-data flow analysis and its use in array privatization, Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '93, pp.2-15, 1993. ,
DOI : 10.1145/158511.158515
Improving compiler scalability: optimizing large programs at small price, Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp.143-152, 2015. ,
DOI : 10.1145/2813885.2737954
Rstream compiler, Encyclopedia of Parallel Computing, pp.1756-1765, 2011. ,
Automatic Parallelization: An Overview of Fundamental Compiler Techniques, Synthesis Lectures on Computer Architecture, vol.7, issue.1, 2012. ,
DOI : 10.2200/S00340ED1V01Y201201CAC019
A Machine Learning Approach to Automatic Production of Compiler Heuristics, Proceedings of the 10th International Conference on Artificial Intelligence: Methodology, Systems, and Applications, AIMSA '02, pp.41-50, 2002. ,
DOI : 10.1007/3-540-46148-5_5
Advanced compiler design and implementation, 1997. ,
Polymage: Automatic optimization for image processing pipelines, Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '15, pp.429-443, 2015. ,
Nvidia CUDA programming guide 4, 2011. ,
A $2^{2^{2^{pn}}}$ upper bound on the complexity of Presburger Arithmetic, Journal of Computer and System Sciences, vol.16, issue.3, pp.323-332, 1978. ,
DOI : 10.1016/0022-0000(78)90021-1
GRAPHITE: polyhedral analyses and optimizations for GCC, Proceedings of the 2006 GCC Developers Summit, 2006. ,
The Omega test: a fast and practical integer programming algorithm for dependence analysis, Proceedings of the 1991 ACM/IEEE conference on Supercomputing, Supercomputing '91, pp.4-13, 1991. ,
DOI : 10.1145/125826.125848
Optimizing memory usage in the polyhedral model, ACM Trans. on Programming Languages and Systems, vol.22, issue.5, pp.773-815, 2000. ,
Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines, PLDI, pp.519-530, 2013. ,
The undecidability of aliasing, ACM Transactions on Programming Languages and Systems, vol.16, issue.5, pp.1467-1471, 1994. ,
DOI : 10.1145/186025.186041
Tiling multidimensional iteration spaces for nonshared memory machines, Proceedings of the 1991 ACM/IEEE conference on Supercomputing, Supercomputing '91, pp.111-120, 1991. ,
DOI : 10.1145/125826.125893
Lightweight modular staging: A pragmatic approach to runtime code generation and compiled DSLs, Proceedings of the Ninth International Conference on Generative Programming and Component Engineering, GPCE '10, pp.127-136, 2010. ,
Presburger's article on integer arithmetic: Remarks and translation, Technical Report TR84-639, 1984. ,
Meta optimization, ACM SIGPLAN Notices, vol.38, issue.5, pp.77-90, 2003. ,
DOI : 10.1145/780822.781141
Automating the construction of compiler heuristics using machine learning, 2006. ,
OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems, Computing in Science & Engineering, vol.12, issue.3, pp.66-73, 2010. ,
DOI : 10.1109/MCSE.2010.69
Speculative Program Parallelization with Scalable and Decentralized Runtime Verification, Runtime Verification, pp.124-139, 2014. ,
DOI : 10.1007/978-3-319-11164-3_11
URL : https://hal.archives-ouvertes.fr/hal-01070610
A unified framework for schedule and storage optimization, Proc. of the 2001 PLDI Conf, 2001. ,
URL : https://hal.archives-ouvertes.fr/hal-00808285
GRAPHITE two years after: First lessons learned from Real-World polyhedral compilation, 2010. ,
URL : https://hal.archives-ouvertes.fr/inria-00551516
Elimination of Memory-Based dependences for Loop-Nest optimization and parallelization, 3rd GCC Research Opportunities Workshop, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00992740
Automatic array privatization, Languages and Compilers for Parallel Computing, pp.500-521, 1994. ,
DOI : 10.1007/3-540-57659-2_29
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.3.5746
Scalable Program Optimization Techniques in the Polyhedral Model, 2007. ,
Violated dependence analysis, Proceedings of the 20th annual international conference on Supercomputing, ICS '06, pp.335-344, 2006. ,
DOI : 10.1145/1183401.1183448
URL : https://hal.archives-ouvertes.fr/hal-01257290
isl: An Integer Set Library for the Polyhedral Model, Mathematical Software - ICMS 2010, pp.299-302, 2010. ,
DOI : 10.1007/978-3-642-15582-6_49
Pencil support in pet and PPCG, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01133962
Polyhedral extraction tool, IMPACT, 2012. ,
Polyhedral parallel code generation for CUDA, ACM Transactions on Architecture and Code Optimization, vol.9, issue.4, 2013. ,
DOI : 10.1145/2400682.2400713
URL : https://hal.archives-ouvertes.fr/hal-00786677
The definitive guide to GCC, 2006. ,
DOI : 10.1007/978-1-4302-0219-6
Extended Backus-Naur Form Syntax Specification, 1996. ,
A loop transformation theory and an algorithm to maximize parallelism, IEEE Transactions on Parallel and Distributed Systems, vol.2, issue.4, pp.452-471, 1991. ,
DOI : 10.1109/71.97902
Iteration space tiling for memory hierarchies, Proceedings of the Third SIAM Conference on Parallel Processing for Scientific Computing Society for Industrial and Applied Mathematics, pp.357-361, 1989. ,
Omega Calculator, Encyclopedia of Parallel Computing, pp.978-978, 2011. ,
DOI : 10.1007/978-0-387-09766-4_2303
Improved loop tiling based on the removal of spurious false dependences, ACM Transactions on Architecture and Code Optimization, vol.9, issue.4, 2013. ,
DOI : 10.1145/2400682.2400711
URL : https://hal.archives-ouvertes.fr/hal-00786674