Automatic program transformations for virtual memory computers, Proceedings of the 1979 National Computer Conference, pp.969-969, 1979. ,
Bee+Cl@k, ACM SIGPLAN Notices, vol.42, issue.7, pp.73-82, 2007. ,
DOI : 10.1145/1273444.1254778
A catalogue of optimizing transformations, 1971. ,
Automatic translation of FORTRAN programs to vector form, ACM Transactions on Programming Languages and Systems, vol.9, issue.4, pp.491-542, 1987. ,
DOI : 10.1145/29873.29875
An overview of the suif compiler for scalable parallel machines, PPSC, pp.662-667, 1995. ,
PIPS is not (just) polyhedral software, 1st International Workshop on Polyhedral Compilation Techniques (IMPACT), 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00744312
Par4All: From convex array regions to heterogeneous computing, IMPACT, 2012. ,
Scanning polyhedra with DO loops, ACM SIGPLAN Notices, vol.26, issue.7, pp.39-50, 1991. ,
DOI : 10.1145/109626.109631
URL : https://hal.archives-ouvertes.fr/hal-00752774
The Parma Polyhedra Library: Toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems, Science of Computer Programming, vol.72, issue.1-2, pp.3-21, 2008. ,
DOI : 10.1016/j.scico.2007.08.001
Tiling stencil computations to maximize parallelism, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, 2012. ,
DOI : 10.1109/SC.2012.107
Data dependence in ordinary programs, 1976. ,
Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors, ACM SIGPLAN Notices, vol.44, issue.4, pp.219-228, 2009. ,
DOI : 10.1145/1594835.1504209
Automatic c-to-cuda code generation for affine programs, Compiler Construction, pp.244-263, 2010. ,
Efficient code generation for automatic parallelization and optimization, Second International Symposium on Parallel and Distributed Computing, 2003. Proceedings., pp.23-30, 2003. ,
DOI : 10.1109/ISPDC.2003.1267639
Code generation in the polyhedral model is easier than you think, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004., pp.7-16, 2004. ,
DOI : 10.1109/PACT.2004.1342537
URL : https://hal.archives-ouvertes.fr/hal-00017260
Improving Data Locality in Static Control Programs, 2004. ,
Clan - a polyhedral representation extractor for high level programs, 2008. ,
FADAlib: an open source C++ library for fuzzy array dataflow analysis, Procedia Computer Science, vol.1, issue.1, pp.2075-2084, 2010. ,
DOI : 10.1016/j.procs.2010.04.232
URL : https://hal.archives-ouvertes.fr/hal-00551673
The Polyhedral Model Is More Widely Applicable Than You Think, Compiler Construction, pp.283-303, 2010. ,
DOI : 10.1007/978-3-642-11970-5_16
URL : https://hal.archives-ouvertes.fr/inria-00551087
Julia: A fast dynamic language for technical computing, 2012. ,
Polyglot: a polyhedral loop transformation framework for a graphical dataflow language, Compiler Construction, pp.123-143, 2013. ,
Parallel programming with Polaris, Computer, vol.29, issue.12, pp.78-82, 1996. ,
A practical automatic polyhedral parallelizer and locality optimizer, ACM SIGPLAN Notices, vol.43, issue.6, pp.101-113, 2008. ,
DOI : 10.1145/1379022.1375595
A model for fusion and code motion in an automatic parallelizing compiler, Proceedings of the 19th international conference on Parallel architectures and compilation techniques, PACT '10, pp.343-352, 2010. ,
DOI : 10.1145/1854273.1854317
Effective Automatic Parallelization and Locality Optimization Using the Polyhedral Model, p.3325799, 2008. ,
Custom memory management methodology: Exploration of memory organisation for embedded multimedia system design, 1998. ,
DOI : 10.1007/978-1-4757-2849-1
Polyhedra scanning revisited, Conference on Programming Language Design and Implementation, pp.499-508 ,
DOI : 10.1145/2254064.2254123
A framework for composing high-level loop transformations, 2008. ,
PATUS: A Code Generation and Autotuning Framework for Parallel Iterative Stencil Computations on Modern Microarchitectures, 2011 IEEE International Parallel & Distributed Processing Symposium, 2011. ,
DOI : 10.1109/IPDPS.2011.70
Generating and auto-tuning parallel stencil codes, 2011. ,
Recovering logical data and code structures, 1995. ,
Facilitating the search [45] NVIDIA Corporation, 2013. ,
Interprocedural array region analyses, International Journal of Parallel Programming, vol.24, 1996. ,
URL : https://hal.archives-ouvertes.fr/hal-00752611
Loop parallelization algorithms, in Compiler Optimizations for Scalable Parallel Systems: Languages, Compilation Techniques and Run Time Systems, LNCS, vol.1808, pp.141-171, 2001. ,
Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures, 2008 SC, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-4, 2008. ,
DOI : 10.1109/SC.2008.5222004
Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors, SIAM Review, vol.51, issue.1, pp.129-159, 2009. ,
DOI : 10.1137/070693199
Cetus: A Source-to-Source Compiler Infrastructure for Multicores, Computer, vol.42, issue.12, pp.36-42, 2009. ,
DOI : 10.1109/MC.2009.385
Model-Driven Tile Size Selection for DOACROSS Loops on GPUs, Proceedings of the 17th international conference on Parallel processing - Volume Part II, Euro-Par'11, pp.401-412, 2011. ,
DOI : 10.1007/978-3-642-23397-5_40
Array expansion, 2nd International Conference on Supercomputing (ICS'88), pp.429-441, 1988. ,
URL : https://hal.archives-ouvertes.fr/hal-01099746
Parametric integer programming, RAIRO - Operations Research, vol.22, issue.3, pp.243-268, 1988. ,
DOI : 10.1051/ro/1988220302431
Dataflow analysis of array and scalar references, International Journal of Parallel Programming, vol.20, issue.1, pp.23-53, 1991. ,
DOI : 10.1007/BF01407931
Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time, International Journal of Parallel Programming, vol.21, issue.6, pp.389-420, 1992. ,
DOI : 10.1007/BF01379404
Some efficient solutions to the affine scheduling problem. Part I. One-dimensional time, International Journal of Parallel Programming, vol.21, issue.5, pp.313-348, 1992. ,
DOI : 10.1007/BF01407835
Parallelization of loop nests with general bounds in the polyhedron model, 1997. ,
Semi-Automatic Composition of Loop Transformations for Deep Parallelism and Memory Hierarchies, International Journal of Parallel Programming, vol.34, issue.3, pp.261-317, 2006. ,
DOI : 10.1007/s10766-006-0012-3
URL : https://hal.archives-ouvertes.fr/hal-01257288
An efficient code generation technique for tiled iteration spaces, IEEE Transactions on Parallel and Distributed Systems, vol.14, issue.10, pp.1021-1034, 2003. ,
The loop parallelizer loopo, Proc. Sixth Workshop on Compilers for Parallel Computers, pp.311-320, 1996. ,
On index set splitting, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425), pp.607-631, 2000. ,
DOI : 10.1109/PACT.1999.807572
The openacc application programming interface, 2011. ,
Eigen: a c++ linear algebra library, 2011. ,
PADS: A Pattern-Driven Stencil Compiler-Based Tool for Reuse of Optimizations on GPGPUs, 2011 IEEE 17th International Conference on Parallel and Distributed Systems, pp.308-315, 2011. ,
DOI : 10.1109/ICPADS.2011.94
Dyntile: Parametric tiled loop generation for parallel execution on multicore processors, IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp.1-12, 2010. ,
A stencil compiler for short-vector SIMD architectures, International Conference on Supercomputing (ICS) ,
High-performance code generation for stencil computations on GPU architectures, Proceedings of the 26th ACM international conference on Supercomputing, ICS '12, 2012. ,
DOI : 10.1145/2304576.2304619
A Revised Approach to Ice Microphysical Processes for the Bulk Parameterization of Clouds and Precipitation, Monthly Weather Review, vol.132, issue.1, 2004. ,
DOI : 10.1175/1520-0493(2004)132<0103:ARATIM>2.0.CO;2
Cart: Constant aspect ratio tiling, Proceedings of the 4th International Workshop on Polyhedral Compilation Techniques, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00915827
Semantical interprocedural parallelization: An overview of the pips project, Proceedings of the 5th international conference on Supercomputing, pp.244-251, 1991. ,
URL : https://hal.archives-ouvertes.fr/hal-00984684
Register tiling in nonrectangular iteration spaces, ACM Transactions on Programming Languages and Systems, vol.24, issue.4, pp.409-453, 2002. ,
DOI : 10.1145/567097.567101
New user interface for petit and other interfaces: user guide, 1995. ,
Optimization within a Unified Transformation Framework, 1996. ,
A unifying framework for iteration reordering transformations, Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing, 1995. ,
DOI : 10.1109/ICAPP.1995.472180
Code generation for multiple mappings, Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation, 1995. ,
DOI : 10.1109/FMPC.1995.380437
The Omega calculator and library, version 1.1.0, 1996. ,
Multilevel tiling: M for the price of one, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, SC '07, pp.1-51, 2007. ,
When polyhedral transformations meet simd code generation, Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation, pp.127-138, 2013. ,
Effective automatic parallelization of stencil computations, Conference on Programming Language Design and Implementation (PLDI), pp.235-244, 2007. ,
Dependence graphs and compiler optimizations, Proceedings of the 8th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pp.207-218, 1981. ,
LLVM: A compilation framework for lifelong program analysis & transformation, International Symposium on Code Generation and Optimization, 2004. CGO 2004., pp.75-86, 2004. ,
DOI : 10.1109/CGO.2004.1281665
A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction, Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, GPGPU '10, 2010. ,
DOI : 10.1145/1735688.1735698
URL : https://hal.archives-ouvertes.fr/inria-00551084
Polylib: A library for manipulating parameterized polyhedra, 1999. ,
Parameterized polyhedra and their vertices, International Journal of Parallel Programming, vol.25, issue.6, pp.525-549, 1997. ,
Caraco. Parallel computing with generalized cellular automata ,
DOI : 10.1023/A:1025117523902
Delinearization, ACM SIGPLAN Notices, vol.27, issue.7, pp.152-161, 1992. ,
DOI : 10.1145/143103.143130
Lazy array data-flow dependence analysis, Proceedings of the 21st ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '94, 1994. ,
DOI : 10.1145/174675.177911
Simplifying polynomial constraints over integers to make dependence analysis more precise, CONPAR 94 - VAPP VI, Int. Conf. on Parallel and Vector Processing, 1994. ,
DOI : 10.1007/3-540-58430-7_64
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.30.7763
A Performance Study for Iterative Stencil Loops on GPUs with Ghost Zone Optimizations, International Journal of Parallel Programming, vol.39, issue.1, pp.115-142, 2011. ,
DOI : 10.1007/s10766-010-0142-5
3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-13, 2010. ,
DOI : 10.1109/SC.2010.2
URL : https://hal.archives-ouvertes.fr/hal-00865020
Locality Optimization of Stencil Applications Using Data Dependency Graphs, Languages and Compilers for Parallel Computing, pp.77-91, 2011. ,
DOI : 10.1109/TAP.1966.1138693
Parafrase-2: An environment for parallelizing, partitioning, synchronizing, and scheduling programs on multiprocessors, International Journal of High Speed Computing, vol.1, issue.1, pp.45-72, 1989. ,
Induction Variable Analysis with Delayed Abstractions, High Performance Embedded Architectures and Compilers, pp.218-232, 2005. ,
DOI : 10.1007/11587514_15
URL : https://hal.archives-ouvertes.fr/hal-01257294
Graphite: Polyhedral analyses and optimizations for GCC, Proceedings of the 2006 GCC Developers Summit, 2006. ,
PolyBench/C 3.2 ,
Iterative Optimization in the Polyhedral Model, 2010. ,
Polyopt, a polyhedral optimizer for the rose compiler, 2011. ,
Uniform techniques for loop optimization, Proceedings of the 5th international conference on Supercomputing, ICS '91, pp.341-352, 1991. ,
DOI : 10.1145/109025.109108
An exact method for analysis of value-based array data dependences, 1994. ,
DOI : 10.1007/3-540-57659-2_31
Static analysis of upper and lower bounds on dependences and parallelism, ACM Transactions on Programming Languages and Systems, vol.16, issue.4, pp.1248-1278, 1994. ,
DOI : 10.1145/183432.183525
Generation of efficient nested loops from polyhedra, International Journal of Parallel Programming, vol.28, issue.5, pp.469-498, 2000. ,
DOI : 10.1023/A:1007554627716
Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines, ACM SIGPLAN Conference on Programming Language Design and Implementation, 2013. ,
Positivity, posynomials and tile size selection, 2008 SC, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-12, 2008. ,
DOI : 10.1109/SC.2008.5213293
Parameterized tiled loops for free, ACM SIGPLAN Notices, vol.42, issue.6, pp.405-414, 2007. ,
DOI : 10.1145/1273442.1250780
Qualcomm single largest proprietary gpu supplier, imagination technologies the leader in gpu ip, arm and vivante growing rapidly, according to latest report from jon peddie research ,
Blog post: Renderscript part 2, 2011. ,
Automatic blocking of nested loops, 1990. ,
R-stream: A parametric high level compiler, Proceedings of HPEC, 2006. ,
Oil and water can mix! experiences with integrating polyhedral and ast-based transformations, 2013. ,
On the variety of static control parts in real-world programs: from affine via multi-dimensional to polynomial and just-in-time, Proceedings of the 4th International Workshop on Polyhedral Compilation Techniques, 2014. ,
Numerical Solution of Partial Differential Equations: Finite Difference Methods, 2004. ,
Cache oblivious parallelograms in iterative stencil computations, Proceedings of the 24th ACM International Conference on Supercomputing, ICS '10, pp.49-59, 2010. ,
DOI : 10.1145/1810085.1810096
Cache Accurate Time Skewing in Iterative Stencil Computations, 2011 International Conference on Parallel Processing, pp.571-581, 2011. ,
DOI : 10.1109/ICPP.2011.47
Computational electrodynamics: The Finite-difference time-domain method, 1995. ,
The pochoir stencil compiler, Proceedings of the 23rd ACM symposium on Parallelism in algorithms and architectures, SPAA '11, pp.117-128, 2011. ,
DOI : 10.1145/1989493.1989508
GRAPHITE two years after: First lessons learned from real-world polyhedral compilation, 2nd GCC Research Opportunities Workshop (GROW), 2010. ,
Efficient symbolic analysis for optimizing compilers, Compiler Construction, pp.118-132 ,
Polyhedral Code Generation in the Real World, International Conference on Compiler Construction (CC), pp.185-201, 2006. ,
DOI : 10.1007/11688839_16
URL : https://hal.archives-ouvertes.fr/inria-00001106
Joint scheduling and layout optimization to enable multi-level vectorization, IMPACT, 2012. ,
Scalable Program Optimization Techniques in the Polyhedral Model, 2007. ,
Non-affine Extensions to Polyhedral Code Generation, Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO '14, 2014. ,
DOI : 10.1145/2581122.2544141
isl: An Integer Set Library for the Polyhedral Model, Mathematical Software (ICMS'10), pp.299-302, 2010. ,
DOI : 10.1007/978-3-642-15582-6_49
Counting affine calculator and applications, 1st International Workshop on Polyhedral Compilation Techniques (IMPACT), 2011. ,
Integer sets and relations: from high-level modeling to low-level implementation, 2013. Spring School on Polyhedral Code Analysis and Optimizations ,
Integer set library: Manual -version 0, 2014. ,
Counting Integer Points in Parametric Polytopes Using Barvinok's Rational Functions, Algorithmica, vol.48, issue.1, pp.37-66, 2007. ,
DOI : 10.1007/s00453-006-1231-0
Equivalence checking of static affine programs using widening to handle recurrences, Computer Aided Verification 21, pp.599-613, 2009. ,
Experience with widening based equivalence checking in realistic multimedia systems, p.279 ,
Polyhedral parallel code generation for CUDA, ACM Transactions on Architecture and Code Optimization, vol.9, issue.4, pp.1-54, 2013. ,
DOI : 10.1145/2400682.2400713
URL : https://hal.archives-ouvertes.fr/hal-00786677
Suif: An infrastructure for research on parallelizing and optimizing compilers, ACM Sigplan Notices, issue.12, pp.2931-2968, 1994. ,
A data locality optimizing algorithm, ACM Sigplan Notices, vol.26, issue.6, pp.30-44, 1991. ,
A loop transformation theory and an algorithm to maximize parallelism, IEEE Transactions on Parallel and Distributed Systems, vol.2, issue.4, pp.452-471, 1991. ,
Iteration space tiling for memory hierarchies, Proceedings of the Third SIAM Conference on Parallel Processing for Scientific Computing, pp.357-361, 1987. ,
AlphaZ: A System for Design Space Exploration in the Polyhedral Model, Proceedings of the 25th International Workshop on Languages and Compilers for Parallel Computing, 2012. ,
DOI : 10.1007/978-3-642-37658-0_2
Hierarchical overlapped tiling, Proceedings of the Tenth International Symposium on Code Generation and Optimization, CGO '12, pp.207-218 ,
DOI : 10.1145/2259016.2259044
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.220.9092
Improving polyhedral code generation for high-level synthesis, 2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), 2013. ,
DOI : 10.1109/CODES-ISSS.2013.6659002