Bibliography

K. Trifunovic, A. Cohen, R. Ladelsky, and F. Li, Elimination of Memory-Based Dependences for Loop-Nest Optimization and Parallelization: Evaluation of a Revised Violated Dependence Analysis Method on a Three-Address Code Polyhedral Compiler, 3rd International Workshop on GCC Research Opportunities, 2011.

K. Trifunovic, A. Cohen, D. Edelsohn, F. Li, T. Grosser et al., GRAPHITE Two Years After: First Lessons Learned From Real-World Polyhedral Compilation, 2nd International Workshop on GCC Research Opportunities, 2010.

K. Trifunovic, D. Nuzman, A. Cohen, A. Zaks, and I. Rosen, Polyhedral-Model Guided Loop-Nest Auto-Vectorization, International Conference on Parallel Architectures and Compilation Techniques, 2009.

W. Bielecki, K. Trifunovic, and T. Klimek, Calculating Exact Transitive Closure for a Normalized Affine Integer Tuple Relation, Electronic Notes in Discrete Mathematics.

Other publications

A System for Transforming an ANSI C Code with OpenMP Directives into a SystemC Description, Proceedings of the 9th IEEE Workshop on Design & Diagnostics of Electronic Circuits & Systems, 2006.

F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin et al., Using Machine Learning to Focus Iterative Optimization, International Symposium on Code Generation and Optimization (CGO'06), pp.295-305, 2006.
DOI : 10.1109/CGO.2006.37

N. Ahmed, N. Mateev, and K. Pingali, Tiling Imperfectly-nested Loop Nests, ACM/IEEE SC 2000 Conference (SC'00), 2000.
DOI : 10.1109/SC.2000.10018
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.22.491

A. V. Aho, R. Sethi, and J. D. Ullman, Compilers: principles, techniques, and tools, 1986.

C. Alias, F. Baray, and A. Darte, Bee+Cl@k, ACM SIGPLAN Notices, vol.42, issue.7, pp.73-82, 2007.
DOI : 10.1145/1273444.1254778

J. R. Allen and K. Kennedy, Automatic loop interchange, Proceedings of the 1984 SIGPLAN symposium on Compiler construction, SIGPLAN '84, pp.233-246, 1984.
DOI : 10.1145/502874.502897

R. Allen and K. Kennedy, Automatic translation of FORTRAN programs to vector form, ACM Transactions on Programming Languages and Systems, vol.9, issue.4, pp.491-542, 1987.
DOI : 10.1145/29873.29875

R. Allen and K. Kennedy, Optimizing Compilers for Modern Architectures, 2002.

L. Almagor, K. D. Cooper, A. Grosul, T. J. Harvey, S. W. Reeves et al., Finding effective compilation sequences, Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems, LCTES '04, pp.231-239, 2004.
DOI : 10.1145/997163.997196

C. Ancourt and F. Irigoin, Scanning polyhedra with do loops, Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming, PPOPP '91, pp.39-50, 1991.
URL : https://hal.archives-ouvertes.fr/hal-00752774

L. O. Andersen, Program analysis and specialization for the C programming language, 1994.

A. W. Appel, SSA is functional programming, ACM SIGPLAN Notices, vol.33, issue.4, pp.17-20, 1998.
DOI : 10.1145/278283.278285

R. Bagnara, P. M. Hill, and E. Zaffanella, The Parma Polyhedra Library: Toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems, Science of Computer Programming, vol.72, issue.1-2, pp.3-21, 2008.
DOI : 10.1016/j.scico.2007.08.001

U. Banerjee, Data dependence in ordinary programs, Master's thesis, 1976.

U. Banerjee, Dependence Analysis for Supercomputing, Kluwer Academic, 1988.

U. Banerjee, Unimodular transformations of double loops, Advances in Languages and Compilers for Parallel Processing, pp.192-219, 1990.

U. Banerjee, Loop Transformations for Restructuring Compilers, Kluwer Academic, 1993.

D. Barthou, A. Cohen, and J. Collard, Maximal static expansion, Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '98, pp.213-243, 2000.
DOI : 10.1145/268946.268955
URL : https://hal.archives-ouvertes.fr/hal-01257319

D. Barthou, J. Collard, and P. Feautrier, Fuzzy Array Dataflow Analysis, Journal of Parallel and Distributed Computing, vol.40, issue.2, pp.210-226, 1997.
DOI : 10.1006/jpdc.1996.1261
URL : https://hal.archives-ouvertes.fr/hal-00551673

A. Barvinok, A Polynomial Time Algorithm for Counting Integral Points in Polyhedra When the Dimension is Fixed, Mathematics of Operations Research, vol.19, issue.4, pp.769-779, 1994.
DOI : 10.1287/moor.19.4.769

C. Bastoul, CLooG: The Chunky Loop Generator.

C. Bastoul, Efficient code generation for automatic parallelization and optimization, Second International Symposium on Parallel and Distributed Computing, 2003. Proceedings., pp.23-30, 2003.
DOI : 10.1109/ISPDC.2003.1267639
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.621.1073

C. Bastoul, Code generation in the polyhedral model is easier than you think, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004., pp.7-16, 2004.
DOI : 10.1109/PACT.2004.1342537
URL : https://hal.archives-ouvertes.fr/hal-00017260

C. Bastoul, A. Cohen, S. Girbal, S. Sharma, and O. Temam, Putting Polyhedral Loop Transformations to Work, LCPC'16 Intl. Workshop on Languages and Compilers for Parallel Computing, pp.209-225, 2003.
DOI : 10.1007/978-3-540-24644-2_14
URL : https://hal.archives-ouvertes.fr/inria-00071681

A. Beletska, D. Barthou, W. Bielecki, and A. Cohen, Computing the Transitive Closure of a Union of Affine Integer Tuple Relations, Proceedings of the 3rd International Conference on Combinatorial Optimization and Applications, COCOA '09, pp.98-109, 2009.
DOI : 10.1007/978-3-642-02026-1_9
URL : https://hal.archives-ouvertes.fr/hal-00575959

A. Beletska, W. Bielecki, A. Cohen, M. Palkowski, and K. Siedlecki, Coarse-grained loop parallelization: Iteration space slicing vs affine transformations, Proceedings of the 2009 Eighth International Symposium on Parallel and Distributed Computing, ISPDC '09, pp.73-80, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00645329

M. Benabderrahmane, L. Pouchet, A. Cohen, and C. Bastoul, The Polyhedral Model Is More Widely Applicable Than You Think, Lecture Notes in Computer Science, vol.6011, pp.283-303, 2010.
DOI : 10.1007/978-3-642-11970-5_16
URL : https://hal.archives-ouvertes.fr/inria-00551087

A. Bernstein, Analysis of Programs for Parallel Processing, IEEE Transactions on Electronic Computers, vol.15, issue.5, pp.757-763, 1966.
DOI : 10.1109/PGEC.1966.264565

W. Bielecki, T. Klimek, and K. Trifunovic, Calculating Exact Transitive Closure for a Normalized Affine Integer Tuple Relation, Electronic Notes in Discrete Mathematics, vol.33, pp.7-14, 2009.
DOI : 10.1016/j.endm.2009.03.002

A. J. Bik, The Software Vectorization Handbook. Applying Multimedia Extensions for Maximum Performance, 2004.

A. J. Bik, M. Girkar, P. M. Grey, and X. Tian, Automatic intra-register vectorization for the Intel architecture, IJPP, vol.30, issue.2, pp.65-98, 2002.

W. Blume, R. Eigenmann, K. Faigin, J. Grout, J. Hoeflinger et al., Parallel programming with Polaris, Computer, vol.29, issue.12, pp.78-82, 1996.
DOI : 10.1109/2.546612

U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev et al., Automatic Transformations for Communication-Minimized Parallelization and Locality Optimization in the Polyhedral Model, Proceedings of the Joint European Conferences on Theory and Practice of Software 17th international conference on Compiler construction, CC'08/ETAPS'08, pp.132-146, 2008.
DOI : 10.1007/978-3-540-78791-4_9

U. Bondhugula, O. Gunluk, S. Dash, and L. Renganarayanan, A model for fusion and code motion in an automatic parallelizing compiler, Proceedings of the 19th international conference on Parallel architectures and compilation techniques, PACT '10, pp.343-352, 2010.
DOI : 10.1145/1854273.1854317

U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, A practical automatic polyhedral parallelizer and locality optimizer, Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation, PLDI '08, pp.101-113, 2008.

U. K. Bondhugula, Effective automatic parallelization and locality optimization using the polyhedral model, 2008.

P. Boulet, A. Darte, T. Risset, and Y. Robert, (Pen)-ultimate tiling?, IEEE Scalable High-Performance Computing Conf., 1994.

P. Calland, A. Darte, Y. Robert, and F. Vivien, On the removal of anti and output dependences, Proceedings of International Conference on Application Specific Systems, Architectures and Processors: ASAP '96, pp.285-312, 1998.
DOI : 10.1109/ASAP.1996.542829
URL : https://hal.archives-ouvertes.fr/inria-00073890

L. Carter, J. Ferrante, and C. Thomborson, Folklore confirmed: reducible flow graphs are exponentially larger, Proceedings of the 30th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '03, pp.106-114, 2003.

C. Cascaval, L. Derose, D. A. Padua, and D. A. Reed, Compile-Time Based Performance Prediction, LCPC, 1999.
DOI : 10.1007/3-540-44905-1_23
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.9836

J. Cavazos, C. Dubach, F. Agakov, E. Bonilla, M. F. O'Boyle et al., Automatic performance model construction for the fast software exploration of new hardware designs, Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems, CASES '06, pp.24-34, 2006.
DOI : 10.1145/1176760.1176765

C. Chen, J. Chame, and M. Hall, CHiLL: A framework for composing high-level loop transformations, 2008.

N. Chernikova, Algorithm for finding a general formula for the non-negative solutions of a system of linear inequalities, USSR Computational Mathematics and Mathematical Physics, vol.5, issue.2, 1965.
DOI : 10.1016/0041-5553(65)90045-5

P. Clauss, Counting solutions to linear and nonlinear constraints through Ehrhart polynomials: applications to analyze and transform scientific programs, Intl. Conf. on Supercomputing, pp.278-285, 1996.
URL : https://hal.archives-ouvertes.fr/hal-01100306

C. Coarfa, F. Zhao, N. Tallent, J. Mellor-Crummey, and Y. Dotsenko, Open-source compiler technology for source-to-source optimization.

A. Cohen, Parallelization via constrained storage mapping optimization, Proceedings of the Second International Symposium on High Performance Computing, ISHPC '99, pp.83-94, 1999.
DOI : 10.1007/BFb0094913
URL : https://hal.archives-ouvertes.fr/hal-01257317

A. Cohen, S. Girbal, D. Parello, M. Sigler, O. Temam et al., Facilitating the search for compositions of program transformations, Proceedings of the 19th annual international conference on Supercomputing, ICS '05, pp.151-160, 2005.
DOI : 10.1145/1088149.1088169
URL : https://hal.archives-ouvertes.fr/hal-01257296

A. Cohen, S. Girbal, and O. Temam, A Polyhedral Approach to Ease the Composition of Program Transformations, Euro-Par, pp.292-303, 2004.
DOI : 10.1007/978-3-540-27866-5_38
URL : https://hal.archives-ouvertes.fr/hal-01257301

A. Cohen and V. Lefebvre, Storage Mapping Optimization for Parallel Programs, Proceedings of the 5th International Euro-Par Conference on Parallel Processing, Euro-Par '99, pp.375-382, 1999.
DOI : 10.1007/3-540-48311-X_49
URL : https://hal.archives-ouvertes.fr/hal-01257316

E. Cohen and N. Megiddo, Improved Algorithms For Linear Inequalities with Two Variables Per Inequality, SIAM Journal on Computing, vol.23, issue.6, pp.1313-1350, 1994.
DOI : 10.1137/S0097539791256325

K. D. Cooper, M. W. Hall, R. T. Hood, K. Kennedy, K. S. Mckinley et al., The ParaScope parallel programming environment, Proceedings of the IEEE, pp.244-263, 1993.
DOI : 10.1109/5.214549

P. Cousot and N. Halbwachs, Automatic discovery of linear restraints among variables of a program, Proceedings of the 5th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, POPL '78, pp.84-97, 1978.
DOI : 10.1145/512760.512770

R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck, An efficient method of computing static single assignment form, Proceedings of the 16th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '89, pp.25-35, 1989.
DOI : 10.1145/75277.75280

A. Darte, On the complexity of loop fusion, Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques, PACT '99, p.149, 1999.

A. Darte and G. Huard, Loop Shifting for Loop Compaction, Int. J. Parallel Program, vol.28, pp.499-534, 2000.
DOI : 10.1007/3-540-44905-1_26

A. Darte and G. Huard, Complexity of Multi-dimensional Loop Alignment, Proceedings of the 19th Annual Symposium on Theoretical Aspects of Computer Science, STACS '02, pp.179-191, 2002.
DOI : 10.1007/3-540-45841-7_14

A. Darte and G. Huard, New Complexity Results on Array Contraction and Related Problems, Journal of VLSI signal processing systems for signal, image and video technology, vol.24, issue.3/4, pp.35-55, 2005.
DOI : 10.1007/s11265-005-4937-3

A. Darte, Y. Robert, and F. Vivien, Scheduling and Automatic Parallelization, 2000.
DOI : 10.1007/978-1-4612-1362-8
URL : https://hal.archives-ouvertes.fr/hal-00856645

A. Darte, R. Schreiber, and G. Villard, Lattice-Based Memory Allocation, IEEE Transactions on Computers, vol.54, issue.10, pp.1242-1257, 2005.
DOI : 10.1109/TC.2005.167
URL : https://hal.archives-ouvertes.fr/hal-01272969

A. Darte and G. Silber, Temporary Arrays for Distribution of Loops with Control Dependences, Proceedings from the 6th International Euro-Par Conference on Parallel Processing, Euro-Par '00, pp.357-367, 2000.
DOI : 10.1007/3-540-44520-X_47

A. Darte, G. Silber, and F. Vivien, Combining Retiming and Scheduling Techniques for Loop Parallelization and Loop Tiling, Parallel Processing Letters, vol.07, issue.04, pp.379-392, 1997.
DOI : 10.1142/S0129626497000383
URL : https://hal.archives-ouvertes.fr/hal-00856890

A. Darte and F. Vivien, On the optimality of Allen and Kennedy's algorithm for parallel extraction in nested loops, Proceedings of the Second International Euro-Par Conference on Parallel Processing, Volume I, Euro-Par '96, pp.379-388, 1996.

P. K. Dubey, G. B. Adams III, and M. J. Flynn, Evaluating performance tradeoffs between fine-grained and coarse-grained alternatives, IEEE Transactions on Parallel and Distributed Systems, vol.6, issue.1, pp.17-27, 1995.
DOI : 10.1109/71.363414

R. V. Engelen, Efficient Symbolic Analysis for Optimizing Compilers, Proceedings of the 10th International Conference on Compiler Construction, CC '01, pp.118-132, 2001.
DOI : 10.1007/3-540-45306-7_9

P. Feautrier, Array expansion, Proceedings of the 2nd international conference on Supercomputing, ICS '88, pp.429-441, 1988.
DOI : 10.1145/2591635.2667159
URL : https://hal.archives-ouvertes.fr/hal-01099746

P. Feautrier, Parametric integer programming, RAIRO - Operations Research, vol.22, issue.3, pp.243-268, 1988.
DOI : 10.1051/ro/1988220302431
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.30.9957

P. Feautrier, Dataflow analysis of array and scalar references, International Journal of Parallel Programming, vol.24, issue.4, pp.23-53, 1991.
DOI : 10.1007/BF01407931

P. Feautrier, Some efficient solutions to the affine scheduling problem. I. One-dimensional time, International Journal of Parallel Programming, vol.40, issue.6, pp.313-348, 1992.
DOI : 10.1007/BF01407835

P. Feautrier, Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time, International Journal of Parallel Programming, vol.2, issue.4, pp.389-420, 1992.
DOI : 10.1007/BF01379404

B. B. Fraguela, R. Doallo, and E. L. Zapata, Probabilistic miss equations: evaluating memory hierarchy performance, IEEE Transactions on Computers, vol.52, issue.3, pp.321-336, 2003.
DOI : 10.1109/TC.2003.1183947

G. R. Gao, R. Olsen, V. Sarkar, and R. Thekkath, Collective loop fusion for array contraction, Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing, pp.281-295, 1993.
DOI : 10.1007/3-540-57502-2_53
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.42.3536

S. Girbal, N. Vasilache, C. Bastoul, A. Cohen, D. Parello et al., Semi-Automatic Composition of Loop Transformations for Deep Parallelism and Memory Hierarchies, International Journal of Parallel Programming, vol.20, issue.1, pp.261-317, 2006.
DOI : 10.1007/s10766-006-0012-3
URL : https://hal.archives-ouvertes.fr/hal-01257288

O. Golovanevsky, A. Dayan, A. Zaks, and D. Edelsohn, Trace-Based Data Layout Optimizations for Multi-core Processors, High Performance Embedded Architectures and Compilers, pp.81-95, 2010.
DOI : 10.1007/978-3-642-11515-8_8

M. Griebl, C. Lengauer, and S. Wetzel, Code generation in the polytope model, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192), p.106, 1998.
DOI : 10.1109/PACT.1998.727179

A. Groesslinger, The Challenges of Non-linear Parameters and Variables in Automatic Loop Parallelisation, Lulu Enterprises, 2010.

T. Grosser, H. Zheng, R. Aloor, A. Simburger, A. Groesslinger et al., Polly - Polyhedral optimization in LLVM, IMPACT 2011 First International Workshop on Polyhedral Compilation Techniques, 2011.

G. Gupta, D. Kim, and S. V. Rajopadhye, Scheduling in the Z-polyhedral model, IPDPS, pp.1-10, 2007.

M. Hall, Maximizing multiprocessor performance with the SUIF compiler, Computer, vol.29, issue.12, pp.84-89, 1996.
DOI : 10.1109/2.546613

M. Hall, D. Padua, and K. Pingali, Compiler research, Communications of the ACM, vol.52, issue.2, pp.60-67, 2009.
DOI : 10.1145/1461928.1461946

A. Hartono, M. M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy et al., Parametric multi-level tiling of imperfectly nested loops, Proceedings of the 23rd international conference on Conference on Supercomputing, ICS '09, pp.147-157, 2009.
DOI : 10.1145/1542275.1542301
URL : https://hal.archives-ouvertes.fr/hal-00645328

F. Irigoin, P. Jouvelot, and R. Triolet, Semantical interprocedural parallelization: An overview of the PIPS project, Intl. Conf. on Supercomputing (ICS'91), 1991.
URL : https://hal.archives-ouvertes.fr/hal-00984684

F. Irigoin and R. Triolet, Supernode partitioning, Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '88, pp.319-329, 1988.
DOI : 10.1145/73560.73588

R. Karp, R. Miller, and S. Winograd, The Organization of Computations for Uniform Recurrence Equations, Journal of the ACM, vol.14, issue.3, pp.563-590, 1967.
DOI : 10.1145/321406.321418

W. Kelly, Optimization within a unified transformation framework, 1996.

W. Kelly, V. Maslov, W. Pugh, E. Rosser, T. Shpeisman et al., The Omega Library interface guide, 1995.

W. Kelly and W. Pugh, Finding legal reordering transformations using mappings, Proceedings of the 7th International Workshop on Languages and Compilers for Parallel Computing, LCPC '94, pp.107-124, 1995.
DOI : 10.1007/BFb0025874

W. Kelly and W. Pugh, Minimizing communication while preserving parallelism, Proceedings of the 10th international conference on Supercomputing, ICS '96, 1995.
DOI : 10.1145/237578.237585
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.22.5697

W. Kelly and W. Pugh, A unifying framework for iteration reordering transformations, Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing, 1995.
DOI : 10.1109/ICAPP.1995.472180

W. Kelly, W. Pugh, and E. Rosser, Code generation for multiple mappings, Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation, p.332, 1995.
DOI : 10.1109/FMPC.1995.380437

K. Kennedy and K. Mckinley, Maximizing loop parallelism and improving data locality via loop fusion and distribution, Languages and Compilers for Parallel Computing, pp.301-320, 1993.
DOI : 10.1007/3-540-57659-2_18
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.4749

K. Kennedy and K. S. Mckinley, Loop distribution with arbitrary control flow, Proceedings SUPERCOMPUTING '90, pp.407-416, 1990.
DOI : 10.1109/SUPERC.1990.130048
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.374.2631

K. Knobe and V. Sarkar, Array SSA form and its use in parallelization, Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '98, pp.107-120, 1998.
DOI : 10.1145/268946.268956

D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe, Dependence graphs and compiler optimizations, Proceedings of the 8th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '81, pp.207-218, 1981.
DOI : 10.1145/567532.567555

L. Lamport, The parallel execution of DO loops, Communications of the ACM, vol.17, issue.2, pp.83-93, 1974.
DOI : 10.1145/360827.360844

H. Le Verge, A note on Chernikova's algorithm, 1998.

V. Lefebvre and P. Feautrier, Automatic storage management for parallel programs, Parallel Computing, vol.24, issue.3-4, pp.649-671, 1998.
DOI : 10.1016/S0167-8191(98)00029-5
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.16.5796

W. Li and K. Pingali, A singular loop transformation framework based on non-singular matrices, Lecture Notes in Computer Science, vol.757, pp.391-405, 1992.
DOI : 10.1007/3-540-57502-2_60

A. Lim, Improving Parallelism and Data Locality with Affine Partitioning, 2001.

A. Lim and M. S. Lam, Communication-free parallelization via affine transformations, Proceedings of the 7th International Workshop on Languages and Compilers for Parallel Computing, LCPC '94, pp.92-106, 1995.
DOI : 10.1007/BFb0025873
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.6303

A. W. Lim, G. I. Cheong, and M. S. Lam, An affine partitioning algorithm to maximize parallelism and minimize communication, Proceedings of the 13th international conference on Supercomputing, ICS '99, pp.228-237, 1999.
DOI : 10.1145/305138.305197

A. W. Lim and M. S. Lam, Maximizing parallelism and minimizing synchronization with affine transforms, Proceedings of the 24th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '97, pp.201-214, 1997.
DOI : 10.1145/263699.263719

V. Loechner and D. K. Wilde, Parameterized polyhedra and their vertices, International Journal of Parallel Programming, vol.25, issue.6, pp.525-549, 1997.
DOI : 10.1023/A:1025117523902
URL : https://hal.archives-ouvertes.fr/inria-00534851

Q. Lu, C. Alias, U. Bondhugula, T. Henretty, S. Krishnamoorthy et al., Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors, 2009 18th International Conference on Parallel Architectures and Compilation Techniques, pp.348-357, 2009.
DOI : 10.1109/PACT.2009.36

D. E. Maydan, S. P. Amarasinghe, and M. S. Lam, Array-data flow analysis and its use in array privatization, Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '93, pp.2-15, 1993.
DOI : 10.1145/158511.158515

N. Megiddo and V. Sarkar, Optimal weighted loop fusion for parallel programs, Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures, SPAA '97, pp.282-291, 1997.
DOI : 10.1145/258492.258520

M. Mehrara, J. Hao, P. Hsu, and S. Mahlke, Parallelizing sequential applications on commodity hardware using a low-cost software transactional memory, Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation, PLDI '09, pp.166-176, 2009.

B. Meister, A. Leung, N. Vasilache, D. Wohlford, C. Bastoul et al., Productivity via automatic code generation for PGAS platforms with the R-Stream compiler, APGAS'09 Workshop on Asynchrony in the PGAS Programming Model, 2009.

S. Muchnick, Advanced Compiler Design and Implementation, 1997.

D. Naishlos, Autovectorization in GCC, GCC Developers' Summit, pp.105-118, 2004.

D. Nuzman, I. Rosen, and A. Zaks, Auto-vectorization of interleaved data for SIMD, PLDI, 2006.

D. Nuzman and A. Zaks, Autovectorization in GCC - two years later, GCC Developers' Summit, 2006.

D. Nuzman and A. Zaks, Outer-loop vectorization - revisited for short SIMD architectures, PACT, 2008.

M. O'Boyle, MARS: a distributed memory approach to shared memory compilation, Proc. Language, Compilers and Runtime Systems for Scalable Computing, 1998.

D. A. Padua and M. J. Wolfe, Advanced compiler optimizations for supercomputers, Communications of the ACM, vol.29, issue.12, pp.1184-1201, 1986.
DOI : 10.1145/7902.7904

E. Park, L. Pouchet, J. Cavazos, A. Cohen, and P. Sadayappan, Predictive Modeling in a Polyhedral Optimization Space, International Symposium on Code Generation and Optimization, Chamonix, France, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00551076

D. J. Pearce, P. H. Kelly, and C. Hankin, Efficient field-sensitive pointer analysis of C, ACM Transactions on Programming Languages and Systems, vol.30, issue.1, p.4, 2007.
DOI : 10.1145/1290520.1290524

S. Pop, A. Cohen, C. Bastoul, S. Girbal, P. Jouvelot et al., Graphite: Loop optimizations based on the polyhedral model for GCC, Proc. of the 4th GCC Developers' Summit, 2006.
URL : https://hal.archives-ouvertes.fr/hal-01257284

S. Pop, A. Cohen, and G. Silber, Induction Variable Analysis with Delayed Abstractions, Lecture Notes in Computer Science, vol.3793, pp.218-232, 2005.
DOI : 10.1007/11587514_15
URL : https://hal.archives-ouvertes.fr/hal-01257294

B. Pottenger and R. Eigenmann, Idiom recognition in the Polaris parallelizing compiler, Proceedings of the 9th international conference on Supercomputing, ICS '95, pp.444-448, 1995.
DOI : 10.1145/224538.224655

L. Pouchet, C. Bastoul, A. Cohen, and J. Cavazos, Iterative optimization in the polyhedral model: Part II, multidimensional time, Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation, PLDI '08, pp.90-100, 2008.
URL : https://hal.archives-ouvertes.fr/hal-01257273

L. Pouchet, C. Bastoul, A. Cohen, and N. Vasilache, Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time, International Symposium on Code Generation and Optimization (CGO'07), pp.144-156, 2007.
DOI : 10.1109/CGO.2007.21
URL : https://hal.archives-ouvertes.fr/hal-01257281

L. Pouchet, U. Bondhugula, C. Bastoul, A. Cohen, J. Ramanujam et al., Combined Iterative and Model-driven Optimization in an Automatic Parallelization Framework, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-11, 2010.
DOI : 10.1109/SC.2010.14
URL : https://hal.archives-ouvertes.fr/inria-00551067

L. Pouchet, U. Bondhugula, C. Bastoul, A. Cohen, J. Ramanujam et al., Loop transformations: convexity, pruning and optimization, Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pp.549-562, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00551077

W. Pugh, The Omega test: a fast and practical integer programming algorithm for dependence analysis, Proceedings of the 1991 ACM/IEEE conference on Supercomputing, Supercomputing '91, pp.4-13, 1991.
DOI : 10.1145/125826.125848

W. Pugh, Uniform techniques for loop optimization, Proceedings of the 5th international conference on Supercomputing, ICS '91, pp.341-352, 1991.
DOI : 10.1145/109025.109108

W. Pugh, Counting solutions to Presburger formulas: how and why, Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation, PLDI '94, pp.121-134, 1994.

W. Pugh and D. Wonnacott, Eliminating false data dependences using the Omega test, Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation, PLDI '92, pp.140-151, 1992.

W. Pugh and D. Wonnacott, An exact method for analysis of value-based array data dependences, Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing, pp.546-566, 1994.
DOI : 10.1007/3-540-57659-2_31

J. Ramanujam, Beyond unimodular transformations, The Journal of Supercomputing, vol.3, issue.5, pp.365-389, 1995.
DOI : 10.1007/BF01206273
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.5188

X. Redon and P. Feautrier, Scheduling reductions, Proceedings of the 8th international conference on Supercomputing, ICS '94, pp.117-125, 1994.
DOI : 10.1145/181181.181319

L. Renganarayana, U. Bondhugula, S. Derisavi, A. E. Eichenberger, and K. O'Brien, Compact multi-dimensional kernel extraction for register tiling, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09, pp.1-45, 2009.
DOI : 10.1145/1654059.1654105

G. Roth and K. Kennedy, Loop fusion in high performance Fortran, Proceedings of the 12th international conference on Supercomputing, ICS '98, pp.125-132, 1998.
DOI : 10.1145/277830.277857

S. Rus, M. Pennings, and L. Rauchwerger, Sensitivity analysis for automatic parallelization on multi-cores, Proceedings of the 21st annual international conference on Supercomputing, ICS '07, pp.263-273, 2007.
DOI : 10.1145/1274971.1275008

A. Schrijver, Theory of linear and integer programming, 1986.

J. Shin, J. Chame, and M. W. Hall, Compiler-controlled caching in superword register files for multimedia extension architectures, PACT, 2002.

J. Shin, M. Hall, and J. Chame, Superword-level parallelism in the presence of control flow, CGO, 2005.

W. Thies, F. Vivien, J. Sheldon, and S. Amarasinghe, A unified framework for schedule and storage optimization, Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation, PLDI '01, pp.232-242, 2001.
URL : https://hal.archives-ouvertes.fr/hal-00808285

A. Tiwari, C. Chen, J. Chame, M. Hall, and J. K. Hollingsworth, A scalable autotuning framework for compiler optimization, IPDPS'09, 2009.

K. Trifunovic and A. Cohen, Enabling more optimizations in Graphite: ignoring memory-based dependences, GCC Summit, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00551509

K. Trifunovic, A. Cohen, D. Edelsohn, F. Li, T. Grosser et al., Graphite two years after: First lessons learned from real-world polyhedral compilation, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00551516

K. Trifunovic, D. Nuzman, A. Cohen, A. Zaks, and I. Rosen, Polyhedral-Model Guided Loop-Nest Auto-Vectorization, 2009 18th International Conference on Parallel Architectures and Compilation Techniques, 2009.
DOI : 10.1109/PACT.2009.18
URL : https://hal.archives-ouvertes.fr/hal-00645325

R. Triolet, F. Irigoin, and P. Feautrier, Direct parallelization of call statements, Proceedings of the 1986 SIGPLAN symposium on Compiler construction, SIGPLAN '86, pp.176-185, 1986.

P. Tu and D. Padua, Array privatization for shared and distributed memory machines (extended abstract), SIGPLAN Not., pp.64-67, 1993.

P. Tu and D. A. Padua, Automatic array privatization, Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing, pp.500-521, 1994.
DOI : 10.1007/3-540-45403-9_8
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.3.5746

R. Upadrasta and A. Cohen, Potential and challenges of two-variable-per-inequality subpolyhedral compilation, IMPACT 2011 First International Workshop on Polyhedral Compilation Techniques, 2011.

N. Vasilache, C. Bastoul, and A. Cohen, Polyhedral Code Generation in the Real World, Lecture Notes in Computer Science, vol.3923, pp.185-201, 2006.
DOI : 10.1007/11688839_16
URL : https://hal.archives-ouvertes.fr/inria-00001106

N. Vasilache, C. Bastoul, A. Cohen, and S. Girbal, Violated dependence analysis, Proceedings of the 20th annual international conference on Supercomputing, ICS '06, pp.335-344, 2006.
DOI : 10.1145/1183401.1183448
URL : https://hal.archives-ouvertes.fr/hal-01257290

N. Vasilache, A. Cohen, and L. Pouchet, Automatic Correction of Loop Transformations, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), pp.292-304, 2007.
DOI : 10.1109/PACT.2007.4336220
URL : https://hal.archives-ouvertes.fr/hal-01257283

S. Verdoolaege, isl: An Integer Set Library for the Polyhedral Model, Mathematical Software -ICMS 2010, pp.299-302, 2010.
DOI : 10.1007/978-3-642-15582-6_49

F. Vivien, On the optimality of Feautrier's scheduling algorithm, Proceedings of the 8th International Euro-Par Conference on Parallel Processing, Euro-Par '02, pp.299-308, 2002.

D. K. Wilde, A library for doing polyhedral operations, Parallel Algorithms and Applications, vol.15, issue.3-4, 1993.
DOI : 10.1007/BF02574699
URL : https://hal.archives-ouvertes.fr/inria-00074515

M. E. Wolf and M. S. Lam, A loop transformation theory and an algorithm to maximize parallelism, IEEE Transactions on Parallel and Distributed Systems, vol.2, issue.4, pp.452-471, 1991.
DOI : 10.1109/71.97902

M. Wolfe, Iteration space tiling for memory hierarchies, 3rd SIAM Conf. on Parallel Processing for Scientific Computing, pp.357-361, 1987.

M. Wolfe, High performance compilers for parallel computing, 1996.

P. Wu, A. E. Eichenberger, A. Wang, and P. Zhao, An integrated simdization framework using virtual vectors, Proceedings of the 19th annual international conference on Supercomputing, ICS '05, 2005.
DOI : 10.1145/1088149.1088172

J. Xue, Automating non-unimodular loop transformations for massive parallelism, Parallel Computing, vol.20, issue.5, pp.711-728, 1994.
DOI : 10.1016/0167-8191(94)90002-7

J. Xue, Unimodular transformations of non-perfectly nested loops, Parallel Computing, vol.22, issue.12, pp.1621-1645, 1997.
DOI : 10.1016/S0167-8191(96)00063-4

Y. Yang, C. Ancourt, and F. Irigoin, Minimal data dependence abstractions for loop transformations, Proceedings of the 7th International Workshop on Languages and Compilers for Parallel Computing, LCPC '94, pp.201-216, 1995.
DOI : 10.1007/BFb0025880