F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin et al., Using Machine Learning to Focus Iterative Optimization, International Symposium on Code Generation and Optimization (CGO'06), 2006.
DOI : 10.1109/CGO.2006.37

P. Amiranoff, A. Cohen, and P. Feautrier, Instancewise array dependence test for recursive programs
URL : https://hal.archives-ouvertes.fr/hal-01257308

L. Almagor, K. D. Cooper, A. Grosul, T. J. Harvey, S. W. Reeves et al., Finding effective compilation sequences, Proc. Languages, Compilers, and Tools for Embedded Systems (LCTES), pp.231-239, 2004.

E. Allen, D. Chase, V. Luchangco, C. Flood, J.-W. Maessen et al., The Fortress language specification 0.866, 2006.

D. I. August, D. A. Connors, S. A. Mahlke, J. W. Sias, K. M. Crozier et al., Integrated predicated and speculative execution in the IMPACT EPIC architecture, Proceedings of the 25th Intl. Symp. on Computer Architecture, 1998.

C. Ancourt and F. Irigoin, Scanning polyhedra with DO loops, ACM Symp. on Principles and Practice of Parallel Programming (PPoPP'91), pp.39-50, 1991.

P. An, A. Jula, S. Rus, S. Saunders, T. Smith et al., STAPL: An Adaptive, Generic Parallel C++ Library, Languages and Compilers for Parallel Computing (LCPC'01), pp.193-208, 2001.
DOI : 10.1007/3-540-35767-X_13

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.207.3775

R. Allen and K. Kennedy, Automatic translation of FORTRAN programs to vector form, ACM Transactions on Programming Languages and Systems, vol.9, issue.4, pp.491-542, 1987.
DOI : 10.1145/29873.29875

R. Allen and K. Kennedy, Optimizing Compilers for Modern Architectures, 2002.

P. Amiranoff, An Automata-Theoretic Modelization of Instancewise Program Analysis: Transducers as mappings from Instances to Memory Locations, 2004.

N. Ahmed, N. Mateev, and K. Pingali, Synthesizing transformations for locality enhancement of imperfectly-nested loop nests, ACM Supercomputing'00, 2000.

A. Aiken and E. Wimmers, Type inclusion constraints and type inference, Proceedings of the conference on Functional programming languages and computer architecture, FPCA '93, pp.31-41, 1993.
DOI : 10.1145/165180.165188

J. Bilmes, K. Asanović, C. W. Chin, and J. Demmel, Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology, ACM Intl. Conf. on Supercomputing (ICS'97), pp.340-347, 1997.

U. Banerjee, Dependence Analysis for Supercomputing, 1988.

U. Banerjee, Loop Transformations for Restructuring Compilers: The Foundations, 1992.

D. Barthou, Array Dataflow Analysis in Presence of Non-affine Constraints, 1998.

C. Bastoul, Efficient code generation for automatic parallelization and optimization, ISPDC'2 IEEE International Symposium on Parallel and Distributed Computing, 2003.

M. Barreteau, F. Bodin, Z. Chamski, H. Charles, C. Eisenbeis et al., OCEANS: optimising compilers for embedded applications, Euro-Par'99, pp.1171-1775, 1999.

C. Bell, W. Chen, D. Bonachea, and K. Yelick, Evaluating support for global address space languages on the Cray X1, Proceedings of the 18th annual international conference on Supercomputing, ICS '04, 2004.
DOI : 10.1145/1006209.1006236

D. Barthou, A. Cohen, and J.-F. Collard, Maximal static expansion, Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '98, pp.98-106, 1998.
DOI : 10.1145/268946.268955

URL : https://hal.archives-ouvertes.fr/hal-01257319

D. Barthou, A. Cohen, and J.-F. Collard, Maximal static expansion, Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '98, pp.213-243, 2000.
DOI : 10.1145/268946.268955

URL : https://hal.archives-ouvertes.fr/hal-01257319

A. Benveniste, P. Caspi, S. A. Edwards, N. Halbwachs, P. Le Guernic et al., The synchronous languages 12 years later, Proceedings of the IEEE, 2003.

D. Barthou, J.-F. Collard, and P. Feautrier, Fuzzy Array Dataflow Analysis, Journal of Parallel and Distributed Computing, vol.40, issue.2, pp.210-226, 1997.
DOI : 10.1006/jpdc.1996.1261

URL : https://hal.archives-ouvertes.fr/hal-00551673

C. Bastoul, A. Cohen, S. Girbal, S. Sharma, and O. Temam, Putting polyhedral loop transformations to work, Languages and Compilers for Parallel Computing (LCPC'03), pp.23-30, 2003.
URL : https://hal.archives-ouvertes.fr/inria-00071681

W. Blume, R. Eigenmann, K. Faigin, J. Grout, J. Hoeflinger et al., Parallel programming with Polaris, IEEE Computer, vol.29, issue.12, pp.78-82, 1996.

J. Berstel, Transductions and Context-Free Languages, Teubner, 1979.

G. Berry, The Foundations of Esterel, 2000.

C. Bastoul and P. Feautrier, Improving Data Locality by Chunking, CC Intl. Conf. on Compiler Construction, number 2622 in LNCS, pp.320-335, 2003.
DOI : 10.1007/3-540-36579-6_23

URL : https://hal.archives-ouvertes.fr/inria-00001055

C. Bastoul and P. Feautrier, More Legal Transformations for Locality, Euro-Par'04, number 3149 in LNCS, pp.272-283, 2004.
DOI : 10.1007/978-3-540-27866-5_36

URL : https://hal.archives-ouvertes.fr/inria-00001056

C. Bastoul and P. Feautrier, Adjusting a program transformation for legality, Parallel Processing Letters, pp.3-17, 2005.

B. Franke, M. O'Boyle, J. Thomson, and G. Fursin, Probabilistic source-level optimisation of embedded systems software, ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'05), 2005.

A. J. Bik, M. Girkar, P. M. Grey, and X. Tian, Automatic intra-register vectorization for the Intel architecture, International Journal of Parallel Programming, vol.30, issue.2, pp.65-98, 2002.
DOI : 10.1023/A:1014230429447

J. Buck, S. Ha, E. A. Lee, and D. G. Messerschmitt, Ptolemy: A Framework for Simulating and Prototyping Heterogeneous Systems, Int. J. in Computer Simulation, vol.4, issue.2, pp.155-182, 1994.
DOI : 10.1016/B978-155860702-6/50048-X

F. Bodin, T. Kisuki, P. M. Knijnenburg, M. F. O'Boyle, E. Rohou et al., Iterative compilation in a non-linear optimisation space, Proc. Workshop on Profile and Feedback Directed Compilation, 1998.

J.-Y. Brunel, W. M. Kruijtzer, H. J. Kenter, F. Pétrot, L. Pasquier et al., COSY communication IP's, Proceedings of the 37th conference on Design automation, DAC '00, pp.406-409, 2000.
DOI : 10.1145/337292.337515

A. Benveniste, P. Le Guernic, and C. Jacquemot, Synchronous programming with events and relations: the SIGNAL language and its semantics, Science of Computer Programming, vol.16, issue.2, pp.103-149, 1991.
DOI : 10.1016/0167-6423(91)90001-E

F. Bourdoncle, Abstract interpretation by dynamic partitioning, J. of Functional Programming, vol.2, issue.4, pp.407-423, 1992.

F. Bodin, E. Rohou, and A. Seznec, Salto: System for assembly-language transformation and optimization, Workshop on Compilers for Parallel Computers (CPC'96), 1996.
URL : https://hal.archives-ouvertes.fr/inria-00073718

R. Bagnara, E. Ricci, and E. Zaffanella, Precise widening operators for convex polyhedra, Int. Symp. on Static Analysis (SAS'03), 2003.

P. Caspi, Embedded control: From asynchrony to synchrony and back, EMSOFT'01, 2001.

P. Cousot and R. Cousot, Abstract interpretation, Proceedings of the 4th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, POPL '77, pp.238-252, 1977.
DOI : 10.1145/512950.512973

URL : https://hal.archives-ouvertes.fr/inria-00528590

A. Cohen and J.-F. Collard, Instancewise reaching definition analysis for recursive programs using context-free transductions, Parallel Architectures and Compilation Techniques (PACT'98), pp.332-340, 1998.
URL : https://hal.archives-ouvertes.fr/hal-01257320

A. Cohen, J.-F. Collard, and M. Griebl, Data-flow analysis of recursive structures, Proc. of the 6th Workshop on Compilers for Parallel Computers (CPC'96), pp.181-192, 1996.
URL : https://hal.archives-ouvertes.fr/hal-01257322

P. Carribault, A. Cohen, and W. Jalby, Deep jam: conversion of coarse-grain parallelism to instruction-level and vector parallelism for irregular applications, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05), pp.291-300, 2005.
DOI : 10.1109/PACT.2005.16

URL : https://hal.archives-ouvertes.fr/hal-01257293

Z. Chamski, M. Duranton, A. Cohen, C. Eisenbeis, P. Feautrier, and D. Genius, Ambient Intelligence: Impact on Embedded-System Design, chapter Application Domain-Driven System Design for Pervasive Video Processing, pp.251-270, 2003.

A. Cohen, M. Duranton, C. Eisenbeis, C. Pagetti, F. Plateau et al., Synchronization of periodic clocks, Proceedings of the 5th ACM international conference on Embedded software, EMSOFT '05, pp.339-342, 2005.
DOI : 10.1145/1086228.1086289

URL : https://hal.archives-ouvertes.fr/hal-01257295

A. Cohen, M. Duranton, C. Eisenbeis, C. Pagetti, F. Plateau et al., N-synchronous Kahn networks, 33rd ACM Symp. on Principles of Programming Languages (PoPL'06), pp.180-193, 2006.
DOI : 10.1145/1111037.1111054

A. Cohen, S. Donadio, M. Garzaran, C. Herrmann, O. Kiselyov et al., In search of a program generator to implement generic transformations for high-performance computing, Science of Computer Programming, vol.62, issue.1, pp.25-46, 2004.
DOI : 10.1016/j.scico.2005.10.013

URL : https://hal.archives-ouvertes.fr/hal-01257287

S. Carr, C. Ding, and P. Sweany, Improving software pipelining with unroll-and-jam, Proceedings of HICSS-29: 29th Hawaii International Conference on System Sciences, 1996.
DOI : 10.1109/HICSS.1996.495462

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.9319

R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck, Efficiently computing static single assignment form and the control dependence graph, ACM Trans. on Programming Languages and Systems, vol.13, issue.4, pp.451-490, 1991.

J. Colaço, A. Girault, G. Hamon, and M. Pouzet, Towards a higher-order synchronous data-flow language, Proceedings of the fourth ACM international conference on Embedded software , EMSOFT '04, 2004.
DOI : 10.1145/1017753.1017792

A. Cohen, S. Girbal, D. Parello, M. Sigler, O. Temam et al., Facilitating the search for compositions of program transformations, Proceedings of the 19th annual international conference on Supercomputing , ICS '05, pp.151-160, 2005.
DOI : 10.1145/1088149.1088169

URL : https://hal.archives-ouvertes.fr/hal-01257296

A. Cohen, S. Girbal, and O. Temam, A Polyhedral Approach to Ease the Composition of Program Transformations, Euro-Par'04, number 3149 in LNCS, pp.292-303, 2004.
DOI : 10.1007/978-3-540-27866-5_38

URL : https://hal.archives-ouvertes.fr/hal-01257301

P. Cousot and N. Halbwachs, Automatic discovery of linear restraints among variables of a program, Proceedings of the 5th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, POPL '78, pp.84-96, 1978.
DOI : 10.1145/512760.512770

D. Chapiro, Globally-Asynchronous Locally-Synchronous Systems, 1984.

K. D. Cooper, M. W. Hall, R. T. Hood, K. Kennedy, K. S. McKinley et al., The ParaScope parallel programming environment, Proceedings of the IEEE, pp.244-263, 1993.

A. Chauhan and K. Kennedy, Optimizing strategies for telescoping languages, Proceedings of the 15th international conference on Supercomputing, ICS '01, pp.92-101, 2001.
DOI : 10.1145/377792.377812

T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to Algorithms, 1989.

M. H. Cintra, J. F. Martínez, and J. Torrellas, Architectural support for scalable speculative parallelization in shared-memory multiprocessors, ACM/IEEE Intl. Symp. on Computer Architecture (ISCA'00), pp.13-24, 2000.

A. Cohen, Program Analysis and Transformation: from the Polytope Model to Formal Languages, 1999.

J.-F. Collard, Automatic parallelization of while-loops using speculative execution, Intl. J. of Parallel Programming, vol.23, issue.2, pp.191-219, 1995.

J.-F. Collard, Reasoning About Program Transformations, 2002.

P. Cousot, Semantic foundations of programs analysis, 1981.

P. Cousot, Program analysis: The abstract interpretation perspective, ACM Computing Surveys, issue.4es, p.28, 1996.

P. Caspi and M. Pouzet, Synchronous Kahn networks, ICFP '96: Proceedings of the 1st ACM SIGPLAN Intl. Conf. on Functional programming, pp.226-238, 1996.
DOI : 10.1145/232627.232651

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.15.9168

J.-L. Colaço and M. Pouzet, Clocks as First Class Abstract Types, EMSOFT'03, pp.134-155, 2003.
DOI : 10.1007/978-3-540-45212-6_10

B. Creusillet, Array Region Analyses and Applications, 1996.

K. Cooper, P. Schielke, and D. Subramanian, Optimizing for reduced code space using genetic algorithms, Proc. Languages, Compilers, and Tools for Embedded Systems (LCTES), pp.1-9, 1999.

C. Calcagno, W. Taha, L. Huang, and X. Leroy, Implementing Multi-stage Languages Using ASTs, Gensym, and Reflection, ACM SIGPLAN/SIGSOFT Intl. Conf. Generative Programming and Component Engineering (GPCE'03), pp.57-76, 2003.
DOI : 10.1007/978-3-540-39815-8_4

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.10.6148

J. Crop and D. Wilde, Scheduling Structured Systems, EuroPar'99, pp.409-412, 1999.
DOI : 10.1007/3-540-48311-X_53

C. Coarfa, F. Zhao, N. Tallent, J. Mellor-Crummey, and Y. Dotsenko, Open-source compiler technology for source-to-source optimization

S. Donadio, J. Brodman, T. Roeder, K. Yotov, D. Barthou et al., A Language for the Compact Representation of Multiple Program Versions, Languages and Compilers for Parallel Computing (LCPC'05), 2005.
DOI : 10.1007/978-3-540-69330-7_10

URL : https://hal.archives-ouvertes.fr/hal-00141067

A. Deutsch, Interprocedural may-alias analysis for pointers: beyond k-limiting, ACM Symp. on Programming Language Design and Implementation (PLDI'94), pp.230-241, 1994.

A. Darte and G. Huard, Loop shifting for loop parallelization, International Journal of Parallel Programming, vol.28, issue.5, pp.499-534, 2000.
DOI : 10.1023/A:1007506711786

J. Dean, J. E. Hicks, C. A. Waldspurger, W. E. Weihl, and G. Z. Chrysos, ProfileMe: Hardware support for instruction level profiling on out-of-order processors, Proceedings of the 30th International Symposium on Microarchitecture, NC, 1997.

E. A. de Kock, G. Essink, W. J. Smits, P. van der Wolf, J.-Y. Brunel et al., YAPI, Proceedings of the 37th conference on Design automation, DAC '00, 2000.
DOI : 10.1145/337292.337511

L. De Rose and D. Padua, Techniques for the translation of MATLAB programs into Fortran 90, ACM Transactions on Programming Languages and Systems, vol.21, issue.2, pp.286-323, 1999.
DOI : 10.1145/316686.316693

A. Darte and Y. Robert, Mapping uniform loop nests onto distributed memory architectures, Parallel Computing, vol.20, issue.5, pp.679-710, 1994.
DOI : 10.1016/0167-8191(94)90001-9

URL : https://hal.archives-ouvertes.fr/hal-00857077

A. Darte, Y. Robert, and F. Vivien, Scheduling and Automatic Parallelization, Birkhäuser, 2000.
DOI : 10.1007/978-1-4612-1362-8

URL : https://hal.archives-ouvertes.fr/hal-00856645

A. Darte, G. Silber, and F. Vivien, Combining Retiming and Scheduling Techniques for Loop Parallelization and Loop Tiling, Parallel Processing Letters, vol.07, issue.04, pp.379-392, 1997.
DOI : 10.1142/S0129626497000383

URL : https://hal.archives-ouvertes.fr/hal-00856890

E. A. Lee and D. G. Messerschmitt, Static scheduling of synchronous data flow programs for digital signal processing, IEEE Trans. Computers, vol.36, issue.1, pp.24-35, 1987.

D. B. Epstein, J. W. Cannon, D. F. Holt, S. V. Levy, M. S. Paterson et al., Word Processing in Groups, 1992.

J. Esparza and J. Knoop, An Automata-Theoretic Approach to Interprocedural Data-Flow Analysis, FOSSACS'99, 1999.
DOI : 10.1007/3-540-49019-1_2

C. Elgot and J. Mezei, On Relations Defined by Generalized Finite Automata, IBM Journal of Research and Development, vol.9, issue.1, pp.47-68, 1965.
DOI : 10.1147/rd.91.0047

J. Esparza and A. Podelski, Efficient algorithms for pre* and post* on interprocedural parallel flow graphs, Proceedings of the 27th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '00, pp.1-11, 2000.
DOI : 10.1145/325694.325697

A. E. Eichenberger, P. Wu, and K. O'Brien, Vectorization for SIMD architectures with alignment constraints, ACM Symp. on Programming Language Design and Implementation (PLDI '04), pp.82-93, 2004.

G. Fursin, A. Cohen, M. O'Boyle, and O. Temam, A Practical Method for Quickly Evaluating Program Optimizations, Intl. Conf. on High Performance Embedded Architectures and Compilers (HiPEAC'05), number 3793 in LNCS, pp.29-46, 2005.
DOI : 10.1007/11587514_4

URL : https://hal.archives-ouvertes.fr/inria-00001054

P. Feautrier, Array expansion, ACM Intl. Conf. on Supercomputing, pp.429-441, 1988.

P. Feautrier, Parametric integer programming, RAIRO Recherche Opérationnelle, vol.22, pp.243-268, 1988.

P. Feautrier, Dataflow analysis of scalar and array references, Intl. J. of Parallel Programming, vol.20, issue.1, pp.23-53, 1991.

P. Feautrier, Some efficient solutions to the affine scheduling problem, part II, multidimensional time, Intl. J. of Parallel Programming, vol.21, issue.6, pp.389-420, 1992. See also Part I, one dimensional time, vol.21, issue.5, p.315.

P. Feautrier, A parallelization framework for recursive tree programs, EuroPar'98, 1998.

P. Feautrier, Scalable and structured scheduling, Intl. J. of Parallel Programming, vol.28, 2006.

P. Feautrier, M. Griebl, and C. Lengauer, On index set splitting, Parallel Architectures and Compilation Techniques (PACT'99), 1999.

M. Frigo and S. Johnson, FFTW: an adaptive software architecture for the FFT, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181), pp.1381-1384, 1998.
DOI : 10.1109/ICASSP.1998.681704

P. Fradet and D. Le Métayer, Shape types, Proceedings of the 24th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '97, pp.27-39, 1997.
DOI : 10.1145/263699.263706

G. Fursin, M. O'Boyle, and P. Knijnenburg, Evaluating Iterative Compilation, 11th Languages and Compilers for Parallel Computing, 2002.
DOI : 10.1007/11596110_24

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.4652

M. Griebl and J.-F. Collard, Generation of synchronous code for automatic parallelization of while loops, EuroPar'95, pp.315-326, 1995.
DOI : 10.1007/BFb0020474

R. Ghiya and L. Hendren, Is it a tree, a DAG, or a cyclic graph? A shape analysis for heap-directed pointers in C, Proceedings of the 23rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '96, pp.1-15, 1996.
DOI : 10.1145/237721.237724

S. Girbal, G. Mouchard, A. Cohen, and O. Temam, DiST: A simple, reliable and scalable method to significantly reduce processor architecture simulation time, Intl. Conf. on Measurement and Modeling of Computer Systems, ACM SIGMETRICS'03, 2003.
URL : https://hal.archives-ouvertes.fr/hal-01257307

J.-L. Giavitto, O. Michel, and J.-P. Sansonnet, Group-based fields, Proc. of the Parallel Symbolic Languages and Systems, 1995. See also "Design and Implementation of 8 1/2, a Declarative Data-Parallel Language", RR 1012.

K. Goossens, O. P. Gangwal, J. Röver, and A. P. Niranjan, Interconnect and memory organization in SOCs for advanced set-top boxes and TV: evolution, analysis, and trends, Interconnect-Centric Design for Advanced SoC and NoC, chapter 15, pp.399-423, 2004.

A.-C. Guillou, F. Quilleré, P. Quinton, S. Rajopadhye, and T. Risset, Hardware design methodology with the Alpha language, FDL'01, 2001.

M. P. Gerlek, E. Stoltz, and M. J. Wolfe, Beyond induction variables: detecting and classifying sequences using a demand-driven SSA form, ACM Transactions on Programming Languages and Systems, vol.17, issue.1, pp.85-122, 1995.
DOI : 10.1145/200994.201003

S. Girbal, N. Vasilache, C. Bastoul, A. Cohen, D. Parello et al., Semi-automatic composition of loop transformations for deep parallelism and memory hierarchies, Intl. J. of Parallel Programming, vol.34, issue.3, 2006.

W. Harrison, The interprocedural analysis and automatic parallelisation of Scheme programs, Lisp and Symbolic Computation, pp.176-396, 1989.

K. Heydemann, F. Bodin, P. M. Knijnenburg, and L. Morin, UFC: a global trade-off strategy for loop unrolling for VLIW architectures, Proc. Compilers for Parallel Computers (CPC), pp.59-70, 2003.

N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud, The synchronous data flow programming language LUSTRE, Proceedings of the IEEE, vol.79, issue.9, pp.1305-1320, 1991.
DOI : 10.1109/5.97300

L. J. Hendren, J. Hummel, and A. Nicolau, Abstractions for recursive pointer data structures: improving the analysis and transformation of imperative programs, ACM Symp. on Programming Language Design and Implementation (PLDI'92), pp.249-260, 1992.

T. Harris, S. Marlow, S. Peyton Jones, and M. Herlihy, Composable memory transactions, ACM Symp. on Principles and Practice of Parallel Programming (PPoPP'05), 2005.

F. Irigoin, P. Jouvelot, and R. Triolet, Semantical interprocedural parallelization: An overview of the PIPS project, ACM Intl. Conf. on Supercomputing (ICS'91), 1991.
URL : https://hal.archives-ouvertes.fr/hal-00984684

G. Kahn, The semantics of a simple language for parallel programming, Information processing, pp.471-475, 1974.

W. Kelly, Optimization within a unified transformation framework, 1996.

T. Kisuki, P. Knijnenburg, K. Gallivan, and M. O'Boyle, The effect of cache models on iterative compilation for combined tiling and unrolling, Parallel Architectures and Compilation Techniques (PACT'00), 2001.

T. Kisuki, P. Knijnenburg, M. O'Boyle, and H. Wijshoff, Iterative compilation in program optimization, Proc. CPC'10 (Compilers for Parallel Computers), pp.35-44, 2000.

E. Koutsofios and S. North, Drawing Graphs With dot, 2002.

W. Kelly, W. Pugh, and E. Rosser, Code generation for multiple mappings, Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation, 1995.
DOI : 10.1109/FMPC.1995.380437

N. Klarlund and M. Schwartzbach, Graph types, ACM Symp. on Principles of Programming Languages (PoPL'93), pp.196-205, 1993.
DOI : 10.7146/dpb.v21i421.7952

K. Knobe and V. Sarkar, Array SSA form and its use in parallelization, Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '98, pp.107-120, 1998.
DOI : 10.1145/268946.268956

V. Lefebvre and P. Feautrier, Automatic storage management for parallel programs, Parallel Computing, vol.24, issue.3-4, pp.649-671, 1998.
DOI : 10.1016/S0167-8191(98)00029-5

S. Long and G. Fursin, A heuristic search algorithm based on unified transformation framework, 7th Intl. Workshop on High Performance Scientific and Engineering Computing (HPSEC-05), 2005.

X. Li, M. Garzaran, and D. Padua, A dynamically tuned sorting library, ACM Conf. on Code Generation and Optimization (CGO'04), 2004.

A. Lim and M. Lam, Communication-free parallelization via affine transformations, 24th ACM Symp. on Principles of Programming Languages, pp.201-214, 1997.
DOI : 10.1007/BFb0025873

S.-M. Liu, R. Lo, and F. Chow, Loop induction variable canonicalization in parallelizing compilers, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques (PACT '96), p.228, 1996.

A. Lim, S. Liao, and M. S. Lam, Blocking and array contraction across arbitrarily nested loops using affine partitioning, ACM Symp. on Principles and Practice of Parallel Programming (PPoPP'01), pp.102-112, 2001.

S. Long and M. O'Boyle, Adaptive Java optimisation using instance-based learning, Proceedings of the 18th annual international conference on Supercomputing, ICS '04, pp.237-246, 2004.
DOI : 10.1145/1006209.1006243

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.5381

W. Li and K. Pingali, A singular loop transformation framework based on non-singular matrices, International Journal of Parallel Programming, vol.16, issue.4, pp.183-205, 1994.
DOI : 10.1007/BF02577874

C. Leiserson and J. Saxe, Retiming synchronous circuitry, Algorithmica, vol.9, issue.1, 1991.
DOI : 10.1007/BF01759032

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.368.3222

J. Lau, S. Schoenmackers, and B. Calder, Transition Phase Classification and Prediction, 11th International Symposium on High-Performance Computer Architecture, 2005.
DOI : 10.1109/HPCA.2005.39

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.4905

W. Liu, J. Tuck, L. Ceze, W. Ahn, K. Strauss et al., POSH, Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming , PPoPP '06, pp.158-167, 2006.
DOI : 10.1145/1122971.1122997

P. Le Guernic, J.-P. Talpin, and J.-C. Le Lann, POLYCHRONY for System Design, Journal of Circuits, Systems and Computers, Special Issue on Application Specific Hardware Design, 2003.
DOI : 10.1142/S0218126603000763

URL : https://hal.archives-ouvertes.fr/hal-00730480

V. Loechner and D. Wilde, Parameterized polyhedra and their vertices, Intl. J. of Parallel Programming, vol.25, issue.6, 1997.
URL : https://hal.archives-ouvertes.fr/inria-00534851

D. Maydan, S. Amarasinghe, and M. S. Lam, Array dataflow analysis and its use in array privatization, 20th ACM Symp. on Principles of Programming Languages, pp.2-15, 1993.

Z. Manna, Mathematical Theory of Computation, 1974.

A. Monsifrot, F. Bodin, and R. Quiniou, A Machine Learning Approach to Automatic Production of Compiler Heuristics, Proc. AIMSA, number 2443 in LNCS, pp.41-50, 2002.
DOI : 10.1007/3-540-46148-5_5

A. Moonen, M. Bekooij, and J. van Meerbergen, Timing analysis model for network based multiprocessor systems, Proc. of ProRISC, 15th annual Workshop of Circuits, System and Signal Processing, pp.91-99, 2004.

A. McDonald, J. Chung, B. Carlstrom, C. Cao Minh, H. Chafi et al., Architectural semantics for practical transactional memory, ACM/IEEE Intl. Symp. on Computer Architecture (ISCA'06), 2006.

S. Muchnick, Advanced Compiler Design & Implementation, 1997.

D. Naishlos, Autovectorization in GCC, Proceedings of the 2004 GCC Developers Summit, pp.105-118, 2004.

F. Nielson, H. Nielson, and C. Hankin, Principles of Program Analysis, 1999.
DOI : 10.1007/978-3-662-03811-6

M. O'Boyle, MARS: a distributed memory approach to shared memory compilation, Proc. Language, Compilers and Runtime Systems for Scalable Computing, 1998.

J. T. Oplinger, D. L. Heine, and M. S. Lam, In search of speculative thread-level parallelism, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425), 1999.
DOI : 10.1109/PACT.1999.807576

C. Okasaki, Functional data structures, Advanced Functional Programming, pp.131-158, 1996.

M. O'Boyle, P. Knijnenburg, and G. Fursin, Feedback assisted iterative compilation, Proc. LCR, 2000.

R. Parikh, On context-free languages, J. of the ACM, vol.13, issue.4, pp.570-581, 1966.

L. Pouchet, C. Bastoul, A. Cohen, and N. Vasilache, Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time, International Symposium on Code Generation and Optimization (CGO'07), 2007.
DOI : 10.1109/CGO.2007.21

URL : https://hal.archives-ouvertes.fr/hal-01257281

S. Pop, A. Cohen, C. Bastoul, S. Girbal, G. Silber et al., Graphite: Loop optimizations based on the polyhedral model for GCC, Proc. of the 4th GCC Developers' Summit, 2006.

S. Pop, A. Cohen, and G. Silber, Induction Variable Analysis with Delayed Abstractions, Intl. Conf. on High Performance Embedded Architectures and Compilers (HiPEAC'05), number 3793 in LNCS, pp.218-232, 2005.
DOI : 10.1007/11587514_15

URL : https://hal.archives-ouvertes.fr/hal-01257294

G.-R. Perrin and A. Darte, The Data Parallel Programming Model, number 1132 in LNCS, 1996.
DOI : 10.1007/3-540-61736-1

B. Pierce, Types and Programming Languages, 2002.

A. Phansalkar, A. Joshi, L. Eeckhout, and L. John, Four generations of SPEC CPU benchmarks: what has changed and what has not, 2004.

S. Pop, The SSA Representation Framework: Semantics, Analyses and GCC Implementation, 2006.

F. Pottier, Simplifying subtyping constraints, ACM Intl. Conf. on Functional Programming (ICFP'96), pp.122-133, 1996.

M. Pelletier and J. Sakarovitch, On the representation of finite deterministic 2-tape automata, Theoretical Computer Science, vol.225, issue.1-2, pp.1-63, 1999.
DOI : 10.1016/S0304-3975(98)00179-0

M. Püschel, B. Singer, J. Xiong, J. Moura, J. Johnson et al., Spiral: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms, Journal of High Performance Computing and Applications, special issue on Automatic Performance Tuning, pp.21-45, 2004.
DOI : 10.1177/1094342004041291

D. Parello, O. Temam, A. Cohen, and J.-M. Verdun, Towards a Systematic, Pragmatic and Architecture-Aware Program Optimization Process for Complex Processors, Proceedings of the ACM/IEEE SC2004 Conference, 2004.
DOI : 10.1109/SC.2004.61

URL : https://hal.archives-ouvertes.fr/hal-01257302

D. Parello, O. Temam, and J.-M. Verdun, On Increasing Architecture Awareness in Program Optimizations to Bridge the Gap between Peak and Sustained Processor Performance - Matrix-Multiply Revisited, ACM/IEEE SC 2002 Conference (SC'02), 2002.
DOI : 10.1109/SC.2002.10054

W. Pugh, The Omega test: a fast and practical integer programming algorithm for dependence analysis, ACM/IEEE Conf. on Supercomputing, pp.4-13, 1991.

W. Pugh, The Omega test: a fast and practical integer programming algorithm for dependence analysis, Proceedings of the 1991 ACM/IEEE conference on Supercomputing , Supercomputing '91, pp.4-13, 1991.
DOI : 10.1145/125826.125848

W. Pugh, Uniform techniques for loop optimization, ACM Intl. Conf. on Supercomputing (ICS'91), pp.341-352, 1991.

W. Pugh, A practical algorithm for exact array dependence analysis, Communications of the ACM, vol.35, issue.8, pp.27-47, 1992.

F. Quilleré and S. Rajopadhye, Optimizing memory usage in the polyhedral model, Institut de Recherche en Informatique et Systèmes Aléatoires, 1999.

F. Quilleré, S. Rajopadhye, and D. Wilde, Generation of efficient nested loops from polyhedra, International Journal of Parallel Programming, vol.28, issue.5, pp.469-498, 2000.
DOI : 10.1023/A:1007554627716

T. Reps, S. Horwitz, and M. Sagiv, Precise interprocedural dataflow analysis via graph reachability, Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '95, 1995.
DOI : 10.1145/199448.199462

L. Rauchwerger and D. Padua, The LRPD test: speculative run-time parallelization of loops with privatization and reduction parallelization, IEEE Transactions on Parallel and Distributed Systems, vol.10, issue.2, pp.160-180, 1999.
DOI : 10.1109/71.752782

S. Rus, L. Rauchwerger, and J. Hoeflinger, Hybrid analysis, Proceedings of the 16th international conference on Supercomputing, ICS '02, pp.251-283, 2003.
DOI : 10.1145/514191.514229

S. Rus, D. Zhang, and L. Rauchwerger, The value evolution graph and its use in memory reference analysis, Proceedings of the 13th International Conference on Parallel Architecture and Compilation Techniques (PACT 2004), 2004.
DOI : 10.1109/PACT.2004.1342558

M. Stephenson and S. Amarasinghe, Predicting Unroll Factors Using Supervised Classification, International Symposium on Code Generation and Optimization, 2005.
DOI : 10.1109/CGO.2005.29

M. Stephenson, S. P. Amarasinghe, M. C. Martin, and U.-M. O'Reilly, Meta optimization: improving compiler heuristics with machine learning, ACM Symp. on Programming Language Design and Implementation (PLDI'03), pp.77-90, 2003.

R. Schreiber, S. Aditya, B. Rau, V. Kathail et al., High-level synthesis of nonprogrammable hardware accelerators, 2000.

M. M. Strout, L. Carter, J. Ferrante, and B. Simon, Schedule-independent storage mapping for loops, ACM Symp. on Architectural Support for Programming Languages and Operating Systems (ASPLOS'98), 1998.

M. Smith, Overcoming the challenges to feedback-directed optimization, ACM SIGPLAN Workshop on Dynamic and Adaptive Compilation and Optimization, pp.1-11, 2000.

M. Sharir and A. Pnueli, Two approaches to interprocedural data flow analysis, in Program Flow Analysis: Theory and Applications, 1981.

T. Sherwood, E. Perelman, G. Hamerly, and B. Calder, Automatically characterizing large scale program behavior, 10th International Conference on Architectural Support for Programming Languages and Operating Systems, 2002.
DOI : 10.1145/605397.605403

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.118.6150

M. Sagiv, T. Reps, and R. Wilhelm, Parametric shape analysis via 3-valued logic, ACM Symp. on Principles of Programming Languages (PoPL'99), pp.105-118, 1999.
DOI : 10.1145/514188.514190

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.3161

R. Triolet, P. Feautrier, and P. Jouvelot, Automatic parallelization of Fortran programs in the presence of procedure calls, Proc. of the 1st European Symp. on Programming (ESOP'86), number 213 in LNCS, pp.210-222, 1986.
DOI : 10.1007/3-540-16442-1_16

W. Thies, M. Karczmarek, and S. Amarasinghe, StreamIt: A Language for Streaming Applications, Intl. Conf. on Compiler Construction, 2002.
DOI : 10.1007/3-540-45937-5_14

P. Tu and D. Padua, Automatic array privatization, 6th Languages and Compilers for Parallel Computing, number 768 in LNCS, pp.500-521, 1993.
DOI : 10.1007/3-540-45403-9_8

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.3.5746

J.-P. Tremblay and P. Sorenson, The theory and practice of compiler writing, 1985.

S. Triantafyllis, M. Vachharajani, N. Vachharajani, and D. I. August, Compiler optimization-space exploration, International Symposium on Code Generation and Optimization (CGO 2003), 2003.
DOI : 10.1109/CGO.2003.1191546

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.131.1622

W. Thies, F. Vivien, J. Sheldon, and S. Amarasinghe, A unified framework for schedule and storage optimization, ACM Symp. on Programming Language Design and Implementation (PLDI'01), pp.232-242, 2001.
URL : https://hal.archives-ouvertes.fr/hal-00808285

X. Vera, J. Abella, A. González, and J. Llosa, Optimizing program locality through CMEs and GAs, Intl. Conf. on Parallel Architectures and Compilation Techniques (PACT'03), pp.68-78, 2003.
DOI : 10.1109/PACT.2003.1238003

N. Vasilache, C. Bastoul, and A. Cohen, Polyhedral Code Generation in the Real World, Proceedings of the International Conference on Compiler Construction (ETAPS CC'06), pp.185-201, 2006.
DOI : 10.1007/11688839_16

URL : https://hal.archives-ouvertes.fr/inria-00001106

S. Verdoolaege, M. Bruynooghe, G. Janssens, and F. Catthoor, Multi-dimensional incremental loop fusion for data locality, ASAP, pp.17-27, 2003.

N. Vasilache, A. Cohen, C. Bastoul, and S. Girbal, Violated dependence analysis, Proceedings of the 20th annual international conference on Supercomputing, ICS '06, 2006.
DOI : 10.1145/1183401.1183448

URL : https://hal.archives-ouvertes.fr/hal-01257290

R. A. van Engelen, Efficient Symbolic Analysis for Optimizing Compilers, Proceedings of the International Conference on Compiler Construction (ETAPS CC'01), pp.118-132, 2001.
DOI : 10.1007/3-540-45306-7_9

T. Veldhuizen and D. Gannon, Active libraries: Rethinking the roles of compilers and libraries, SIAM Workshop on Object Oriented Methods for Inter-operable Scientific and Engineering Computing, 1998.

E. Visser, Stratego: A language for program transformation based on rewriting strategies. System description of Stratego 0.5, Rewriting Techniques and Applications (RTA'01), volume 2051 of Lecture Notes in Computer Science, pp.357-361, 2001.

J. Vuillemin, On circuits and numbers, IEEE Trans. on Computers, vol.43, issue.8, pp.868-879, 1994.

P. Wu, A. Cohen, D. Padua, and J. Hoeflinger, Monotonic evolution, Proceedings of the 15th international conference on Supercomputing, ICS '01, 2001.
DOI : 10.1145/377792.377809

URL : https://hal.archives-ouvertes.fr/hal-01257312

M. Wolf, Improving Locality and Parallelism in Nested Loops, 1992.

M. Wolfe, High Performance Compilers for Parallel Computing, 1996.

D. Wonnacott, Constraint-Based Array Dependence Analysis, 1995.

R. C. Whaley, A. Petitet, and J. J. Dongarra, Automated empirical optimizations of software and the ATLAS project, Parallel Computing, 2000.

J. Xue, Automating non-unimodular loop transformations for massive parallelism, Parallel Computing, vol.20, issue.5, pp.711-728, 1994.

K. Yotov, X. Li, G. Ren, M. Cibulskis, G. Dejong et al., A comparison of empirical and model-driven optimization, ACM Symp. on Programming Language Design and Implementation (PLDI'03), 2003.