M. Barreteau, F. Bodin, P. Brinkhaus, Z. Chamski, H. Charles et al., Oceans: Optimizing compilers for embedded applications, Euro-Par Conference, Lect. Notes in Computer Science, p.74, 1997.

A. Fog, Optimization manuals

T. Alexander, Performance Prediction for Loop Restructuring Optimization, 1993.

C. Alias, Program Optimization by Template Recognition and Replacement, p.25, 2005.

C. Alias, TeMa: an efficient tool to find high-performance library patterns in source code, Workshop on Patterns in High Performance Computing, p.31, 2005.

C. Alias and D. Barthou, Algorithm recognition based on demand-driven data-flow analysis, IEEE Working Conf. on Reverse Engineering, pp.296-305, 2003.

C. Alias and D. Barthou, Deciding Where to Call Performance Libraries, Euro-Par Conference, pp.336-345, 2005.
DOI : 10.1007/11549468_39
URL : https://hal.archives-ouvertes.fr/hal-00141074

C. Alias and D. Barthou, On Domain Specific Languages Re-Engineering, In ACM Int. Conf. on Generative Programming and Component Engineering, Lect. Notes in Computer Science, vol.3676, issue.22, pp.63-77, 2005.

V. H. Allan, R. B. Jones, R. M. Lee, and S. J. Allan, Software pipelining, ACM Computing Surveys, vol.27, issue.3, pp.367-432, 1995.
DOI : 10.1145/212094.212131

K. Asanovic, R. Bodik, B. C. Catanzaro, J. J. Gebis, P. Husbands et al., The Landscape of Parallel Computing Research: A View from Berkeley, p.80, 2006.

V. Bala, E. Duesterwald, and S. Banerjia, Dynamo: a transparent dynamic optimization system, ACM SIGPLAN Notices, vol.35, issue.5, pp.1-12, 2000.
DOI : 10.1145/358438.349303

D. Barthou, Analyse du flot des données pour tableaux en présence de contraintes non affines, p.80, 1998.

D. Barthou, C. Bastoul, F. Bodin, A. Cohen, L. Djoudi et al., Rapport d'analyse de performances et restructuration d'algorithmes de la chromodynamique quantique sur réseau (LQCD), p.47, 2007.

D. Barthou and P. Carribault, Optimization report for ci-1 and ci-2 codes, 1946.

D. Barthou, A. Cohen, and J. Collard, Maximal static expansion, Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages , POPL '98, pp.98-106, 1998.
DOI : 10.1145/268946.268955
URL : https://hal.archives-ouvertes.fr/hal-01257319

D. Barthou, S. Donadio, A. Duchateau, P. Carribault, and W. Jalby, Loop optimization using adaptive compilation and kernel decomposition, In ACM/IEEE Int. Symp. on Code Optimization and Generation, pp.170-184, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00141056

D. Barthou, P. Feautrier, and X. Redon, On the Equivalence of Two Systems of Affine Recurrence Equations, Euro-Par Conference, pp.309-313, 2002.
DOI : 10.1007/3-540-45706-2_40
URL : https://hal.archives-ouvertes.fr/inria-00072302

C. Bastoul, A. Cohen, S. Girbal, S. Sharma, and O. Temam, Putting Polyhedral Loop Transformations to Work, Workshop on Languages and Compilers for Parallel Computing (LCPC'03), pp.23-30, 2003.
DOI : 10.1007/978-3-540-24644-2_14
URL : https://hal.archives-ouvertes.fr/inria-00071681

J. Bilmes, K. Asanovic, C. Chin, and J. Demmel, Optimizing Matrix Multiply using PHiPAC: A Portable, High-Performance, ANSI C Coding Methodology, ACM Int. Conf. on Supercomputing, p.78, 1997.

F. Bodin, T. Kisuki, P. M. Knijnenburg, M. F. O'Boyle, and E. Rohou, Iterative compilation in a non-linear optimisation space, Workshop on Profile and Feed-back directed Compilation, p.78, 1998.
URL : https://hal.archives-ouvertes.fr/inria-00475919

B. Boigelot and P. Wolper, Symbolic verification with periodic sets, Int. Conf. on Computer-Aided Verification, pp.55-67, 1994.
DOI : 10.1007/3-540-58179-0_43

E. L. Boyd, G. A. Abandah, H. Lee, and E. S. Davidson, Modeling computation and communication performance of parallel scientific applications: A case study of the IBM SP2, ACM Int. Conf. on Supercomputing, p.56, 1995.

S. Browne, J. Dongarra, N. Garner, G. Ho, and P. Mucci, A Portable Programming Interface for Performance Evaluation on Modern Processors, International Journal of High Performance Computing Applications, vol.14, issue.3, pp.189-204, 2000.
DOI : 10.1177/109434200001400303

B. R. Buck and J. K. Hollingsworth, An API for Runtime Code Patching, International Journal of High Performance Computing Applications, vol.14, issue.4, pp.317-329, 2000.
DOI : 10.1177/109434200001400404

C. Caşcaval and D. A. Padua, Estimating cache misses and locality using stack distances, Proceedings of the 17th annual international conference on Supercomputing, ICS '03, pp.150-159, 2003.
DOI : 10.1145/782814.782836

B. Calder, P. Feller, and A. Eustace, Value profiling, Proceedings of 30th Annual International Symposium on Microarchitecture, pp.259-269, 1997.
DOI : 10.1109/MICRO.1997.645816

D. Callahan, J. Cocke, and K. Kennedy, Estimating interlock and improving balance for pipelined architectures, Journal of Parallel and Distributed Computing, vol.5, issue.4, pp.334-358, 1988.
DOI : 10.1016/0743-7315(88)90002-0

P. Carribault, A. Cohen, and W. Jalby, Deep jam: conversion of coarse-grain parallelism to instruction-level and vector parallelism for irregular applications, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05), p.80, 2005.
DOI : 10.1109/PACT.2005.16
URL : https://hal.archives-ouvertes.fr/hal-01257293

C. Chen, J. Chame, and M. W. Hall, Combining Models and Guided Empirical Search to Optimize for Multiple Levels of the Memory Hierarchy, International Symposium on Code Generation and Optimization, pp.111-122, 2005.
DOI : 10.1109/CGO.2005.10

A. Cimitile, A. De Lucia, and M. Munro, A Specification Driven Slicing Process for Identifying Reusable Functions, Journal of Software Maintenance: Research and Practice, vol.8, issue.3, pp.145-178, 1996.
DOI : 10.1002/(SICI)1096-908X(199605)8:3<145::AID-SMR127>3.0.CO;2-9

P. Clauss, Counting solutions to linear and nonlinear constraints through Ehrhart polynomials: Applications to analyze and transform scientific programs, ACM Int. Conf. on Supercomputing, pp.278-295, 1996.
URL : https://hal.archives-ouvertes.fr/hal-01100306

C. Click and K. D. Cooper, Combining analyses, combining optimizations, ACM Transactions on Programming Languages and Systems, vol.17, issue.2, pp.181-196, 1995.
DOI : 10.1145/201059.201061

A. Cohen, S. Girbal, and O. Temam, A Polyhedral Approach to Ease the Composition of Program Transformations, Euro-Par Conference, 2004.
DOI : 10.1007/978-3-540-27866-5_38
URL : https://hal.archives-ouvertes.fr/hal-01257301

S. Coleman and K. S. McKinley, Tile size selection using cache organization and data layout, ACM SIGPLAN Conf. on Programming Language Design and Implementation, pp.279-290, 1995.

C. Consel and O. Danvy, Tutorial notes on partial evaluation, Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages , POPL '93, pp.493-501, 1993.
DOI : 10.1145/158511.158707

C. Consel, L. Hornof, F. Noël, J. Noyé, and N. Volanschi, A uniform approach for compile-time and run-time specialization, In Partial Evaluation. International Seminar, pp.54-72, 1996.
DOI : 10.1007/3-540-61580-6_4
URL : https://hal.archives-ouvertes.fr/inria-00073917

C. Consel, L. Hornof, R. Marlet, G. Muller, S. Thibault et al., Tempo: specializing systems applications and beyond, ACM Computing Surveys, vol.30, issue.3es, p.72, 1998.
DOI : 10.1145/289121.289140

K. D. Cooper, A. Dasgupta, and K. Kennedy, Vizer: A system to vectorize Intel x86 binaries, Intl. Symp. on Computer Architecture, p.64, 2002.

K. D. Cooper, D. Subramanian, and L. Torczon, Adaptive Optimizing Compilers for the 21st Century, The Journal of Supercomputing, vol.23, issue.1, pp.7-22, 2002.
DOI : 10.1023/A:1015729001611

K. D. Cooper and T. Waterman, Investigating Adaptive Compilation Using the MIPSpro Compiler, Los Alamos Computer Science Institute Symp, p.49, 2003.
DOI : 10.1177/1094342005056142

P. Cousot and N. Halbwachs, Automatic discovery of linear restraints among variables of a program, Proceedings of the 5th ACM SIGACT-SIGPLAN symposium on Principles of programming languages , POPL '78, pp.84-97, 1978.
DOI : 10.1145/512760.512770

B. Creusillet and F. Irigoin, Exact versus approximate array region analyses, Int. Workshop on Languages and Compilers for Parallel Computing, pp.86-100, 1996.
DOI : 10.1007/BFb0017247

L. Djoudi, J. Acquaviva, and D. Barthou, Compositional Approach Applied to Loop Specialization, Euro-Par Conference, pp.268-279, 2007.
DOI : 10.1007/978-3-540-74466-5_30
URL : https://hal.archives-ouvertes.fr/hal-00575934

L. Djoudi, D. Barthou, P. Carribault, C. Lemuet, J. Acquaviva et al., Exploring application performance: a new tool for a static/dynamic approach, Los Alamos Computer Science Institute Symp, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00141071

L. Djoudi, D. Barthou, P. Carribault, C. Lemuet, J. Acquaviva et al., Exploring application performance: a new tool for a static/dynamic approach, Proceedings of the 6th LACSI Symposium, p.68, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00141071

L. Djoudi, D. Barthou, O. Tomaz, A. Charif-Rubial, J. Acquaviva et al., The design and architecture of MAQAOProfile: an instrumentation MAQAO module, Workshop on Explicitly Parallel Instruction Computing Techniques, 2007.

S. Donadio, Optimisation itérative de bibliothèque de calculs par division hiérarchique de codes, p.33, 2007.

S. Donadio, J. Brodman, T. Roeder, K. Yotov, D. Barthou et al., A Language for the Compact Representation of Multiple Program Versions, Int. Workshop on Languages and Compilers for Parallel Computing, p.35, 2005.
DOI : 10.1007/978-3-540-69330-7_10
URL : https://hal.archives-ouvertes.fr/hal-00141067

J. E. Doner, Decidability of the weak second-order theory of two successors, Notices Amer. Math. Soc., vol.12, pp.365-468, 1965.

J. E. Doner, Tree acceptors and some of their applications, Journal of Computer and System Sciences, vol.4, issue.5, pp.406-451, 1970.
DOI : 10.1016/S0022-0000(70)80041-1

G. Doshi, R. Krishnaiyer, and K. Muthukumar, Optimizing software data prefetches with rotating registers, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques, p.72, 2001.
DOI : 10.1109/PACT.2001.953306

E. Eckstein, O. König, and B. Scholz, Code Instruction Selection Based on SSA-Graphs, Int. Workshop Software and Compilers for Embedded Systems, p.26, 2003.
DOI : 10.1007/978-3-540-39920-9_5

L. Eeckhout, H. Vandierendonck, and K. De Bosschere, Quantifying the Impact of Input Data Sets on Program Behavior and its Applications, J. of Instruction-Level Parallelism, vol.5, 2003.

D. R. Engler and T. A. Proebsting, DCG: An efficient, retargetable dynamic code generation system, Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, pp.263-272, 1994.

M. A. Ertl, Optimal code selection in DAGs, Proceedings of the 26th ACM SIGPLAN-SIGACT symposium on Principles of programming languages , POPL '99, p.26, 1999.
DOI : 10.1145/292540.292562

P. Feautrier, Parametric integer programming, RAIRO - Operations Research, vol.22, issue.3, pp.243-268, 1988.
DOI : 10.1051/ro/1988220302431

P. Feautrier, Dataflow analysis of array and scalar references, International Journal of Parallel Programming, vol.20, issue.1, pp.23-53, 1991.
DOI : 10.1007/BF01407931

M. Fowler, Refactoring: Improving the Design of Existing Code, p.20, 1999.
DOI : 10.1007/3-540-45672-4_31

B. B. Fraguela, R. Doallo, and E. L. Zapata, Automatic analytical modeling for the estimation of cache misses, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425), p.49, 1999.
DOI : 10.1109/PACT.1999.807544

S. M. Freudenberger and J. C. Ruttenberg, Phase Ordering of Register Allocation and Instruction Scheduling, Code Generation - Concepts, Tools, Techniques. Proceedings of the International Workshop on Code Generation, pp.146-172, 1992.
DOI : 10.1007/978-1-4471-3501-2_9

M. Frigo, A Fast Fourier Transform Compiler, ACM SIGPLAN Conf. on Programming Language Design and Implementation, p.12, 1999.

M. Frigo and S. G. Johnson, FFTW: an adaptive software architecture for the FFT, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181), pp.1381-1384, 1998.
DOI : 10.1109/ICASSP.1998.681704

K. Fürlinger and M. Gerndt, Peridot: Towards Automated Runtime Detection of Performance Bottlenecks, High Performance Computing in Science and Engineering, pp.193-202, 2005.
DOI : 10.1007/3-540-28555-5_17

G. Fursin, A. Cohen, M. O'Boyle, and O. Temam, Quick and practical run-time evaluation of multiple program optimizations, Transactions on High-Performance Embedded Architectures and Compilers, pp.13-31, 2006.
URL : https://hal.archives-ouvertes.fr/inria-00084110

G. Gautam and S. Rajopadhye, Simplifying reductions, In ACM Symp. on Principles of Programming Languages, pp.30-41, 2006.

G. Gautam and S. Rajopadhye, The Z-polyhedral model, ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming, pp.237-248, 2007.

M. Gordon, Exploiting coarse-grained task, data, and pipeline parallelism in stream programs, ASPLOS, 2006.

K. Goto and R. van de Geijn, On reducing TLB misses in matrix multiplication, p.49, 2002.

B. Grant, M. Mock, M. Philipose, C. Chambers, and S. J. Eggers, Annotation-Directed Run-Time Specialization in C, Proceedings of the ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation (PEPM'97), pp.163-178, 1997.

B. Grant, M. Mock, M. Philipose, C. Chambers, and S. J. Eggers, DyC: an expressive annotation-directed dynamic compiler for C, Theoretical Computer Science, vol.248, issue.1-2, p.73, 2000.
DOI : 10.1016/S0304-3975(00)00051-7

M. Griebl, P. Feautrier, and C. Lengauer, On index set splitting, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425), pp.607-631, 2000.
DOI : 10.1109/PACT.1999.807572

G. Hager, T. Zeiser, J. Treibig, and G. Wellein, Optimizing performance on modern HPC systems: learning from simple kernel benchmarks, Computational Science and High Performance Computing, pp.273-287, 2006.

N. Halbwachs, P. Caspi, D. Pilaud, and J. A. Plaice, Lustre: a declarative language for programming synchronous systems, ACM Symp. on Principles of Programming Languages, pp.178-188, 1987.

J. Henning, SPEC CPU2000: measuring CPU performance in the New Millennium, Computer, vol.33, issue.7, pp.28-35, 2000.
DOI : 10.1109/2.869367

G. Huet, A unification algorithm for typed λ-calculus, Theoretical Computer Science, vol.1, issue.1, pp.27-57, 1975.
DOI : 10.1016/0304-3975(75)90011-0

R. Hundt, HP Caliper: An architecture for performance analysis tools, Workshop on Industrial Experiences with Systems Software, 2000.

R. Ierusalimschy, L. H. de Figueiredo, and W. C. Filho, Lua - an extensible extension language, Software: Practice and Experience, vol.26, issue.6, pp.635-652, 1996.
DOI : 10.1002/(sici)1097-024x(199606)26:6<635::aid-spe26>3.0.co;2-p
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.2941

J. Jaeger and D. Barthou, Rapport d'analyse de performances et restructuration pour une application de modélisation moléculaire, p.47, 2007.

R. Jain, The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, 1991.

W. Jalby, C. Lemuet, and X. Le Pasteur, WBTK: a New Set of Microbenchmarks to Explore Memory System Performance for Scientific Computing, International Journal of High Performance Computing Applications, vol.18, issue.2, pp.211-224, 2004.
DOI : 10.1177/1094342004038945

W. Kelly, V. Maslov, W. Pugh, E. Rosser, T. Shpeisman et al., The Omega Library Interface Guide, p.42, 1996.

K. Kennedy, Telescoping languages: a compiler strategy for implementation of high-level domain-specific programming systems, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000, pp.297-304, 2000.
DOI : 10.1109/IPDPS.2000.845999

K. Kennedy, N. McIntosh, and K. McKinley, Static Performance Estimation in a Parallelizing Compiler, 1992.

M. A. Khan, H. Charles, and D. Barthou, An Effective Automated Approach to Specialization of Code, Int. Workshop on Languages and Compilers for Parallel Computing, p.66, 2007.
DOI : 10.1007/978-3-540-85261-2_21

M. A. Khan, H. Charles, and D. Barthou, Improving performance through low-overhead specialization of code, Int. Workshop on Interaction between Compilers and Computer Architectures, p.66, 2007.

S. Kim and J. H. Kim, A hybrid approach for program understanding based on graph parsing and expectation-driven analysis, Applied Artificial Intelligence, vol.12, issue.6, pp.521-546, 1998.
DOI : 10.1080/088395198117659

T. Kisuki, P. Knijnenburg, M. O'Boyle, and H. Wijshoff, Iterative compilation in program optimization, CPC, pp.35-44, 2000.

I. Kodukula, N. Ahmed, and K. Pingali, Data-centric Multi-level Blocking, ACM SIGPLAN Conf. on Programming Language Design and Implementation, p.37, 1997.

I. Kodukula and K. Pingali, Transformations for imperfectly nested loops, Proceedings of the 1996 ACM/IEEE conference on Supercomputing (CDROM) , Supercomputing '96, p.37, 1996.
DOI : 10.1145/369028.369051

L. Almagor, K. D. Cooper, A. Grosul, T. J. Harvey, S. W. Reeves et al., Finding effective compilation sequences, Conf. on Languages, Compilers, and Tools for Embedded Systems, 2004.

J. R. Larus, Whole program paths, ACM SIGPLAN Conf. on Programming Language Design and Implementation, pp.259-269, 1999.

J. R. Larus and E. Schnarr, EEL: Machine-independent executable editing, ACM SIGPLAN Conf. on Programming Language Design and Implementation, p.71, 1995.

C. Lattner and V. Adve, LLVM: A compilation framework for lifelong program analysis & transformation, International Symposium on Code Generation and Optimization, 2004. CGO 2004., p.31, 2004.
DOI : 10.1109/CGO.2004.1281665

V. Lefebvre and P. Feautrier, Automatic storage management for parallel programs, Parallel Computing, vol.24, issue.3-4, pp.649-671, 1998.
DOI : 10.1016/S0167-8191(98)00029-5

M. Leone and R. K. Dybvig, Dynamo: A staged compiler architecture for dynamic program optimization, p.73, 1997.

M. Leone and P. Lee, Optimizing ML with run-time code generation, p.66, 1995.

M. Leone and P. Lee, A Declarative Approach to Run-Time Code Generation, Workshop on Compiler Support for System Software (WCSSS), p.73, 1996.

M. Leone and P. Lee, Dynamic specialization in the Fabius system, ACM Computing Surveys, vol.30, issue.3es, p.73, 1998.
DOI : 10.1145/289121.289144

X. Li, M. Garzaran, and D. Padua, A dynamically tuned sorting library, In ACM/IEEE Int. Symp. on Code Optimization and Generation, p.78, 2004.

J. Lu, H. Chen, P. Yew, and W. Hsu, Design and Implementation of a Lightweight Dynamic Optimization System, Journal of Instruction-Level Parallelism, vol.6, p.73, 2004.

H. Makholm, Specializing C - an introduction to the principles behind C-Mix, p.73, 1999.

W. Mangione-Smith, T. Shih, S. Abraham, and E. Davidson, Approaching a machine-application bound in delivered performance on scientific code, Proc. of the IEEE, pp.1166-1178, 1993.

B. Di Martino and G. Iannello, PAP Recognizer: a tool for automatic recognition of parallelizable patterns, WPC '96. 4th Workshop on Program Comprehension, pp.164-174, 1996.
DOI : 10.1109/WPC.1996.501131

Y. Matiyasevich, Elimination of quantifiers from arithmetical formulas defining recursively enumerable sets, Mathematics and Computers in Simulation, vol.67, issue.1-2, pp.125-133, 2004.
DOI : 10.1016/j.matcom.2004.05.013

D. Maydan, S. Amarasinghe, and M. Lam, Array dataflow analysis and its use in array privatization, In ACM Symp. on Principles of Programming Languages, pp.2-15, 1993.

J. Mellor-Crummey, R. Fowler, and G. Marin, HPCView: A tool for top-down analysis of node performance, Los Alamos Computer Science Institute Symp, p.77, 2001.

S. Meloan, The Java HotSpot performance engine: An in-depth look, p.66, 1999.

M. Merten and M. Thiems, An overview of the IMPACT x86 binary reoptimization framework, p.64, 1998.

R. Metzger and Z. Wen, Automatic Algorithm Recognition: A New Approach to Program Optimization, p.29, 2000.

T. Mitchell, Machine Learning, p.78, 1997.

N. Mukherjee, G. Riley, and J. Gurd, FINESSE: a prototype feedback-guided performance enhancement system, Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing, p.71, 2000.
DOI : 10.1109/EMPDP.2000.823400

F. Noël, L. Hornof, C. Consel, and J. L. Lawall, Automatic, template-based run-time specialization: implementation and experimental study, Proceedings of the 1998 International Conference on Computer Languages (Cat. No.98CB36225), p.72, 1998.
DOI : 10.1109/ICCL.1998.674164

H. Patil, R. Cohn, M. Charney, R. Kapoor, A. Sun et al., Pinpointing Representative Portions of Large Intel® Itanium® Programs with Dynamic Instrumentation, 37th International Symposium on Microarchitecture (MICRO-37'04), p.71, 2004.
DOI : 10.1109/MICRO.2004.28

S. Paul and A. Prakash, A framework for source code search using program patterns, IEEE Transactions on Software Engineering, vol.20, issue.6, pp.463-475, 1994.
DOI : 10.1109/32.295894

I. Piumarta, CCG: Dynamic code generation for C and C++, INRIA Rocquencourt, vol.66, p.73, 2003.

G. Pokam and F. Bodin, An Offline Approach for Whole-Program Paths Analysis Using Suffix Arrays, Int. Workshop on Languages and Compilers for Parallel Computing, pp.363-378, 2004.
DOI : 10.1007/11532378_26

M. Poletto, W. C. Hsieh, D. R. Engler, and M. F. Kaashoek, `C and tcc: a language and compiler for dynamic code generation, ACM Transactions on Programming Languages and Systems, vol.21, issue.2, pp.324-369, 1999.
DOI : 10.1145/316686.316697

S. Pop, GRAPHITE: Loop optimizations based on polyhedral model for GCC, GCC Developer's Summit, p.80, 2006.
URL : https://hal.archives-ouvertes.fr/hal-01257284

L. Pouchet, C. Bastoul, A. Cohen, and N. Vasilache, Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time, International Symposium on Code Generation and Optimization (CGO'07), pp.144-156, 2007.
DOI : 10.1109/CGO.2007.21
URL : https://hal.archives-ouvertes.fr/hal-01257281

W. Pugh, The Omega test: a fast and practical integer programming algorithm for dependence analysis, Proceedings of the 1991 ACM/IEEE conference on Supercomputing , Supercomputing '91, pp.4-13, 1991.
DOI : 10.1145/125826.125848

W. Pugh and D. Wonnacott, Constraint-based array dependence analysis, ACM Transactions on Programming Languages and Systems, vol.20, issue.3, pp.635-678, 1998.
DOI : 10.1145/291889.291900

M. Püschel, B. Singer, J. Xiong, J. M. Moura, J. Johnson et al., Spiral: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms, International Journal of High Performance Computing Applications, vol.18, issue.1, pp.21-45, 2004.
DOI : 10.1177/1094342004041291

J. R. Quinlan, Induction of decision trees, Machine Learning, pp.81-106, 1986.
DOI : 10.1007/BF00116251

E. Rohou, F. Bodin, A. Seznec, G. Le Fol, F. Charot et al., Salto: System for assembly language transformation and optimization, p.71, 1996.
URL : https://hal.archives-ouvertes.fr/inria-00073718

L. DeRose, T. Hoover, and J. K. Hollingsworth, The dynamic probe class library: An infrastructure for developing instrumentation for performance tools, IEEE Int. Conf. Parallel and Distributed Processing Symp, 2001.

S. Rus, L. Rauchwerger, and J. Hoeflinger, Hybrid analysis, Proceedings of the 16th international conference on Supercomputing , ICS '02, pp.274-284, 2002.
DOI : 10.1145/514191.514229

U. Schwiegelshohn, F. Gasperoni, and K. Ebcioglu, On optimal parallelization of arbitrary loops, Journal of Parallel and Distributed Computing, vol.11, issue.2, pp.130-134, 1991.
DOI : 10.1016/0743-7315(91)90118-S

A. Solar-Lezama, G. Arnold, L. Tancau, R. Bodik, V. Saraswat et al., Sketching stencils, ACM SIGPLAN Conf. on Programming Language Design and Implementation, p.79, 2007.

A. Solar-Lezama, L. Tancau, R. Bodik, S. Seshia, and V. Saraswat, Combinatorial sketching for finite programs, ACM SIGOPS Operating Systems Review, vol.40, issue.5, pp.404-415, 2006.
DOI : 10.1145/1168917.1168907

A. Srivastava and A. Eustace, ATOM: a system for building customized program analysis tools, ACM SIGPLAN Conf. on Programming Language Design and Implementation, pp.196-205, 1994.

T. L. Veldhuizen and A. Lumsdaine, Guaranteed Optimization: Proving Nullspace Properties of Compilers, In Int. Symp. on Static Analysis, Lect. Notes in Computer Science, vol.2477, issue.10, pp.263-277, 2002.
DOI : 10.1007/3-540-45789-5_20

F. P. Zaharia and M. Preda, Interactivity, Reactivity and Programmability: Advanced MPEG-4 Multimedia Applications, 2006 Digest of Technical Papers International Conference on Consumer Electronics, pp.441-442, 2006.
DOI : 10.1109/ICCE.2006.1598500
URL : https://hal.archives-ouvertes.fr/hal-00272069

J. Thatcher and J. Wright, Generalized finite automata theory with an application to a decision problem of second-order logic, Mathematical Systems Theory, vol.2, issue.1, pp.57-82, 1968.
DOI : 10.1007/BF01691346

K. B. Theobald, G. R. Gao, and L. J. Hendren, On the Limits of Program Parallelism and its Smoothability, ACM/IEEE Int. Symp. on Microarchitecture, issue.10, pp.10-19, 1992.

S. Thibault, F. Broquedis, B. Goglin, R. Namyst, and P. Wacrenier, An Efficient OpenMP Runtime System for Hierarchical Architectures, International Workshop on OpenMP (IWOMP), pp.148-159, 2007.
DOI : 10.1007/978-3-540-69303-1_19
URL : https://hal.archives-ouvertes.fr/inria-00154502

W. Thies, StreamIt: A Language for Streaming Applications, Programming Languages and Systems Symposium, 2002.
DOI : 10.1007/3-540-45937-5_14

W. Thies, F. Vivien, J. Sheldon, and S. P. Amarasinghe, A unified framework for schedule and storage optimization, ACM SIGPLAN Conf. on Programming Language Design and Implementation, pp.232-242, 2001.
URL : https://hal.archives-ouvertes.fr/hal-00808285

N. Thomas, G. Tanase, O. Tkachyshyn, J. Perdue, N. M. Amato et al., A framework for adaptive algorithm selection in STAPL, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming , PPoPP '05, pp.277-288, 2005.
DOI : 10.1145/1065944.1065981

J. Torrellas, A. Tucker, and A. Gupta, Evaluating the Performance of Cache-Affinity Scheduling in Shared-Memory Multiprocessors, J. of Parallel and Distributed Computing, pp.139-151, 1995.
DOI : 10.1006/jpdc.1995.1014

S. Triantafyllis, M. Vachharajani, and D. I. August, Compiler optimization-space exploration, International Symposium on Code Generation and Optimization, 2003. CGO 2003., p.49, 2003.
DOI : 10.1109/CGO.2003.1191546
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.131.1622

N. Vasilache, C. Bastoul, A. Cohen, and S. Girbal, Violated dependence analysis, Proceedings of the 20th annual international conference on Supercomputing , ICS '06, pp.335-344, 2006.
DOI : 10.1145/1183401.1183448
URL : https://hal.archives-ouvertes.fr/hal-01257290

N. Vasilache, A. Cohen, and L. Pouchet, Automatic Correction of Loop Transformations, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), pp.292-304, 2007.
DOI : 10.1109/PACT.2007.4336220
URL : https://hal.archives-ouvertes.fr/hal-01257283

K. Wang, Precise compile-time performance prediction for superscalar-based computers, ACM SIGPLAN Notices, vol.29, issue.6, pp.73-84, 1994.
DOI : 10.1145/773473.178250

M. Weiser, Program Slicing, IEEE Transactions on Software Engineering, vol.10, issue.4, pp.352-357, 1984.
DOI : 10.1109/TSE.1984.5010248

R. C. Whaley and J. Dongarra, Automatically Tuned Linear Algebra Software, Proceedings of the IEEE/ACM SC98 Conference, p.78, 1998.
DOI : 10.1109/SC.1998.10004

R. C. Whaley, A. Petitet, and J. J. Dongarra, Automated empirical optimizations of software and the ATLAS project, Parallel Computing, vol.27, issue.1-2, pp.3-35, 2000.
DOI : 10.1016/S0167-8191(00)00087-9

D. Whitfield and M. L. Soffa, An approach for exploring code improving transformations, ACM Transactions on Programming Languages and Systems, vol.19, issue.6, pp.1053-1084, 1997.
DOI : 10.1145/267959.267960

L. M. Wills, Automated Program Recognition by Graph Parsing, p.28, 1992.

M. E. Wolf, D. E. Maydan, and D. Chen, Combining loop transformations considering caches and scheduling, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29, pp.479-503, 1996.
DOI : 10.1109/MICRO.1996.566468

M. Wolfe, Iteration space tiling for memory hierarchies, SIAM Conference on Parallel Processing for Scientific Computing, pp.357-361, 1989.

B. Wylie, B. Mohr, and F. Wolf, Holistic hardware counter performance analysis of parallel programs, Parallel Computing, p.81, 2005.

J. Xiong, J. Johnson, R. Johnson, and D. Padua, SPL: A language and compiler for DSP algorithms, ACM SIGPLAN Conf. on Programming Language Design and Implementation, pp.298-308, 2001.

K. Yotov, X. Li, G. Ren, M. Cibulskis, G. Dejong et al., A Comparison of Empirical and Model-driven Optimization, ACM SIGPLAN Conf. on Programming Language Design and Implementation, pp.63-76, 2003.

K. Yotov, X. Li, G. Ren, M. Garzarán, D. Padua et al., Is Search Really Necessary to Generate High-Performance BLAS?, Proc. of the IEEE, Special issue on "Program Generation, Optimization, and Adaptation", pp.358-386, 2005.

K. Yotov, K. Pingali, and P. Stodghill, X-Ray: A Tool for Automatic Measurement of Architectural Parameters, Intl. Conf. on Quantitative Evaluation of SysTems, p.77, 2005.

W. Zhang, B. Calder, and D. Tullsen, An Event-Driven Multithreaded Dynamic Optimization Framework, International Conference on Parallel Architectures and Compilation Techniques.(PACT), p.73, 2005.

M. Zhao, B. R. Childers, and M. L. Soffa, A Model-Based Framework: An Approach for Profit-driven Optimization, In ACM/IEEE Int. Symp. on Code Optimization and Generation, 2005.