Oceans: Optimizing compilers for embedded applications, Euro-Par Conference, Lect. Notes in Computer Science, p.74, 1997. ,
Optimization manuals ,
Performance Prediction for Loop Restructuring Optimization, 1993. ,
Program Optimization by Template Recognition and Replacement, p.25, 2005. ,
Tema: an efficient tool to find high-performance library patterns in source code, Workshop on Patterns in High Performance Computing, p.31, 2005. ,
Algorithm recognition based on demand-driven data-flow analysis, IEEE Working Conf. on Reverse Engineering, pp.296-305, 2003. ,
Deciding Where to Call Performance Libraries, Euro-Par Conference, pp.336-345, 2005. ,
DOI : 10.1007/11549468_39
URL : https://hal.archives-ouvertes.fr/hal-00141074
On Domain Specific Languages Re-Engineering, In ACM Int. Conf. on Generative Programming and Component Engineering Lect. Notes in Computer Science, vol.3676, issue.22, pp.63-77, 2005. ,
Software pipelining, ACM Computing Surveys, vol.27, issue.3, pp.367-432, 1995. ,
DOI : 10.1145/212094.212131
The Landscape of Parallel Computing Research: A View from Berkeley, p.80, 2006. ,
Dynamo, ACM SIGPLAN Notices, vol.35, issue.5, pp.1-12, 2000. ,
DOI : 10.1145/358438.349303
Analyse du flot des données pour tableaux en présence de contraintes non affines, p.80, 1998. ,
Rapport d'analyse de performances et restructuration d'algorithmes de la chromodynamique quantique sur réseau (lqcd), p.47, 2007. ,
Optimization report for ci-1 and ci-2 codes, 1946. ,
Maximal static expansion, Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '98, pp.98-106, 1998. ,
DOI : 10.1145/268946.268955
URL : https://hal.archives-ouvertes.fr/hal-01257319
Loop optimization using adaptive compilation and kernel decomposition, In ACM/IEEE Int. Symp. on Code Optimization and Generation, pp.170-184, 2007. ,
URL : https://hal.archives-ouvertes.fr/hal-00141056
On the Equivalence of Two Systems of Affine Recurrence Equations, Euro-Par Conference, pp.309-313, 2002. ,
DOI : 10.1007/3-540-45706-2_40
URL : https://hal.archives-ouvertes.fr/inria-00072302
Putting Polyhedral Loop Transformations to Work, Workshop on Languages and Compilers for Parallel Computing (LCPC'03), pp.23-30, 2003. ,
DOI : 10.1007/978-3-540-24644-2_14
URL : https://hal.archives-ouvertes.fr/inria-00071681
Optimizing Matrix Multiply using PHiPAC: A Portable, High-Performance, ANSI C Coding Methodology, ACM Int. Conf. on Supercomputing, p.78, 1997. ,
Iterative compilation in a non-linear optimisation space, Workshop on Profile and Feed-back directed Compilation, p.78, 1998. ,
URL : https://hal.archives-ouvertes.fr/inria-00475919
Symbolic verification with periodic sets, Int. Conf. on Computer-Aided Verification, pp.55-67, 1994. ,
DOI : 10.1007/3-540-58179-0_43
Modeling computation and communication performance of parallel scientific applications: A case study of the ibm sp2, ACM Int. Conf. on Supercomputing, p.56, 1995. ,
A Portable Programming Interface for Performance Evaluation on Modern Processors, International Journal of High Performance Computing Applications, vol.14, issue.3, pp.189-204, 2000. ,
DOI : 10.1177/109434200001400303
An API for Runtime Code Patching, International Journal of High Performance Computing Applications, vol.14, issue.4, pp.317-329, 2000. ,
DOI : 10.1177/109434200001400404
Estimating cache misses and locality using stack distances, Proceedings of the 17th annual international conference on Supercomputing, ICS '03, pp.150-159, 2003. ,
DOI : 10.1145/782814.782836
Value profiling, Proceedings of 30th Annual International Symposium on Microarchitecture, pp.259-269, 1997. ,
DOI : 10.1109/MICRO.1997.645816
Estimating interlock and improving balance for pipelined architectures, Journal of Parallel and Distributed Computing, vol.5, issue.4, pp.334-358, 1988. ,
DOI : 10.1016/0743-7315(88)90002-0
Deep jam: conversion of coarse-grain parallelism to instruction-level and vector parallelism for irregular applications, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05), p.80, 2005. ,
DOI : 10.1109/PACT.2005.16
URL : https://hal.archives-ouvertes.fr/hal-01257293
Combining Models and Guided Empirical Search to Optimize for Multiple Levels of the Memory Hierarchy, International Symposium on Code Generation and Optimization, pp.111-122, 2005. ,
DOI : 10.1109/CGO.2005.10
A Specification Driven Slicing Process for Identifying Reusable Functions, Journal of Software Maintenance: Research and Practice, vol.8, issue.3, pp.145-178, 1996. ,
DOI : 10.1002/(SICI)1096-908X(199605)8:3<145::AID-SMR127>3.0.CO;2-9
Counting solutions to linear and nonlinear constraints through Ehrhart polynomials: Applications to analyze and transform scientific programs, ACM Int. Conf. on Supercomputing, pp.278-295, 1996. ,
URL : https://hal.archives-ouvertes.fr/hal-01100306
Combining analyses, combining optimizations, ACM Transactions on Programming Languages and Systems, vol.17, issue.2, pp.181-196, 1995. ,
DOI : 10.1145/201059.201061
A Polyhedral Approach to Ease the Composition of Program Transformations, Euro-Par Conference, 2004. ,
DOI : 10.1007/978-3-540-27866-5_38
URL : https://hal.archives-ouvertes.fr/hal-01257301
Tile size selection using cache organization and data layout, ACM SIGPLAN Conf. on Programming Language Design and Implementation, pp.279-290, 1995. ,
Tutorial notes on partial evaluation, Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '93, pp.493-501, 1993. ,
DOI : 10.1145/158511.158707
A uniform approach for compile-time and run-time specialization, In Partial Evaluation. International Seminar, pp.54-72, 1996. ,
DOI : 10.1007/3-540-61580-6_4
URL : https://hal.archives-ouvertes.fr/inria-00073917
Tempo: specializing systems applications and beyond, ACM Computing Surveys, vol.30, issue.3es, p.72, 1998. ,
DOI : 10.1145/289121.289140
Vizer: A system to vectorize intel x86 binaries, Intl. Symp. on Computer Architecture, p.64, 2002. ,
Adaptive Optimizing Compilers for the 21st Century, The Journal of Supercomputing, vol.23, issue.1, pp.7-22, 2002. ,
DOI : 10.1023/A:1015729001611
Investigating Adaptive Compilation Using the Mipspro Compiler, Los Alamos Computer Science Institute Symp, p.49, 2003. ,
DOI : 10.1177/1094342005056142
Automatic discovery of linear restraints among variables of a program, Proceedings of the 5th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, POPL '78, pp.84-97, 1978. ,
DOI : 10.1145/512760.512770
Exact versus approximate array region analyses, Int. Workshop on Languages and Compilers for Parallel Computing, pp.86-100, 1996. ,
DOI : 10.1007/BFb0017247
Compositional Approach Applied to Loop Specialization, Euro-Par Conference, pp.268-279, 2007. ,
DOI : 10.1007/978-3-540-74466-5_30
URL : https://hal.archives-ouvertes.fr/hal-00575934
Exploring application performance: a new tool for a static/dynamic approach, Los Alamos Computer Science Institute Symp, 2005. ,
URL : https://hal.archives-ouvertes.fr/hal-00141071
Exploring application performance: a new tool for a static/dynamic approach, Proceedings of the 6th LACSI Symposium, p.68, 2005. ,
URL : https://hal.archives-ouvertes.fr/hal-00141071
The design and architecture of maqaoprofile: an instrumentation maqao module, Workshop on Explicitly Parallel Instruction Computing Techniques, 2007. ,
Optimisation itérative de bibliothèque de calculs par division hiérarchique de codes, p.33, 2007. ,
A Language for the Compact Representation of Multiple Program Versions, Int. Workshop on Languages and Compilers for Parallel Computing, p.35, 2005. ,
DOI : 10.1007/978-3-540-69330-7_10
URL : https://hal.archives-ouvertes.fr/hal-00141067
Decidability of the weak second-order theory of two successors, Notices Amer. Math. Soc., vol.12, pp.365-468, 1965. ,
Tree acceptors and some of their applications, Journal of Computer and System Sciences, vol.4, issue.5, pp.406-451, 1970. ,
DOI : 10.1016/S0022-0000(70)80041-1
Optimizing software data prefetches with rotating registers, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques, p.72, 2001. ,
DOI : 10.1109/PACT.2001.953306
Code Instruction Selection Based on SSA-Graphs, Int. Workshop Software and Compilers for Embedded Systems, p.26, 2003. ,
DOI : 10.1007/978-3-540-39920-9_5
Quantifying the Impact of Input Data Sets on Program Behavior and its Applications, J. of Instruction-Level Parallelism, vol.5, 2003. ,
DCG: An efficient, retargetable dynamic code generation system, Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, pp.263-272, 1994. ,
Optimal code selection in DAGs, Proceedings of the 26th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, POPL '99, p.26, 1999. ,
DOI : 10.1145/292540.292562
Parametric integer programming, RAIRO - Operations Research, vol.22, issue.3, pp.243-268, 1988. ,
DOI : 10.1051/ro/1988220302431
Dataflow analysis of array and scalar references, International Journal of Parallel Programming, vol.20, issue.1, pp.23-53, 1991. ,
DOI : 10.1007/BF01407931
Refactoring: Improving the Design of Existing Code, p.20, 1999. ,
DOI : 10.1007/3-540-45672-4_31
Automatic analytical modeling for the estimation of cache misses, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425), p.49, 1999. ,
DOI : 10.1109/PACT.1999.807544
Phase Ordering of Register Allocation and Instruction Scheduling, Code Generation - Concepts, Tools, Techniques. Proceedings of the International Workshop on Code Generation, pp.146-172, 1992. ,
DOI : 10.1007/978-1-4471-3501-2_9
A Fast Fourier Transform Compiler, ACM SIGPLAN Conf. on Programming Language Design and Implementation, p.12, 1999. ,
FFTW: an adaptive software architecture for the FFT, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181), pp.1381-1384, 1998. ,
DOI : 10.1109/ICASSP.1998.681704
Peridot: Towards Automated Runtime Detection of Performance Bottlenecks, High Performance Computing in Science and Engineering, pp.193-202, 2005. ,
DOI : 10.1007/3-540-28555-5_17
Quick and practical run-time evaluation of multiple program optimizations, Transactions on High-Performance Embedded Architectures and Compilers, pp.13-31, 2006. ,
URL : https://hal.archives-ouvertes.fr/inria-00084110
Simplifying reductions, In ACM Symp. on Principles of Programming Languages, pp.30-41, 2006. ,
The z-polyhedral model, ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming, pp.237-248, 2007. ,
Exploiting coarse-grained task, data, and pipeline parallelism in stream programs, ASPLOS, 2006. ,
On reducing tlb misses in matrix multiplication, p.49, 2002. ,
Annotation-Directed Run-Time Specialization in C, Proceedings of the ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation (PEPM'97), pp.163-178, 1997. ,
DyC: an expressive annotation-directed dynamic compiler for C, Theoretical Computer Science, vol.248, issue.1-2, p.73, 2000. ,
DOI : 10.1016/S0304-3975(00)00051-7
On index set splitting, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425), pp.607-631, 2000. ,
DOI : 10.1109/PACT.1999.807572
Optimizing performance on modern HPC systems: learning from simple kernel benchmarks, Computational Science and High Performance Computing, pp.273-287, 2006. ,
Lustre: A declarative language for programming synchronous systems, ACM Symp. on Principles of Programming Languages, vol.215, issue.7, pp.178-188, 1987. ,
SPEC CPU2000: measuring CPU performance in the New Millennium, Computer, vol.33, issue.7, pp.28-35, 2000. ,
DOI : 10.1109/2.869367
A unification algorithm for typed λ-calculus, Theoretical Computer Science, vol.1, issue.1, pp.27-57, 1975. ,
DOI : 10.1016/0304-3975(75)90011-0
Hp caliper: An architecture for performance analysis tools, Workshop on Industrial Experiences with Systems Software, 2000. ,
Lua - an extensible extension language, Software: Practice and Experience, pp.635-652, 1996. ,
DOI : 10.1002/(sici)1097-024x(199606)26:6<635::aid-spe26>3.0.co;2-p
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.2941
Rapport d'analyse de performances et restructuration pour une application de modélisation moléculaire, p.47, 2007. ,
The Art of Computer Systems Performance Analysis : Techniques for Experimental Design, Measurement, Simulation, and Modeling, 1991. ,
WBTK: a New Set of Microbenchmarks to Explore Memory System Performance for Scientific Computing, International Journal of High Performance Computing Applications, vol.18, issue.2, pp.211-224, 2004. ,
DOI : 10.1177/1094342004038945
The Omega Library Interface Guide, p.42, 1996. ,
Telescoping languages: a compiler strategy for implementation of high-level domain-specific programming systems, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000, pp.297-304, 2000. ,
DOI : 10.1109/IPDPS.2000.845999
Static Performance Estimation in a Parallelizing Compiler, 1992. ,
An Effective Automated Approach to Specialization of Code, Int. Workshop on Languages and Compilers for Parallel Computing, p.66, 2007. ,
DOI : 10.1007/978-3-540-85261-2_21
Improving performance through low-overhead specialization of code, Int. Workshop on Interaction between Compilers and Computer Architectures, p.66, 2007. ,
A hybrid approach for program understanding based on graph parsing and expectation-driven analysis, Applied Artificial Intelligence, vol.12, issue.6, pp.521-546, 1998. ,
DOI : 10.1080/088395198117659
Iterative compilation in program optimization, CPC, pp.35-44, 2000. ,
Data-centric Multi-level Blocking, ACM SIGPLAN Conf. on Programming Language Design and Implementation, p.37, 1997. ,
Transformations for imperfectly nested loops, Proceedings of the 1996 ACM/IEEE conference on Supercomputing (CDROM), Supercomputing '96, p.37, 1996. ,
DOI : 10.1145/369028.369051
Finding effective compilation sequences, Conf. on Languages, Compilers, and Tools for Embedded Systems, 2004. ,
Whole program paths, ACM SIGPLAN Conf. on Programming Language Design and Implementation, pp.259-269, 1999. ,
Eel: Machine-independent executable editing, ACM SIGPLAN Conf. on Programming Language Design and Implementation, p.71, 1995. ,
LLVM: A compilation framework for lifelong program analysis & transformation, International Symposium on Code Generation and Optimization, 2004. CGO 2004., p.31, 2004. ,
DOI : 10.1109/CGO.2004.1281665
Automatic storage management for parallel programs, Parallel Computing, vol.24, issue.3-4, pp.649-671, 1998. ,
DOI : 10.1016/S0167-8191(98)00029-5
Dynamo: A staged compiler architecture for dynamic program optimization, p.73, 1997. ,
Optimizing ml with run-time code generation, p.66, 1995. ,
A Declarative Approach to Run-Time Code Generation, Workshop on Compiler Support for System Software (WCSSS), p.73, 1996. ,
Dynamic specialization in the Fabius system, ACM Computing Surveys, vol.30, issue.3es, p.73, 1998. ,
DOI : 10.1145/289121.289144
A dynamically tuned sorting library, In ACM/IEEE Int. Symp. on Code Optimization and Generation, p.78, 2004. ,
Design and Implementation of a Lightweight Dynamic Optimization System, Journal of Instruction-Level Parallelism, vol.6, p.73, 2004. ,
Specializing C - an introduction to the principles behind C-Mix, p.73, 1999. ,
Approaching a machine-application bound in delivered performance on scientific code, Proc. of the IEEE, pp.1166-1178, 1993. ,
PAP Recognizer: a tool for automatic recognition of parallelizable patterns, WPC '96. 4th Workshop on Program Comprehension, pp.164-174, 1996. ,
DOI : 10.1109/WPC.1996.501131
Elimination of quantifiers from arithmetical formulas defining recursively enumerable sets, Mathematics and Computers in Simulation, vol.67, issue.1-2, pp.125-133, 2004. ,
DOI : 10.1016/j.matcom.2004.05.013
Array dataflow analysis and its use in array privatization, In ACM Symp. on Principles of Programming Languages, pp.2-15, 1993. ,
Hpcview: A tool for top-down analysis of node performance, Los Alamos Computer Science Institute Symp, p.77, 2001. ,
The java hotspot performance engine: An in-depth look, p.66, 1999. ,
An overview of the impact x86 binary reoptimization framework, p.64, 1998. ,
Automatic Algorithm Recognition: A New Approach to Program Optimization, p.29, 2000. ,
Machine Learning, p.78, 1997. ,
FINESSE: a prototype feedback-guided performance enhancement system, Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing, p.71, 2000. ,
DOI : 10.1109/EMPDP.2000.823400
Automatic, template-based run-time specialization: implementation and experimental study, Proceedings of the 1998 International Conference on Computer Languages (Cat. No.98CB36225), p.72, 1998. ,
DOI : 10.1109/ICCL.1998.674164
Pinpointing Representative Portions of Large Intel® Itanium® Programs with Dynamic Instrumentation, 37th International Symposium on Microarchitecture (MICRO-37'04), p.71, 2004. ,
DOI : 10.1109/MICRO.2004.28
A framework for source code search using program patterns, IEEE Transactions on Software Engineering, vol.20, issue.6, pp.463-475, 1994. ,
DOI : 10.1109/32.295894
Ccg: Dynamic code generation for c and c++, INRIA Rocquencourt, vol.66, p.73, 2003. ,
An Offline Approach for Whole-Program Paths Analysis Using Suffix Arrays, Int. Workshop on Languages and Compilers for Parallel Computing, pp.363-378, 2004. ,
DOI : 10.1007/11532378_26
C and tcc: a language and compiler for dynamic code generation, ACM Transactions on Programming Languages and Systems, vol.21, issue.2, pp.324-369, 1999. ,
DOI : 10.1145/316686.316697
GRAPHITE: Loop optimizations based on polyhedral model for gcc, GCC Developer's Summit, p.80, 2006. ,
URL : https://hal.archives-ouvertes.fr/hal-01257284
Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time, International Symposium on Code Generation and Optimization (CGO'07), pp.144-156, 2007. ,
DOI : 10.1109/CGO.2007.21
URL : https://hal.archives-ouvertes.fr/hal-01257281
The Omega test: a fast and practical integer programming algorithm for dependence analysis, Proceedings of the 1991 ACM/IEEE conference on Supercomputing, Supercomputing '91, pp.4-13, 1991. ,
DOI : 10.1145/125826.125848
Constraint-based array dependence analysis, ACM Transactions on Programming Languages and Systems, vol.20, issue.3, pp.635-678, 1998. ,
DOI : 10.1145/291889.291900
Spiral: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms, International Journal of High Performance Computing Applications, vol.18, issue.1, pp.21-45, 2004. ,
DOI : 10.1177/1094342004041291
Induction of decision trees, Machine Learning, pp.81-106, 1986. ,
DOI : 10.1007/BF00116251
Salto: System for assembly-language transformation and optimization, p.71, 1996. ,
URL : https://hal.archives-ouvertes.fr/inria-00073718
The dynamic probe class library: An infrastructure for developing instrumentation for performance tools, IEEE Int. Conf. Parallel and Distributed Processing Symp, 2001. ,
Hybrid analysis, Proceedings of the 16th international conference on Supercomputing, ICS '02, pp.274-284, 2002. ,
DOI : 10.1145/514191.514229
On optimal parallelization of arbitrary loops, Journal of Parallel and Distributed Computing, vol.11, issue.2, pp.130-134, 1991. ,
DOI : 10.1016/0743-7315(91)90118-S
Sketching stencils, ACM SIGPLAN Conf. on Programming Language Design and Implementation, p.79, 2007. ,
Combinatorial sketching for finite programs, ACM SIGOPS Operating Systems Review, vol.40, issue.5, pp.404-415, 2006. ,
DOI : 10.1145/1168917.1168907
Atom - a system for building customized program analysis tools, ACM SIGPLAN Conf. on Programming Language Design and Implementation, pp.196-205, 1994. ,
Guaranteed Optimization: Proving Nullspace Properties of Compilers, In Int. Symp. on Static Analysis Lect. Notes in Computer Science, vol.2477, issue.10, pp.263-277, 2002. ,
DOI : 10.1007/3-540-45789-5_20
Interactivity, Reactivity and Programmability: Advanced MPEG-4 Multimedia Applications, 2006 Digest of Technical Papers International Conference on Consumer Electronics, pp.441-442, 2006. ,
DOI : 10.1109/ICCE.2006.1598500
URL : https://hal.archives-ouvertes.fr/hal-00272069
Generalized finite automata theory with an application to a decision problem of second-order logic, Mathematical Systems Theory, vol.2, issue.1, pp.57-82, 1968. ,
DOI : 10.1007/BF01691346
On the Limits of Program Parallelism and its Smoothability, ACM/IEEE Int. Symp. on Microarchitecture, issue.10, pp.10-19, 1992. ,
An Efficient OpenMP Runtime System for Hierarchical Architectures, International Workshop on OpenMP (IWOMP), pp.148-159, 2007. ,
DOI : 10.1007/978-3-540-69303-1_19
URL : https://hal.archives-ouvertes.fr/inria-00154502
StreamIt: A Language for Streaming Applications, Programming Languages and Systems Symposium, 2002. ,
DOI : 10.1007/3-540-45937-5_14
A unified framework for schedule and storage optimization, ACM SIGPLAN Conf. on Programming Language Design and Implementation, pp.232-242, 2001. ,
URL : https://hal.archives-ouvertes.fr/hal-00808285
A framework for adaptive algorithm selection in STAPL, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '05, pp.277-288, 2005. ,
DOI : 10.1145/1065944.1065981
Evaluating the Performance of Cache-Affinity Scheduling in Shared-Memory Multiprocessors, J. of Parallel and Distributed Computing, pp.139-151, 1995. ,
DOI : 10.1006/jpdc.1995.1014
Compiler optimization-space exploration, International Symposium on Code Generation and Optimization, 2003. CGO 2003., p.49, 2003. ,
DOI : 10.1109/CGO.2003.1191546
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.131.1622
Violated dependence analysis, Proceedings of the 20th annual international conference on Supercomputing, ICS '06, pp.335-344, 2006. ,
DOI : 10.1145/1183401.1183448
URL : https://hal.archives-ouvertes.fr/hal-01257290
Automatic Correction of Loop Transformations, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), pp.292-304, 2007. ,
DOI : 10.1109/PACT.2007.4336220
URL : https://hal.archives-ouvertes.fr/hal-01257283
Precise compile-time performance prediction for superscalar-based computers, ACM SIGPLAN Notices, vol.29, issue.6, pp.73-84, 1994. ,
DOI : 10.1145/773473.178250
Program Slicing, IEEE Transactions on Software Engineering, vol.10, issue.4, pp.352-357, 1984. ,
DOI : 10.1109/TSE.1984.5010248
Automatically Tuned Linear Algebra Software, Proceedings of the IEEE/ACM SC98 Conference, p.78, 1998. ,
DOI : 10.1109/SC.1998.10004
Automated empirical optimizations of software and the ATLAS project, Parallel Computing, vol.27, issue.1-2, pp.3-35, 2001. ,
DOI : 10.1016/S0167-8191(00)00087-9
An approach for exploring code improving transformations, ACM Transactions on Programming Languages and Systems, vol.19, issue.6, pp.1053-1084, 1997. ,
DOI : 10.1145/267959.267960
Automated Program Recognition by Graph Parsing, p.28, 1992. ,
Combining loop transformations considering caches and scheduling, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29, pp.479-503, 1998. ,
DOI : 10.1109/MICRO.1996.566468
Iteration space tiling for memory hierarchies, SIAM Conference on Parallel Processing for Scientific Computing, pp.357-361, 1989. ,
Holistic hardware counter performance analysis of parallel programs, Parallel Computing, p.81, 2005. ,
Spl: A language and compiler for dsp algorithms, ACM SIGPLAN Conf. on Programming Language Design and Implementation, pp.298-308, 2001. ,
A Comparison of Empirical and Model-driven Optimization, ACM SIGPLAN Conf. on Programming Language Design and Implementation, pp.63-76, 2003. ,
Is Search Really Necessary to Generate High-Performance BLAS?, Proc. of the IEEE, Special issue on "Program Generation, Optimization, and Adaptation", pp.358-386, 2005. ,
X-Ray: A Tool for Automatic Measurement of Architectural Parameters, Intl. Conf. on Quantitative Evaluation of SysTems, p.77, 2005. ,
An Event-Driven Multithreaded Dynamic Optimization Framework, International Conference on Parallel Architectures and Compilation Techniques.(PACT), p.73, 2005. ,
A Model-Based Framework: An Approach for Profit-driven Optimization, In ACM/IEEE Int. Symp. on Code Optimization and Generation, 2005. ,