J. Ahn and K. Choi, Lower-bits cache for low power STT-RAM caches, 2012 IEEE International Symposium on Circuits and Systems, pp.480-483, 2012.

A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman, Compilers: Principles, Techniques, and Tools, ISBN 0321486811, 2006.

M. Avila, M. Glaizot, and I. Puaut, Impact of Automatic Gain Time Identification on Tree-Based Static WCET Analysis, Proceedings of 3rd International Workshop on WCET 2003 - a Satellite Event to ECRTS 2003, 2003.

D. F. Bacon, S. L. Graham, and O. J. Sharp, Compiler Transformations for High-performance Computing, ACM Comput. Surv, vol.26, pp.345-420, 1994.

G. B. Bell, K. M. Lepak, and M. H. Lipasti, Characterization of silent stores, International Conference on Parallel Architectures and Compilation Techniques (PACT), 2000.

L. Benini, L. Macchiarulo, A. Macii, and M. Poncino, Layout-driven memory synthesis for embedded systems-on-chip, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.10, pp.96-105, 2002.

L. Benini, A. Macii, E. Macii, and M. Poncino, Increasing energy efficiency of embedded systems by application-specific memory hierarchy generation, IEEE Design Test of Computers, vol.17, pp.74-85, 2000.

N. Binkert, B. Beckmann, G. Black, S. K. Reinhardt, A. Saidi et al., The gem5 Simulator, SIGARCH Comput. Archit. News, vol.39, pp.1-7, 2011.

R. Bishnoi, M. Ebrahimi, F. Oboril, and M. B. Tahoori, Asynchronous Asymmetrical Write Termination (AAWT) for a low power STT-MRAM, 2014 Design, Automation Test in Europe Conference Exhibition (DATE), pp.1-6, 2014.

J. Boukhobza, S. Rubini, R. Chen, and Z. Shao, Emerging NVM: A Survey on Architectural Integration and Research Challenges, ACM Trans. Design Autom. Electr. Syst, vol.23, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01709571

R. Bouziane, E. Rohou, and A. Gamatié, How could compile-time program analysis help leveraging emerging NVM features, 2017 First International Conference on Embedded Distributed Systems (EDiS), pp.1-6, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01655195

R. Bouziane, E. Rohou, and A. Gamatié, Compile-Time Silent-Store Elimination for Energy Efficiency: an Analytic Evaluation for Non-Volatile Cache Memory, Proceedings of the RAPIDO 2018 Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools, vol.5, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01660686

R. Bouziane, E. Rohou, and A. Gamatié, Energy-Efficient Memory Mappings based on Partial WCET Analysis and Multi-Retention Time STT-RAM, RTNS 2018 - 26th International Conference on Real-Time Networks and Systems, pp.1-11, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01871320

A. Butko, F. Bruguier, A. Gamatié, G. Sassatelli, D. Novo et al., Full-System Simulation of big.LITTLE Multicore Architecture for Performance and Energy Exploration, 2016 IEEE 10th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSOC), pp.201-208, 2016.
URL : https://hal.archives-ouvertes.fr/lirmm-01418745

S. Che, J. W. Sheaffer, M. Boyer, L. G. Szafaryn, L. Wang et al., A Characterization of the Rodinia Benchmark Suite with Comparison to Contemporary CMP Workloads, International Symposium on Workload Characterization (IISWC'10), 2010.

Y. Chen, J. Cong, H. Huang, B. Liu, C. Liu et al., Dynamically reconfigurable hybrid cache: An energy-efficient last-level cache design, 2012 Design, Automation Test in Europe Conference Exhibition (DATE), pp.45-50, 2012.

Y. Chen, J. Cong, H. Huang, C. Liu, R. Prabhakar et al., Static and dynamic co-optimizations for blocks mapping in hybrid caches, Proceedings of the 2012 international symposium on Low power electronics and design, pp.237-242, 2012.

W. Cheng, Y. Ciou, and P. Shen, Architecture and data migration methodology for L1 cache design with hybrid SRAM and volatile STT-RAM configuration, Microprocessors and Microsystems, vol.42, 2016.

T. Delobelle, P. Péneau, A. Gamatié, F. Bruguier, S. Senni et al., MAGPIE: System-level Evaluation of Manycore Systems with Emerging Memory Technologies, Workshop on Emerging Memory Solutions - Technology, Manufacturing, Architectures, Design and Test at Design Automation and Test in Europe, 2017.
URL : https://hal.archives-ouvertes.fr/lirmm-01467328

X. Dong, C. Xu, Y. Xie, and N. P. Jouppi, NVSim: A circuit-level performance, energy, and area model for emerging nonvolatile memory, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol.31, pp.994-1007, 2012.

R. Gonzalez and M. Horowitz, Energy dissipation in general purpose microprocessors, IEEE Journal of solid-state circuits, vol.31, pp.1277-1284, 1996.

N. Goswami, B. Cao, and T. Li, Power-performance co-optimization of throughput core architecture using resistive memory, High Performance Computer Architecture (HPCA), pp.342-353, 2013.

X. Guo, E. Ipek, and T. Soyata, Resistive Computation: Avoiding the Power Wall with Low-leakage, STT-MRAM Based Computing, SIGARCH Comput. Archit. News, vol.38, issue.3, pp.371-382, 2010.

J. Gustafsson, A. Betts, A. Ermedahl, and B. Lisper, The Mälardalen WCET benchmarks: Past, present and future, 2010.

H. Hajimiri, K. Rahmani, and P. Mishra, Synergistic integration of dynamic cache reconfiguration and code compression in embedded systems, 2011 International Green Computing Conference and Workshops, pp.1-8, 2011.

Hardkernel, Odroid.

D. Hardy, B. Rouxel, and I. Puaut, The Heptane Static Worst-Case Execution Time Estimation Tool, 17th International Workshop on Worst-Case Execution Time Analysis (WCET), vol.8, p.12, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01590444

N. Holsti and S. Saarinen, Status of the Bound-T WCET tool, 2002.

J. Hu, C. J. Xue, W. Tseng, Y. He, M. Qiu et al., Reducing Write Activities on Non-volatile Memories in Embedded CMPs via Data Migration and Recomputation, Design Automation Conference (DAC'10), 2010.

J. Hu, C. J. Xue, Q. Zhuge, W. Tseng, and E. Sha, Data Allocation Optimization for Hybrid Scratch Pad Memory With SRAM and Nonvolatile Memory, Trans. VLSI Syst, 2013.

S. Hua and G. Qu, Approaching the maximum energy saving on embedded systems with multiple voltages, ICCAD-2003. International Conference on Computer Aided Design, pp.26-29, 2003.

M. Idrissi Aouad and O. Zendra, A Survey of Scratch-Pad Memory Management Techniques for low-power and -energy, 2nd ECOOP Workshop on Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems (ICOOOLPS'2007), M. Cebulla (ed.), pp.31-38, 2007.

K. Ikegami, H. Noguchi, C. Kamata, M. Amano, K. Abe et al., A 4ns, 0.9V write voltage embedded perpendicular STT-MRAM fabricated by MTJ-Last process, Proceedings of Technical Program - 2014 International Symposium on VLSI Technology, Systems and Application, pp.1-2, 2014.

M. Jacobs, S. Hahn, and S. Hack, WCET Analysis for Multi-core Processors with Shared Buses and Event-driven Bus Arbitration, Proceedings of the 23rd International Conference on Real Time and Networks Systems. RTNS, pp.193-202, 2015.

A. Jadidi, M. Arjomand, and H. Sarbazi-Azad, High-endurance and performance-efficient design of hybrid cache architectures through adaptive line replacement, IEEE/ACM International Symposium on Low Power Electronics and Design, pp.79-84, 2011.

A. Jog, A. K. Mishra, C. Xu, Y. Xie, V. Narayanan et al., Cache revive: architecting volatile STT-RAM caches for enhanced performance in CMPs, Annual Design Automation Conference (DAC), 2012.

Y. Joo, D. Niu, X. Dong, G. Sun, N. Chang et al., Energy- and endurance-aware design of phase change memory caches, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010), pp.136-141, 2010.

J. Jung, Y. Nakata, M. Yoshimoto, and H. Kawaguchi, Energy-efficient Spin-Transfer Torque RAM cache exploiting additional all-zero-data flags, International Symposium on Quality Electronic Design (ISQED), pp.216-222, 2013.

M. Kandemir, N. Vijaykrishnan, and M. J. Irwin, Compiler Optimizations for Low Power Systems, 2002.

M. Kandemir, N. Vijaykrishnan, M. J. Irwin, and W. Ye, Influence of Compiler Optimizations on System Power, IEEE Trans. Very Large Scale Integr. Syst, vol.9, issue.6, ISSN 1063-8210, 2001.

M. T. Kandemir, I. Kolcu, and I. Kadayif, Influence of Loop Optimizations on Energy Consumption of Multi-bank Memory Systems, Proceedings of the 11th International Conference on Compiler Construction (CC '02), Lecture Notes in Computer Science, vol.2304, pp.276-292, 2002.

S. H. Kang and K. Lee, Emerging materials and devices in spintronic integrated circuits for energy-smart mobile computing and connectivity, Acta Materialia, vol.61, pp.952-973, 2013.

S. Kaxiras, Z. Hu, and M. Martonosi, Cache decay: exploiting generational behavior to reduce cache leakage power, Proceedings 28th Annual International Symposium on Computer Architecture, pp.240-251, 2001.

N. Khoshavi, X. Chen, J. Wang, and R. F. DeMara, Read-Tuned STT-RAM and eDRAM Cache Hierarchies for Throughput and Energy Enhancement, 2016.

A. V. Khvalkovskiy, D. Apalkov, S. Watts, R. Chepulskii, R. S. Beach et al., Basic principles of STT-MRAM cell operation in memory arrays, Journal of Physics D: Applied Physics, vol.46, 2013.

V. Kianzad, S. S. Bhattacharyya, and G. Qu, CASPER: an integrated energy-driven approach for task graph scheduling on distributed embedded systems, 2005 IEEE International Conference on Application-Specific Systems, Architecture Processors (ASAP'05), pp.191-197, 2005.

T. Kisuki, P. M. W. Knijnenburg, and M. F. P. O'Boyle, Combined Selection of Tile Sizes and Unroll Factors Using Iterative Compilation, International Conference on Parallel Architectures and Compilation Techniques (PACT), IEEE Computer Society, 2000.

K. Kwon, S. H. Choday, Y. Kim, and K. Roy, AWARE (Asymmetric Write Architecture With REdundant Blocks): A High Write Speed STT-MRAM Cache Architecture, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.22, issue.4, pp.712-720, 2014.

C. Lattner and V. Adve, LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation, International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization (CGO '04), 2004.

C. Layer, L. Becker, K. Jabeur, S. Claireux, B. Dieny et al., Reducing System Power Consumption Using Check-Pointing on Nonvolatile Embedded Magnetic Random Access Memories, ACM Journal on Emerging Technologies in Computing Systems (JETC), vol.12, p.32, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01835848

B. C. Lee, E. Ipek, O. Mutlu, and D. Burger, Architecting Phase Change Memory As a Scalable DRAM Alternative, SIGARCH Comput. Archit. News, vol.37, issue.3, pp.2-13, 2009.

D. Lee and D. Blaauw, Static Leakage Reduction Through Simultaneous Threshold Voltage and State Assignment, Proceedings of the 40th Annual Design Automation Conference (DAC '03), pp.191-194, 2003.

K. M. Lepak, G. B. Bell, and M. H. Lipasti, Silent stores and store value locality, IEEE Transactions on Computers, vol.50, issue.11, 2001.

K. M. Lepak and M. H. Lipasti, On the Value Locality of Store Instructions, International Symposium on Computer Architecture (ISCA'2000), pp.182-191, 2000.

D. Li, P. H. Chou, and N. Bagherzadeh, Mode selection and mode-dependency modeling for power-aware embedded systems, Proceedings of ASP-DAC/VLSI Design 2002. 7th Asia and South Pacific Design Automation Conference and 15th International Conference on VLSI Design, pp.697-704, 2002.

H. Li, I. Puaut, and E. Rohou, Traceability of Flow Information: Reconciling Compiler Optimizations and WCET Estimation, RTNS - 22nd International Conference on Real-Time Networks and Systems, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01072138

J. Li, L. Shi, Q. Li, C. J. Xue, Y. Chen et al., Cache coherence enabled adaptive refresh for volatile STT-RAM, 2013 Design, Automation Test in Europe Conference Exhibition (DATE), pp.1247-1250, 2013.

J. Li, L. Shi, C. J. Xue, C. Yang, and Y. Xu, Exploiting set-level write non-uniformity for energy-efficient NVM-based hybrid cache, 2011 9th IEEE Symposium on Embedded Systems for Real-Time Multimedia, pp.19-28, 2011.

J. Li, C. J. Xue, and Y. Xu, STT-RAM based energy-efficiency hybrid cache for CMPs, Int. Conf. on VLSI and SoC (VLSI-SoC'11), Kowloon, Hong Kong, 2011.

Q. Li, Y. He, J. Li, L. Shi, Y. Chen et al., Compiler-Assisted Refresh Minimization for Volatile STT-RAM Cache, IEEE Trans. Computers, vol.64, pp.2169-2181, 2015.

Q. Li, J. Li, L. Shi, C. J. Xue, and Y. He, MAC: Migration-aware Compilation for STT-RAM Based Hybrid Cache in Embedded Systems, Int. Symp. on Low Power Electronics and Design (ISLPED), 2012.

Q. Li, J. Li, L. Shi, M. Zhao, C. J. Xue et al., Compiler-assisted STT-RAM-based hybrid cache for energy efficient embedded systems, Transactions on Very Large Scale Integration (VLSI) Systems, vol.22, 2014.

Q. Li, L. Shi, J. Li, C. J. Xue, and Y. He, Code Motion for Migration Minimization in STT-RAM Based Hybrid Cache, Computer Society Annual Symposium on VLSI, 2012.

Q. Li, M. Zhao, C. J. Xue, and Y. He, Compiler-assisted Preferred Caching for Embedded Systems with STT-RAM Based Hybrid Cache, ACM International Conference on Languages, Compilers, Tools and Theory for Embedded Systems (LCTES '12), 2012.

S. Li, J. H. Ahn, R. D. Strong, J. B. Brockman, D. M. Tullsen et al., McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures, Proceedings of the 42nd Annual International Symposium on Microarchitecture (MICRO 42), 2009.

Y. Li, Y. Chen, and A. K. Jones, A Software Approach for Combating Asymmetries of Non-volatile Memories, Proceedings of the 2012 ACM/IEEE International Symposium on Low Power Electronics and Design. ISLPED '12, pp.191-196, 2012.

Y. Li and A. K. Jones, Cross-layer techniques for optimizing systems utilizing memories with asymmetric access characteristics, IEEE Computer Society Annual Symposium on VLSI (ISVLSI), pp.404-409, 2012.

Y. Li, Y. Zhang, H. Li, Y. Chen, and A. K. Jones, C1C: A Configurable, Compiler-guided STT-RAM L1 Cache, ACM Trans. Archit. Code Optim, vol.10, 2013.

I. Lin and J. Chiou, High-Endurance Hybrid Cache Design in CMP Architecture With Cache Partitioning and Access-Aware Policies, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.23, pp.2149-2161, 2015.

P. Marchal, J. I. Gómez, and F. Catthoor, Optimizing the Memory Bandwidth with Loop Fusion, Proceedings of the 2nd International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS '04), 2004.

L. Mcvoy and C. Staelin, Lmbench: Portable Tools for Performance Analysis, USENIX Annual Technical Conference, pp.23-23, 1996.

R. Mehta, R. M. Owens, M. J. Irwin, R. Chen, and D. Ghosh, Techniques for low energy software, Proceedings of 1997 International Symposium on Low Power Electronics and Design, pp.72-75, 1997.

M. De Michiel, A. Bonenfant, C. Ballabriga, and H. Cassé, Partial Flow Analysis with oRange, International Symposium On Leveraging Applications of Formal Methods, Verification and Validation, vol.6416, pp.479-482, 2010.

S. Mittal, J. S. Vetter, and D. Li, LastingNVCache: A Technique for Improving the Lifetime of Non-volatile Caches, IEEE Computer Society Annual Symposium on VLSI, pp.534-540, 2014.

S. Mittal, J. Vetter, and D. Li, WriteSmoothing: Improving Lifetime of Non-volatile Caches Using Intra-set Wear-leveling, Proceedings of the ACM Great Lakes Symposium on VLSI, pp.139-144, 2014.

S. Mittal and J. S. Vetter, A Survey of Software Techniques for Using Non-Volatile Memories for Storage and Main Memory Systems, IEEE Trans. Parallel Distrib. Syst, vol.27, 2016.

S. Mittal and J. S. Vetter, EqualChance: Addressing Intra-set Write Variation to Increase Lifetime of Non-volatile Caches, 2nd USENIX Workshop on Interactions of NVM/Flash with Operating Systems and Workloads (IN-FLOW), 2014.
URL : https://hal.archives-ouvertes.fr/hal-01104645

S. S. Muchnick, Advanced Compiler Design and Implementation, 1997.

L. Niu and G. Quan, Reducing Both Dynamic and Leakage Energy Consumption for Hard Real-time Systems, Proceedings of the 2004 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems. CASES '04, pp.140-148, 2004.

H. Noguchi, K. Ikegami, K. Kushida, K. Abe, S. Itai et al., A 3.3ns-access-time 71.2µW/MHz 1Mb embedded STT-MRAM using physically eliminated read-disturb scheme and normally-off memory architecture, pp.1-3, 2015.

D. Oehlert, S. Saidi, and H. Falk, Compiler-based Extraction of Event Arrival Functions for Real-Time Systems Analysis, 30th Euromicro Conference on Real-Time Systems, vol.4, 2018.

H. Ozaktas, C. Rochange, and P. Sainrat, Minimizing the Cost of Synchronisations in the WCET of Real-time Parallel Programs, Proceedings of the 17th International Workshop on Software and Compilers for Embedded Systems. SCOPES '14, pp.98-107, 2014.

X. Pan and R. Teodorescu, NVSleep: Using non-volatile memory to enable fast sleep/wakeup of idle cores, International Conference on Computer Design, ICCD, 2014.

P. R. Panda, N. D. Dutt, A. Nicolau, F. Catthoor, A. Vandecappelle et al., Data Memory Organization and Optimizations in Application-Specific Systems, IEEE Design & Test of Computers, vol.18, pp.56-68, 2001.

S. P. Park, S. Gupta, N. Mojumder, A. Raghunathan, and K. Roy, Future cache design using STT MRAMs for improved energy efficiency: Devices, circuits and architecture, DAC Design Automation Conference, pp.492-497, 2012.

P. Péneau, R. Bouziane, A. Gamatié, E. Rohou, F. Bruguier et al., Loop optimization in presence of STT-MRAM caches: A study of performance-energy tradeoffs, 26th International Workshop on Power and Timing Modeling, Optimization and Simulation, pp.162-169, 2016.

P. Péneau, D. Novo, F. Bruguier, L. Torres, G. Sassatelli et al., Improving the Performance of STT-MRAM LLC through Enhanced Cache Replacement Policy, ARCS: Architecture of Computing Systems, LNCS, vol.10793, Braunschweig, pp.168-180, 2018.

F. Pereira, G. Vieira-leobas, and A. Gamatié, Static Prediction of Silent Stores
URL : https://hal.archives-ouvertes.fr/lirmm-01912634

M. Powell, S. Yang, B. Falsafi, K. Roy, and T. N. Vijaykumar, Gated-Vdd: A Circuit Technique to Reduce Leakage in Deep-submicron Cache Memories, Proceedings of the 2000 International Symposium on Low Power Electronics and Design (ISLPED '00), pp.90-95, 2000.

Q. Li, MGC: Multiple graph-coloring for non-volatile memory based hybrid Scratchpad Memory, Workshop on Interaction between Compilers and Computer Architectures (INTERACT), 2012.

K. Qiu, J. Luo, Z. Gong, W. Zhang, J. Wang et al., Refresh-aware loop scheduling for high performance low power volatile STT-RAM, 34th IEEE International Conference on Computer Design, ICCD 2016, pp.209-216, 2016.

B. Quan, T. Zhang, T. Chen, and J. Wu, Prediction table based management policy for STT-RAM and SRAM hybrid cache, 2012 7th International Conference on Computing and Convergence Technology (ICCCT), pp.1092-1097, 2012.

J. Ramanujam, J. Hong, M. Kandemir, and A. Narayan, Reducing memory requirements of nested loops for embedded systems, Proceedings of the 38th Design Automation Conference, pp.359-364, 2001.

N. D. Rizzo, M. Deherrera, J. Janesky, B. Engel, J. Slaughter et al., Thermally activated magnetization reversal in submicron magnetic tunnel junctions for magnetoresistive random access memory, Appl. Phys. Lett, vol.80, 2002.

G. Rodríguez, J. Touriño, and M. T. Kandemir, Volatile STT-RAM Scratchpad Design and Data Allocation for Low Energy, vol.38, pp.1-38, 2014.

Samsung, Exynos Octa 5422, 2015.

Cortus S.A.S., APS25s+ - Enhanced Performance Embedded Microcontroller With Leading Code Density, 2017.

A. Seznec and P. Michaud, A case for (partially) TAgged GEometric history length branch prediction, Journal of Instruction Level Parallelism, vol.8, 2006.

C. W. Smullen, V. Mohan, A. Nigam, S. Gurumurthi, and M. R. Stan, Relaxing Non-volatility for Fast and Energy-efficient STT-RAM Caches, Int. Symp. on High Performance Computer Architecture (HPCA), 2011.

S. Steinke, N. Grunwald, L. Wehmeyer, R. Banakar, M. Balakrishnan et al., Reducing energy consumption by dynamic copying of instructions onto onchip memory, 15th International Symposium on System Synthesis, pp.213-218, 2002.

G. Sun, X. Dong, Y. Xie, J. Li, and Y. Chen, A novel architecture of the 3D stacked MRAM L2 cache for CMPs, 2009 IEEE 15th International Symposium on High Performance Computer Architecture, pp.239-249, 2009.

Z. Sun, X. Bi, H. (Helen) Li, W. Wong et al., Multi Retention Level STT-RAM Cache Designs with a Dynamic Refresh Scheme, Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-44), Porto Alegre, pp.329-338, 2011.

Y. Tsai and C. Chen, Energy-Efficient Trace Reuse Cache for Embedded Processors, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.19, pp.1681-1694, 2011.

H. Tseng and D. M. Tullsen, CDTT: Compiler-generated datatriggered threads, International Symposium on High Performance Computer Architecture HPCA, 2014.

H. Tseng and D. M. Tullsen, Data-triggered threads: Eliminating redundant computation, International Conference on High-Performance Computer Architecture (HPCA), 2011.

H. Tseng and D. M. Tullsen, Software data-triggered threads, Conference on Object-Oriented Programming, Systems, Languages, and Applications

J. Wang, X. Dong, Y. Xie, and N. P. Jouppi, i2WAP: Improving non-volatile cache lifetime by reducing inter- and intra-set write variations, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA), pp.234-245, 2013.

S. Wen, M. Chabbi, and X. Liu, REDSPY: Exploring Value Locality in Software, International Conference on Architectural Support for Programming Languages and Operating Systems ASPLOS, 2017.

R. Wilhelm, J. Engblom, A. Ermedahl, N. Holsti, S. Thesing et al., The Worst-case Execution-time Problem - Overview of Methods and Survey of Tools, ACM Trans. Embed. Comput. Syst, vol.7, issue.3, 2008.

X. Wu, J. Li, L. Zhang, E. Speight, R. Rajamony et al., Hybrid Cache Architecture with Disparate Memory Technologies, International Symposium on Computer Architecture (ISCA), 2009.

X. Wu, J. Li, L. Zhang, E. Speight, and Y. Xie, Power and performance of read-write aware hybrid caches with non-volatile memories, Design, Automation & Test in Europe Conf. & Exhibition (DATE), 2009.

Q. Xu, T. Mytkowicz, and N. S. Kim, Approximate Computing: A Survey, IEEE Design & Test, vol.33, pp.8-22, 2016.

H. Yang, G. R. Gao, A. Marquez, G. Cai, and Z. Hu, Power and Energy Impact by Loop Transformations, Proceedings of the Workshop on Compilers and Operating Systems for Low Power, 2000.

S. Yazdanshenas, M. R. Pirbasti, M. Fazeli, and A. Patooghy, Coding Last Level STT-RAM Cache for High Endurance and Low Power, IEEE Comput. Archit. Lett, vol.13, issue.2, ISSN 1556-6056, 2014.

Y. Ye, S. Borkar, and V. De, A new technique for standby leakage reduction in high-performance circuits, Symposium on VLSI Circuits. Digest of Technical Papers, pp.40-41, 1998.

C. Zhang, F. Vahid, and W. Najjar, A highly configurable cache architecture for embedded systems, 30th Annual International Symposium on Computer Architecture, pp.136-146, 2003.

J. Zhao, C. Xu, and Y. Xie, Bandwidth-aware reconfigurable cache design with hybrid memory technologies, 2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp.48-55, 2011.

J. Zhao, C. Xu, P. Chi, and Y. Xie, Memory and Storage System Design with Nonvolatile Memory Technologies, IPSJ Transactions on System LSI Design Methodology, vol.8, pp.2-11, 2015.

W. Zhao, E. Belhaire, Q. Mistral, C. Chappert, V. Javerliac et al., Macro-model of Spin-Transfer Torque based Magnetic Tunnel Junction device for hybrid Magnetic-CMOS design, 2006 IEEE International Behavioral Modeling and Simulation Workshop, pp.40-43, 2006.

P. Zhou, B. Zhao, J. Yang, and Y. Zhang, Energy reduction for STT-RAM using early write termination, International Conference on Computer-Aided Design (ICCAD), 2009.

I. Stirb and H. Ciocârlie, Improving performance and energy consumption with loop fusion optimization and parallelization, 2016 IEEE 17th International Symposium on Computational Intelligence and Informatics (CINTI), pp.99-104, 2016.