Makeflow: A portable abstraction for data intensive computing on clusters, clouds, and grids, In: 1st ACM SWEET SIGMOD, 2012. ,
Kepler: an extensible system for design and execution of scientific workflows, Proc. of 16th SSDBM, pp.423-424, 2004. ,
Peak Power Management to Meet Thermal Design Power in Fault-Tolerant Embedded Systems, IEEE Transactions on Parallel and Distributed Systems, 2018. ,
A Bi-Criteria Scheduling Heuristic for Distributed Embedded Systems under Reliability and Real-Time Constraints, Dependable Systems and Networks (DSN), 2004. ,
StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, In: Concur. and Comp.: Pract. and Exp, vol.23, pp.187-198, 2011. ,
URL : https://hal.archives-ouvertes.fr/inria-00384363
Scheduling computational workflows on failure-prone platforms, In: Int. J. of Networking and Computing, vol.6, pp.2-26, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01251939
Energy-Aware Partitioning for Multiprocessor Real-Time Systems, Proc. 17th Int. Symp. Parallel and Distributed Processing. IPDPS '03, 2003. ,
A communication-induced checkpointing protocol that ensures rollback-dependency trackability, Proc. IEEE 27th International Symposium on Fault Tolerant Computing, pp.68-77, 1997. ,
Detecting and Correcting Data Corruption in Stencil Applications through Multivariate Interpolation, In: FTS. IEEE, 2015. ,
Detecting Silent Data Corruption Through Data Dynamic Monitoring for Scientific Applications, SIGPLAN Notices, vol.49, pp.381-382, 2014. ,
Assessing general-purpose algorithms to cope with fail-stop and silent errors, In: ACM Trans. Parallel Computing, vol.3, issue.2, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01066664
Lightweight Silent Data Corruption Detection Based on Runtime Data Analysis for HPC Applications, 2015. ,
Characterization of scientific workflows, Workflows in Support of Large-Scale Science (WORKS), pp.1-10, 2008. ,
Measuring the performance of schedulability tests, vol.1, pp.129-154, 2005. ,
Parallel algorithms for series parallel graphs, Algorithms -ESA'96, pp.277-289, 1996. ,
A Note on the Complexity of Network Reliability Problems, IEEE Trans. Inf. Theory, vol.47, pp.1971-1988, 2004. ,
Algorithm-based fault tolerance applied to high performance computing, J. Parallel Distrib. Comput, vol.69, pp.410-416, 2009. ,
A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems, In: Journal of Parallel and Distributed computing, vol.61, pp.810-837, 2001. ,
Correlation-Aware Heuristics for Evaluating the Distribution of the Longest Path Length of a DAG with Random Weights, IEEE Trans. Parallel Distributed Systems, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01412922
A Markov Chain Monte Carlo Approach to Cost Matrix Generation for Scheduling Performance Evaluation, 2018 International Conference on High Performance Computing & Simulation (HPCS), pp.460-467, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-02300458
Design for a Soft Error Resilient Dynamic Task-Based Runtime, In: IPDPS. IEEE, pp.765-774, 2015. ,
QoS-Adaptive Approximate Real-Time Computation for Mobility-Aware IoT Lifetime Optimization, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp.1-1, 2018. ,
Affinity-driven modeling and scheduling for makespan optimization in heterogeneous multiprocessor systems, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018. ,
Toward Exascale Resilience: 2014 update, In: Supercomputing frontiers and innovations, vol.1, issue.1, 2014. ,
Computing the expected makespan of task graphs in the presence of silent errors, P2S2'2016, the 9th Int. Workshop on Programming Models and Systems Software for High-End Computing, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01354711
Design and implementation of the ScaLAPACK LU, QR, and Cholesky factorization routines, In: Scientific Programming, vol.5, pp.173-184, 1996. ,
Adaptive Task Checkpointing and Replication: Toward Efficient Fault-Tolerant Grids, In: IEEE Trans. Parallel Distributed Systems, vol.20, pp.180-190, 2009. ,
Approximation Algorithms for Bin Packing: A Survey, pp.46-93, 1997. ,
Scheduling and automatic parallelization, Birkhäuser, 2000. ,
URL : https://hal.archives-ouvertes.fr/hal-00856645
A Survey of Hard Real-time Scheduling for Multiprocessor Systems, In: ACM Comput. Surv, vol.43, 2011. ,
Pegasus: A framework for mapping complex scientific workflows onto distributed systems, In: Scientific Programming, vol.13, pp.219-237, 2005. ,
Pegasus, a Workflow Management System for Science Automation, Future Generation Computer Systems, vol.46, pp.17-35, 2015. ,
The impact of new technology on soft error rates, International Reliability Physics Symposium. IEEE, pp.5-9, 2011. ,
A survey of graph layout problems, In: ACM Computing Surveys, vol.34, pp.313-356, 2002. ,
The structural cause of file size distributions, pp.361-370, 2001. ,
Scheduling for Parallel Processing, 2009. ,
Dee: A distributed fault tolerant workflow enactment engine for grid computing, International Conference on High Performance Computing and Communications, pp.704-716, 2005. ,
Askalon: A development and grid computing environment for scientific workflows, pp.450-471, 2007. ,
The synergy between power-aware memory systems and processor voltage scaling, International Workshop on Power-Aware Computer Systems, pp.164-179, 2003. ,
Computers and Intractability, a Guide to the Theory of NP-Completeness, 1979. ,
A novel bicriteria scheduling heuristics providing a guaranteed global system failure rate, IEEE Trans. Dependable and Secure Computing, vol.6, pp.241-254, 2009. ,
URL : https://hal.archives-ouvertes.fr/hal-00746768
A Branch-Bound Solution to the General Scheduling Problem, Oper. Res, vol.16, pp.353-361, 1968. ,
Energy-Aware Fault-Tolerant Scheduling Under Reliability and Time Constraints in Heterogeneous Systems, pp.36-46, 2018. ,
Exploiting primary/backup mechanism for energy efficiency in dependable real-time systems, In: Journal of Systems Architecture, vol.78, 2017. ,
Computational complexity of PERT problems, vol.18, pp.139-147, 1988. ,
Checkpointing Workflows: Simulation Code, 2017. ,
Resource-aware Partitioned Scheduling for Heterogeneous Multicore Real-time Systems, Proceedings of the 55th Annual Design Automation Conference. DAC '18, vol.124, pp.1-124, 2018. ,
Heterogeneous real-time systems, 2020. ,
Checkpointing Workflows for Fail-Stop Errors, IEEE Trans. Computers, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01701611
Code to schedule for periodic real-time tasks under reliability constraints with minimal energy consumption, 2019. ,
Energy-aware strategies for reliability-oriented real-time task allocation on heterogeneous platforms, 2020. ,
URL : https://hal.archives-ouvertes.fr/hal-02500381
Energy-aware Fault-tolerant Scheduling for Hard Real-time Systems, 2015. ,
On reliability management of energy-aware real-time systems through task replication, IEEE Transactions on Parallel and Distributed Systems, vol.28, pp.813-825, 2017. ,
Fault-secure scheduling of arbitrary task graphs to multiprocessor systems, Proceedings International Conference on Dependable Systems and Networks. DSN, pp.203-212, 2000. ,
Fault-Tolerance Techniques for High-Performance Computing, Computer Communications and Networks, 2015. ,
A Parallel Branch-and-Bound Algorithm for Computing Optimal Task Graph Schedules, Grid and Cooperative Computing: Second International Workshop, GCC 2003, pp.18-25, 2003. ,
Energy-Efficient Fault-Tolerant Mapping and Scheduling on Heterogeneous Multiprocessor Real-Time Systems, In: IEEE Access, vol.6, 2018. ,
Algorithm-Based Fault Tolerance for Matrix Operations, IEEE Trans. Comput, vol.33, pp.518-528, 1984. ,
Grid workflow: a flexible failure handling framework for the grid, Proceedings. 12th IEEE International Symposium on, pp.126-137, 2003. ,
Modeling stream processing applications for dependability evaluation, Dependable Systems and Networks (DSN), 2011. ,
Dynamic voltage scaling for systemwide energy minimization in real-time embedded systems, Proceedings of the 2004 international symposium on Low power electronics and design, pp.78-81, 2004. ,
Performance Under Failures of DAG-based Parallel Computing, CCGRID '09, 2009. ,
Worst-Case Performance Bounds for Simple One-Dimensional Packing Algorithms, In: SIAM Journal on Computing, vol.3, pp.299-325, 1974. ,
Characterizing and profiling scientific workflows, Future Generation Computer Systems, vol.29, pp.682-692, 2013. ,
A novel adaptive checkpointing method based on information obtained from workflow structure, Computer Science, vol.17, issue.3, 2016. ,
Fault-tolerant Dynamic Task Graph Scheduling, SC '14, pp.719-730, 2014. ,
Dependable computing and fault-tolerance, In: Digest of Papers FTCS, vol.15, pp.2-11, 1985. ,
On improving fault tolerance for heterogeneous Hadoop MapReduce clusters, 2013 International Conference on Cloud Computing and Big Data, pp.38-43, 2013. ,
Scheduling algorithms for multiprogramming in a hard-real-time environment, In: Journal of the ACM (JACM), vol.20, pp.46-61, 1973. ,
Top ten exascale research challenges, DOE ASCAC subcommittee report, pp.1-86, 2014. ,
Probability and Computing: Randomized Algorithms and Probabilistic Analysis, 2005. ,
Scheduling under Uncertainty: Bounding the Makespan Distribution, Computational Discrete Mathematics: Advanced Lectures, pp.79-97, 2001. ,
NanoCheckpoints: A Task-Based Asynchronous Dataflow Framework for Efficient and Scalable Checkpoint/Restart, pp.99-102 ,
HEARS: A heterogeneous energy-aware real-time scheduler, In: Microprocessors and Microsystems, vol.72, 2020. ,
, In: Architecture Design for Soft Errors, pp.1-41, 2008.
Scheduling: Theory, Algorithms, and Systems, 5th ed., 2016. ,
A mapping algorithm for parallel sparse Cholesky factorization, In: SIAM J. on Scientific Computing, vol.14, pp.1253-1257, 1993. ,
The Complexity of Counting Cuts and of Computing the Probability that a Graph is Connected, In: SIAM J. Comp, vol.12, issue.4, pp.777-788, 1983. ,
A novel fault-tolerant scheduling algorithm for precedence constrained tasks in real-time heterogeneous systems, Parallel Computing, vol.32, pp.331-356, 2006. ,
An efficient fault-tolerant scheduling algorithm for periodic real-time tasks in heterogeneous platforms, 16th IEEE International Symposium on Object/component/service-oriented Real-time distributed Computing, pp.1-7, 2013. ,
Multiple frequency selection in DVFS-enabled processors to minimize energy consumption, In: Energy-Efficient Distributed Computing Systems, pp.443-463, 2012. ,
Energy-aware scheduling algorithm for time-constrained workflow tasks in DVFS-enabled cloud environment, In: Simulation Modelling Practice and Theory, vol.87, 2018. ,
Combinatorial Optimization: Polyhedra and Efficiency, vol.24, 2003. ,
The Completion Time of PERT Networks, In: The Journal of the Operational Research Society, vol.34, issue.2, pp.155-158, 1983. ,
Scheduling Task Graphs Optimally with A*, J. Supercomput, vol.51, pp.310-332, 2010. ,
Fault Tolerant Preconditioned Conjugate Gradient for Sparse Linear System Solution, In: ICS. ACM, 2012. ,
Energy-Efficient Multicore Scheduling for Hard Real-Time Systems: A Survey, In: ACM Trans. Embed. Comput. Syst, vol.17, issue.6, 2018. ,
Community resources for enabling research in distributed scientific workflows, 2014 IEEE 10th International Conference on, vol.1, pp.177-184, 2014. ,
Monte Carlo Methods and the PERT Problem, Operations Research, vol.11, issue.5, pp.839-860, 1963. ,
Accurate run-time prediction of performance degradation under frequency scaling, Workshop on Operating Systems Platforms for Embedded Real-Time applications, p.58, 2007. ,
Reliability Aware Power Management for Dual-processor Real-time Embedded Systems, Proceedings of the 47th Design Automation Conference. DAC '10, pp.819-824, 2010. ,
Designing and Modelling Selective Replication for Fault-Tolerant HPC Applications, 2017. ,
CRC-Based Memory Reliability for Task-Parallel HPC Applications, pp.1101-1112, 2016. ,
Reliability-Aware Energy Management in Mixed-Criticality Systems, IEEE Transactions on Sustainable Computing, 2018. ,
A standard task graph set for fair evaluation of multiprocessor scheduling algorithms, In: Journal of Scheduling, vol.5, pp.379-394, 2002. ,
An uncoordinated asynchronous checkpointing model for hierarchical scientific workflows, In: Journal of Computer and System Sciences, vol.76, issue.6, pp.403-415, 2010. ,
Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Transactions on Parallel and Distributed Systems, vol.13, pp.260-274, 2002. ,
On the Optimum Checkpoint Selection Problem, In: SIAM J. Comput, vol.13, p.3, 1984. ,
STEM: A Thermal-Constrained Real-Time Scheduling for 3D Heterogeneous-ISA Multicore Processors, IEEE Transactions on Computers, vol.67, issue.6, 2018. ,
The Recognition of Series Parallel Digraphs, Proc. of STOC'79, pp.1-12, 1979. ,
Scheduling hard real-time tasks in heterogeneous multiprocessor platforms subject to energy and temperature constraints, 2017. ,
The Complexity of Enumeration and Reliability Problems, In: SIAM J. Comput, vol.8, issue.3, pp.410-421, 1979. ,
Replication-Based Fault-Tolerance for Large-Scale Graph Processing, 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, pp.562-573, 2014. ,
Techniques to reduce the soft error rate of a high-performance microprocessor, In: ACM SIGARCH Computer Architecture News, vol.32, p.264, 2004. ,
Swift: A language for distributed parallel scripting, In: Parallel Computing, vol.37, pp.633-652, 2011. ,
The Taverna workflow suite: designing and executing workflows of Web Services on the desktop, web or in the cloud, In: Nucleic acids research, p.328, 2013. ,
Maximizing reliability of energy constrained parallel applications on heterogeneous distributed systems, In: Journal of Computational Science, vol.26, 2018. ,
Energy-Efficient Scheduling Algorithms for Real-Time Parallel Applications on Heterogeneous Distributed Embedded Systems, IEEE Transactions on Parallel and Distributed Systems, vol.28, 2017. ,
Energy-efficient fault-tolerant scheduling of reliable parallel applications on heterogeneous distributed embedded systems, IEEE Transactions on Sustainable Computing, vol.3, pp.167-181, 2018. ,
Scheduling Parallel Applications on Heterogeneous Distributed Systems, 2019. ,
Minimizing energy consumption with reliability goal on heterogeneous embedded systems, Journal of Parallel and Distributed Computing, vol.127, pp.44-57, 2019. ,
An approximation scheme for energy-efficient scheduling of real-time tasks in heterogeneous multiprocessor systems, pp.694-699, 2009. ,
Energy-efficient scheduling for moldable real-time tasks on heterogeneous computing platforms, In: Journal of Systems Architecture, vol.74, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01436209
A C-DAG task model for scheduling complex real-time tasks on heterogeneous platforms: preemption matters, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-01971594
Enabling In-situ Execution of Coupled Scientific Workflow on Multi-core Platform, Proc. 26th IEEE IPDPS, pp.1352-1363, 2012. ,
Energy management under general task-level reliability constraints, In: 18th Real-Time and Embedded Technology and Applications Symposium (RTAS), pp.285-294, 2012. ,
Variation-aware task allocation and scheduling for improving reliability of real-time MPSoCs, In: DATE, pp.171-176, 2018. ,
ASC: Improving Spark driver performance with automatic Spark checkpoint, 2016 18th International Conference on Advanced Communication Technology (ICACT), pp.607-611, 2016. ,
System-level energy-efficient dynamic task scheduling, Proceedings of the 42nd annual Design Automation Conference, pp.628-631, 2005. ,
Checkpointing Workflows for Fail-Stop Errors, IEEE Transactions on Computers, vol.67, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01701611
A generic approach to scheduling and checkpointing workflows, The International Journal of High Performance Computing Applications, vol.33, pp.1255-1274, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-01798627
Articles in International Refereed Conferences
Safety Requirements Specification and Verification for Railway Interlocking Systems, 2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC), vol.1, pp.335-340, 2016. ,
Checkpointing Workflows for Fail-Stop Errors, 2017 IEEE International Conference on Cluster Computing (CLUSTER). Honolulu, pp.487-497, 2017. ,
URL : https://hal.archives-ouvertes.fr/hal-01701611
A Generic Approach to Scheduling and Checkpointing Workflows, Proceedings of the 47th International Conference on Parallel Processing, vol.28, pp.1-28, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01798627
Improved energy-aware strategies for periodic real-time tasks under reliability constraints, 40th IEEE Real-Time Systems Symposium (RTSS). IEEE, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-02056520