Abstract
Tasks in hard real-time systems must meet preset deadlines even in the presence of transient faults, so the analysis of worst-case finish time (WCFT) must account for the extra time spent re-executing faulty tasks. Existing solutions can only estimate the WCFT and usually under- or over-estimate it significantly. In this work, we show that a task set experiences its WCFT if and only if its critical task incurs all expected transient faults. We present a method that identifies the critical task and the WCFT in O(|V| + |E|) time, where |V| and |E| are the numbers of tasks and of dependencies between tasks, respectively. The method can be used to test the feasibility of directed acyclic graph (DAG) based task sets scheduled on a wide variety of fault-prone multi-processor systems, where the processors may be homogeneous or heterogeneous, DVS-capable or DVS-incapable, etc. Common practices, which have the same time complexity as the proposed critical-task method, can either underestimate the worst case by up to 25% or overestimate it by 13%. Based on the proposed critical-task method, a simulated-annealing scheduling algorithm is developed to find an energy-efficient fault-tolerant schedule for a given DAG task set. Experimental results show that the proposed critical-task method outperforms a common practice by up to 40% in terms of energy savings.
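The linear-time claim can be illustrated with a minimal sketch under a deliberately simplified model. This is not the method of the paper, whose details the abstract does not give. The sketch assumes that only precedence constraints delay a task (no processor contention), that each fault re-executes the struck task once at its full execution time, and that exactly k faults occur. The names `worst_case_finish_time`, `wcet`, `edges` and `k` are hypothetical. Under these assumptions, two longest-path passes over a topological order give, for every task, the finish time if that task absorbs all k faults.

```python
from collections import deque


def worst_case_finish_time(wcet, edges, k):
    """Return (critical_task, wcft) for a DAG under k re-execution faults.

    Simplified, assumed model: precedence constraints only, and each fault
    adds one full re-execution (wcet[v]) of the task v that it strikes.
    """
    succ = {v: [] for v in wcet}
    pred = {v: [] for v in wcet}
    indeg = {v: 0 for v in wcet}
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
        indeg[v] += 1

    # Topological order via Kahn's algorithm: O(|V| + |E|).
    order = []
    queue = deque(v for v in wcet if indeg[v] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != len(wcet):
        raise ValueError("task graph contains a cycle")

    # head[v]: length of the longest path ending at v (v included).
    head = {}
    for v in order:
        head[v] = wcet[v] + max((head[u] for u in pred[v]), default=0)

    # tail[v]: length of the longest path starting at v (v included).
    tail = {}
    for v in reversed(order):
        tail[v] = wcet[v] + max((tail[w] for w in succ[v]), default=0)

    # Longest path through v is head + tail - wcet; k faults on v add k * wcet.
    def faulty_finish(v):
        return head[v] + tail[v] - wcet[v] + k * wcet[v]

    critical = max(order, key=faulty_finish)
    return critical, faulty_finish(critical)


# A chain a -> b -> c (3 time units each) plus an independent task x (5 units).
wcet = {"a": 3, "b": 3, "c": 3, "x": 5}
edges = [("a", "b"), ("b", "c")]
print(worst_case_finish_time(wcet, edges, k=3))
```

In this example the fault-free longest path is the chain a, b, c, yet with three faults the task that maximizes the finish time is x. An analysis that only inflates the fault-free critical path would therefore underestimate the worst case in this toy model.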
Additional information
A preliminary version of the paper was published in the Proceedings of ASP-DAC 2014.
This work is partially supported by the National High Technology Research and Development 863 Program of China under Grant Nos. 2015AA015304 and 2013AA013202, the National Natural Science Foundation of China under Grant No. 61472052, and Chongqing Research Program under Grant No. cstc2014yykfB40007.
About this article
Cite this article
Cui, XT., Wu, KJ., Wei, TQ. et al. Worst-Case Finish Time Analysis for DAG-Based Applications in the Presence of Transient Faults. J. Comput. Sci. Technol. 31, 267–283 (2016). https://doi.org/10.1007/s11390-016-1626-6
DOI: https://doi.org/10.1007/s11390-016-1626-6