Skip to main content

Advertisement

Log in

Worst-Case Finish Time Analysis for DAG-Based Applications in the Presence of Transient Faults

  • Regular Paper
  • Published:
Journal of Computer Science and Technology Aims and scope Submit manuscript

Abstract

Tasks in hard real-time systems are required to meet preset deadlines, even in the presence of transient faults, and hence the analysis of worst-case finish time (WCFT) must consider the extra time incurred by re-executing tasks that were faulty. Existing solutions can only estimate WCFT and usually result in significant under- or over-estimation. In this work, we conclude that a sufficient and necessary condition of a task set experiencing its WCFT is that its critical task incurs all expected transient faults. A method is presented to identify the critical task and WCFT in O(|V | + |E|) where |V | and |E| are the number of tasks and dependencies between tasks, respectively. This method finds its application in testing the feasibility of directed acyclic graph (DAG) based task sets scheduled in a wide variety of fault-prone multi-processor systems, where the processors could be either homogeneous or heterogeneous, DVS-capable or DVS-incapable, etc. The common practices, which require the same time complexity as the proposed critical-task method, could either underestimate the worst case by up to 25%, or overestimate by 13%. Based on the proposed critical-task method, a simulated-annealing scheduling algorithm is developed to find the energy efficient fault-tolerant schedule for a given DAG task set. Experimental results show that the proposed critical-task method wins over a common practice by up to 40% in terms of energy saving.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Wei T, Mishra P, Wu K, Zhou J. Quasi-static fault-tolerant scheduling schemes for energy-efficient hard real-time systems. J. Systems and Software, 2012, 85(6): 1386–1399.

    Article  Google Scholar 

  2. Liu C L, Layland J W. Scheduling algorithms for multiprogramming in a hard-real-time environment. Journal of the ACM (JACM), 1973, 20(1): 46–61.

    Article  MathSciNet  MATH  Google Scholar 

  3. Kopetz H, Grunsteidl G. TTP — A protocol for faulttolerant real-time systems. Computer, 1994, 27(1): 14–23.

    Article  Google Scholar 

  4. Chevochot P, Puaut I. Scheduling fault-tolerant distributed hard real-time tasks independently of the replication strategies. In Proc. the 6th Int. Conf. Real-Time Computing Systems and Applications, Dec. 1999, pp.356-363.

  5. Dima C, Girault A, Lavarenne C, Sorel Y. Off-line realtime fault-tolerant scheduling. In Proc. the 9th Euromicro Workshop on Parallel and Distributed Processing, Feb. 2001, pp.410-417.

  6. Girault A, Kalla H, Sighireanu M, Sorel Y. An algorithm for automatically obtaining distributed and fault-tolerant static schedules. In Proc. International Conference on Dependable Systems and Networks, Jun. 2003, pp.159-168.

  7. Pop P, Izosimov V, Eles P, Peng Z. Design optimization of time- and cost-constrained fault-tolerant embedded systems with checkpointing and replication. IEEE Trans. Very Large Scale Integration Systems, 2009, 17(3): 389–402.

    Article  Google Scholar 

  8. Kandasamy N, Hayes J P, Murray B T. Transparent recovery from intermittent faults in time-triggered distributed systems. IEEE Trans. Computers, 2003, 52(2): 113–125.

    Article  Google Scholar 

  9. Olteanu A, Pop F, Dobre C, Cristea V. A dynamic rescheduling algorithm for resource management in large scale dependable distributed systems. Computers & Mathematics with Applications, 2012, 63(9): 1409–1423.

    Article  MATH  Google Scholar 

  10. Pop F, Dobre C, Cristea V. Performance analysis of grid DAG scheduling algorithms using MONARC simulation tool. In Proc. the 7th ISPDC, Jul. 2008, pp.131-138.

  11. Pop F, Cristea V. Intelligent strategies for DAG scheduling optimization in grid environments. arXiv Preprint, arXiv: 1106.5303, 2011. http://arxiv.org/ftp/arxiv/papers/1106/1106.5303.pdf, August 2015.

  12. Ghosh S, Melhem R, Mosse D. Enhancing real-time schedules to tolerate transient faults. In Proc. the 16th IEEE Real-Time Systems Symposium, Dec. 1995, pp.120-129.

  13. Burns A, Davis R, Punnekkat S. Feasibility analysis of faulttolerant real-time task sets. In Proc. the 8th Euromicro Workshop on Real-Time Systems, Jun. 1996, pp.29-33.

  14. Audsley N, Burns A, Richardson M et al. Applying new scheduling theory to static priority pre-emptive scheduling. Software Engineering Journal, 1993, 8(5): 284–292.

    Article  Google Scholar 

  15. Liberato F, Melhem R, Moss´e D. Tolerance to multiple transient faults for aperiodic tasks in hard real-time systems. IEEE Transactions on Computers, 2000, 49(9): 906–914.

    Article  MathSciNet  Google Scholar 

  16. Aydin H. Exact fault-sensitive feasibility analysis of realtime tasks. IEEE Trans. Computers, 2007, 56(10): 1372–1386.

    Article  MathSciNet  Google Scholar 

  17. Chrobak M, Hurand M, Sgall J. Fast algorithms for testing fault-tolerance of sequenced jobs with deadlines. In Proc. the 28th IEEE RTSS, Dec. 2007, pp.139-148.

  18. Thekkilakattil A, Dobrin R, Punnekkat S et al. Resource augmentation for fault-tolerance feasibility of real-time tasks under error bursts. In Proc. the 20th Int. Conf. Real-Time and Network Systems, Nov. 2012, pp.41-50.

  19. Goddard S. On the management of latency in the synthesis of real-time signal processing systems from processing graphs [Ph.D. Thesis]. The University of North Carolina at Chapel Hill, 1998.

  20. Liu C, Anderson J H. Supporting soft real-time DAG-based systems on multiprocessors with no utilization loss. In Proc. the 31st IEEE RTSS, Nov. 30-Dec. 3, 2010, pp.3-13.

  21. Bauer G, Kopetz H. Transparent redundancy in the timetriggered architecture. In Proc. International Conference on Dependable Systems and Networks, Jun. 2000, pp.5-13.

  22. Bretz E A. By-wire cars turn the corner. IEEE Spectrum, 2001, 38(4): 68–73.

    Article  Google Scholar 

  23. Kopetz H. Why time-triggered architectures will succeed in large hard real-time systems. In Proc. the 5th IEEE FTDCS, Aug. 1995, pp.2-9.

  24. Obermaisser R. Event-Triggered and Time-Triggered Control Paradigms. Springer US, 2004.

  25. Poledna S. Fault-Tolerant Real-Time Systems: The Problem of Replica Determinism. Springer US, 1996.

  26. Suri N, Walter C J, Hugue M M. Advances in ULTRADependable Distributed Systems. Los Alamitos, CA, USA: IEEE Computer Society Press, 1994.

    Google Scholar 

  27. Pop P, Eles P, Peng Z. Schedulability analysis for systems with data and control dependencies. In Proc. the 12th Euromicro Conf. Real-Time Systems, June 2000, pp.201-208.

  28. Laplante P A. Real-Time Systems Design and Analysis. John Wiley & Sons, 2004.

  29. Liu Y, Liang H, Wu K. Scheduling for energy efficiency and fault tolerance in hard real-time systems. In Proc. the DATE, Mar. 2010, pp.1444-1449.

  30. Manber U. Introduction to Algorithms: A Creative Approach. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 1989.

    MATH  Google Scholar 

  31. Luo J, Jha N K. Static and dynamic variable voltage scheduling algorithms for real-time heterogeneous distributed embedded systems. In Proc. the 15th International Conference on VLSI Design, Jan. 2002, pp.719-726.

  32. Liu Y, Mok A K. An integrated approach for applying dynamic voltage scaling to hard real-time systems. In Proc. the 9th IEEE RTAS, May 2003, pp.116-123.

  33. Cho Y, Chang N, Chakrabarti C, Vrudhula S. High-level power management of embedded systems with applicationspecific energy cost functions. In Proc. the 43rd Annual Design Automation Conference, July 2006, pp.568-573.

  34. Kianzad V, Bhattacharyya S S, Qu G. CASPER: An integrated energy-driven approach for task graph scheduling on distributed embedded systems. In Proc. the 16th IEEE ASAP, Jul. 2005, pp.191-197.

  35. Hua S, Qu G. Power minimization techniques on distributed real-time systems by global and local slack management. In Proc. the 10th ASP-DAC, Jan. 2005, pp.830-835.

  36. Schmitz M T, Al-Hashimi B M, Eles P. Iterative schedule optimization for voltage scalable distributed embedded systems. ACM Trans. Embedded Computing Systems, 2004, 3(1): 182–217.

    Article  Google Scholar 

  37. Lin M, Ding C. Parallel genetic algorithms for DVS scheduling of distributed embedded systems. In Proc. the 3rd HPCC, Sept. 2007, pp.180-191.

  38. Huang J, Buckl C, Raabe A, Knoll A. Energy-aware task allocation for network-on-chip based heterogeneous multiprocessor systems. In Proc. the 19th PDP, Feb. 2011, pp.447-454.

  39. Hung C M, Chen J J, Kuo T W. Energy-efficient real-time task scheduling for a DVS system with a non-DVS processing element. In Proc. the 27th IEEE International Real-Time Systems Symposium, Dec. 2006, pp.303-312.

  40. Xu R, Melhem R, Mosse D. Energy-aware scheduling for streaming applications on chip multiprocessors. In Proc. the 28th IEEE Int. Real-Time Systems Symp., Dec. 2007, pp.25-38.

  41. Černý V. Thermodynamical approach to the traveling salesman problem: An efficient simulation algorithm. Journal of Optimization Theory and Applications, 1985, 45(1): 41–51.

    Article  MathSciNet  MATH  Google Scholar 

  42. Kirkpatrick S. Optimization by simulated annealing: Quantitative studies. Journal of Statistical Physics, 1984, 34(5/6): 975–986.

    Article  MathSciNet  Google Scholar 

  43. Živojnović V, Velarde J M, Schl¨ager C, Meyr H. DSPstone: A DSP-oriented benchmarking methodology. In Proc. the ICSPAT, Oct. 1994, pp.715-720.

Download references

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Kai-Jie Wu or Tong-Quan Wei.

Additional information

A preliminary version of the paper was published in the Proceedings of ASP-DAC 2014.

This work is partially supported by the National High Technology Research and Development 863 Program of China under Grant Nos. 2015AA015304 and 2013AA013202, the National Natural Science Foundation of China under Grant No. 61472052, and Chongqing Research Program under Grant No. cstc2014yykfB40007.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Cui, XT., Wu, KJ., Wei, TQ. et al. Worst-Case Finish Time Analysis for DAG-Based Applications in the Presence of Transient Faults. J. Comput. Sci. Technol. 31, 267–283 (2016). https://doi.org/10.1007/s11390-016-1626-6

Download citation

  • Received:

  • Revised:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11390-016-1626-6

Keywords

Navigation