Abstract
As heterogeneous systems are deployed widely across many fields, reliability has become a major concern. Fault tolerance has therefore received a great deal of attention in both industry and academia, especially for safety-critical systems. Such systems require tasks to be completed correctly within a given deadline even when an error occurs, so it is imperative that they support fault tolerance. Scheduling is an efficient way to achieve fault tolerance by allocating multiple copies of tasks to processors. Existing fault-tolerant scheduling algorithms achieve fault tolerance without considering energy limits. To address this issue, this paper proposes an energy-aware fault-tolerant scheduling algorithm, DRB-FTSA-E. The algorithm adopts the active replication strategy and aims for high utilization of energy consumption while completing a set of tasks under given reliability and time constraints. It finds all schemes that meet the time and system reliability constraints and chooses the one with the maximum utilization of energy consumption as the final scheduling scheme. Simulation results show that the proposed algorithm effectively achieves the maximum utilization of energy consumption while meeting the reliability and time constraints.
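The selection step described in the abstract (keep the schemes that satisfy the deadline and the reliability goal, then pick the one with the highest energy utilization) can be illustrated with a minimal sketch. This is not the authors' DRB-FTSA-E implementation: the `Scheme` fields, the function names, and the precomputed `energy_utilization` score are hypothetical, since the abstract does not define how that metric is calculated. The helper `replicated_reliability` uses the standard formula for independent active replicas, which is also an assumption about the paper's model.

```python
from dataclasses import dataclass
from math import prod
from typing import Optional, Sequence


@dataclass(frozen=True)
class Scheme:
    """A candidate schedule (hypothetical structure, not from the paper)."""
    name: str
    makespan: float            # completion time of the whole task set
    reliability: float         # system reliability achieved by the scheme
    energy_utilization: float  # assumed to be precomputed by the caller


def replicated_reliability(replica_reliabilities: Sequence[float]) -> float:
    """Reliability of one task under active replication.

    Assumes replicas fail independently: the task fails only if
    every replica fails.
    """
    return 1.0 - prod(1.0 - r for r in replica_reliabilities)


def select_scheme(
    schemes: Sequence[Scheme],
    deadline: float,
    reliability_goal: float,
) -> Optional[Scheme]:
    """Return the feasible scheme with the highest energy utilization.

    A scheme is feasible if it meets both the deadline and the
    reliability goal. Returns None when no scheme is feasible.
    """
    feasible = [
        s for s in schemes
        if s.makespan <= deadline and s.reliability >= reliability_goal
    ]
    return max(feasible, key=lambda s: s.energy_utilization, default=None)
```

Passing the score in as data keeps the sketch independent of any particular energy model; a caller would substitute whatever definition of energy utilization the full paper gives.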
Acknowledgment
The authors would like to express their sincere gratitude to the editors and the referees. This work was supported by the National Natural Science Foundation of China (Grant Nos. 61602350, 61602349) and the Open Foundation of Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (2016znss26C).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Guo, T., Liu, J., Hu, W., Wei, M. (2018). Energy-Aware Fault-Tolerant Scheduling Under Reliability and Time Constraints in Heterogeneous Systems. In: Huang, DS., Gromiha, M., Han, K., Hussain, A. (eds) Intelligent Computing Methodologies. ICIC 2018. Lecture Notes in Computer Science, vol 10956. Springer, Cham. https://doi.org/10.1007/978-3-319-95957-3_5
DOI: https://doi.org/10.1007/978-3-319-95957-3_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95956-6
Online ISBN: 978-3-319-95957-3
eBook Packages: Computer Science, Computer Science (R0)