Fault Tolerance in Cloud Manufacturing: An Overview

  • Conference paper
Mobile Computing, Applications, and Services (MobiCASE 2022)

Abstract

The use of edge and cloud computing to improve the profitability of manufacturing is increasing rapidly in modern industry. As a result, several challenges have arisen over the years that require urgent attention. Among these, coping with faults in edge and cloud computing, and recovering from both permanent and temporary faults, have become prominent issues. In this paper, we focus on the challenges of applying fault tolerance techniques to edge and cloud computing in the context of manufacturing, and we investigate the current state of the proposed approaches by categorizing them into several groups. Moreover, we identify critical gaps in the research domain and present them as open research directions.

This research was partially sponsored by the Knowledge Foundation (KKS) under the SACSys project, and by the research center XPRES (Excellence in Production Research) - a strategic research area in Sweden.

Author information

Corresponding author

Correspondence to Auday Al-Dulaimy.

Copyright information

© 2023 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Al-Dulaimy, A., Ashjaei, M., Behnam, M., Nolte, T., Papadopoulos, A.V. (2023). Fault Tolerance in Cloud Manufacturing: An Overview. In: Taheri, J., Villari, M., Galletta, A. (eds) Mobile Computing, Applications, and Services. MobiCASE 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 495. Springer, Cham. https://doi.org/10.1007/978-3-031-31891-7_7

  • DOI: https://doi.org/10.1007/978-3-031-31891-7_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-31890-0

  • Online ISBN: 978-3-031-31891-7

  • eBook Packages: Computer Science (R0)
