Abstract
The use of edge and cloud computing to improve the profitability of manufacturing is increasing rapidly in modern industry. As a result, several challenges have arisen over the years that require urgent attention. Among these, coping with faults in edge and cloud computing and recovering from permanent and temporary faults have become prominent issues. In this paper, we focus on the challenges of applying fault tolerance techniques to edge and cloud computing in the context of manufacturing, and we survey the current state of the proposed approaches by categorizing them into several groups. Moreover, we identify critical gaps in the research domain and present them as open research directions.
This research was partially sponsored by the Knowledge Foundation (KKS) under the SACSys project, and by the research center XPRES (Excellence in Production Research) - a strategic research area in Sweden.
Copyright information
© 2023 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Al-Dulaimy, A., Ashjaei, M., Behnam, M., Nolte, T., Papadopoulos, A.V. (2023). Fault Tolerance in Cloud Manufacturing: An Overview. In: Taheri, J., Villari, M., Galletta, A. (eds) Mobile Computing, Applications, and Services. MobiCASE 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 495. Springer, Cham. https://doi.org/10.1007/978-3-031-31891-7_7
DOI: https://doi.org/10.1007/978-3-031-31891-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31890-0
Online ISBN: 978-3-031-31891-7
eBook Packages: Computer Science (R0)