Abstract
Integrating Internet of Things (IoT) capabilities that sense real-time conditions in the physical environment can make traditional Business Process Management (BPM) more flexible and adaptive. However, integrating BPM and IoT faces three kinds of mismatch: in programming mechanisms, in resource management mechanisms, and in adaptive mechanisms. This research treats IoT service-based technology as an effective way to integrate BPM and IoT. An IoT service must be calculable, composable, bindable, and fault-tolerant. When IoT services run on Apache Flink, the native fault tolerance mechanism may not meet their fault tolerance needs, because the data rates of IoT service data sources fluctuate rapidly. In addition, traditional static checkpoint fault-tolerant mechanisms may not strike the best balance between runtime overhead and recovery delay. This paper proposes an on-demand dynamic checkpoint fault-tolerant method that calculates the recovery delay in real time from the data fluctuation rate and actively triggers a checkpoint operation when the user-defined threshold is reached. Experiments show that the proposed method improves system efficiency by up to 11.9% compared with the static checkpoint mechanism.
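The threshold-triggered idea described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's implementation: the delay model (a fixed restart cost plus the time to replay records received since the last checkpoint) and the names `DynamicCheckpointTrigger`, `replay_rate`, `restart_cost_s`, and `simulate` are hypothetical, and the code does not call any Flink API.

```python
from dataclasses import dataclass


@dataclass
class DynamicCheckpointTrigger:
    """Illustrative on-demand trigger; the delay model is an assumption."""

    threshold_s: float            # user-defined bound on recovery delay (seconds)
    replay_rate: float            # assumed records/second replayed during recovery
    restart_cost_s: float = 0.0   # assumed fixed cost to restart and restore state
    pending_records: float = 0.0  # records received since the last checkpoint

    def estimated_recovery_delay(self) -> float:
        # Assumed model: restart cost plus time to reprocess pending records.
        return self.restart_cost_s + self.pending_records / self.replay_rate

    def observe(self, input_rate: float, dt_s: float) -> bool:
        """Account for dt_s seconds of input; return True if a checkpoint fires."""
        self.pending_records += input_rate * dt_s
        if self.estimated_recovery_delay() >= self.threshold_s:
            # A checkpoint resets the amount of work that would need replaying.
            self.pending_records = 0.0
            return True
        return False


def simulate(rates, trigger, dt_s=1.0):
    """Return the time steps at which the trigger would request a checkpoint."""
    return [step for step, rate in enumerate(rates) if trigger.observe(rate, dt_s)]


if __name__ == "__main__":
    trigger = DynamicCheckpointTrigger(threshold_s=2.0, replay_rate=100.0)
    # Steady input followed by a burst at step 4.
    print(simulate([50, 50, 50, 50, 400, 50], trigger))
```

Under this assumed model, steady input lets the estimate build up over several steps before a checkpoint is requested, while a burst pushes the estimate over the threshold within a single step. A fixed-interval scheme would not react to the burst until its next scheduled checkpoint.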
Acknowledgement
This work is supported by the International Cooperation and Exchange Program of National Natural Science Foundation of China (No. 62061136006).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Bai, W., Fang, J., Chang, W. (2023). IoT Service Runtime Fault Tolerance Mechanism Based on Flink Dynamic Checkpoint. In: Wang, Z., Wang, S., Xu, H. (eds) Service Science. ICSS 2023. Communications in Computer and Information Science, vol 1844. Springer, Singapore. https://doi.org/10.1007/978-981-99-4402-6_7
DOI: https://doi.org/10.1007/978-981-99-4402-6_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4401-9
Online ISBN: 978-981-99-4402-6
eBook Packages: Computer Science, Computer Science (R0)