Abstract
Existing distributed stream processing systems generally guarantee fault tolerance by switching to standby machines and reprocessing lost data. In edge computing environments, however, this conventional approach requires duplicating every edge, and the duplication cost increases sharply as the system scales. To solve this problem, we propose an approach that supports approximate fault tolerance without edge duplication. We focus on environmental monitoring applications and exploit the correlation between sensors. In this paper, we assume that each edge estimates missing data from the observed data and aggregates them approximately. We provide a method for estimating the outputs of failed edges that accounts for the uncertainty of the processing results at each edge. Our method allows the server to continue processing without waiting for failed edges to recover. We also demonstrate the validity of our method through experiments on synthetic data.
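As a rough illustration of estimating a missing reading from a correlated one and carrying its uncertainty into an aggregate, the following minimal sketch may help. It is not the paper's method: the abstract does not state the statistical model, so the sketch assumes two sensors whose readings are jointly Gaussian with known means, variances, and covariance. All names (`Estimate`, `condition_on_observed`, `approximate_average`) and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A value with its uncertainty (assumed Gaussian in this sketch)."""
    mean: float
    variance: float


def condition_on_observed(mu_m, mu_o, var_m, var_o, cov_mo, x_o):
    """Estimate a missing reading from one observed, correlated reading.

    Assumption: the two readings are jointly Gaussian. The result is the
    standard conditional mean and variance of the missing reading.
    """
    gain = cov_mo / var_o
    mean = mu_m + gain * (x_o - mu_o)
    variance = var_m - gain * cov_mo
    return Estimate(mean, variance)


def approximate_average(observed, estimates):
    """Average observed readings together with estimated ones.

    Observed readings are treated as exact. Estimation errors are treated
    as independent of each other, which is a simplifying assumption.
    """
    n = len(observed) + len(estimates)
    total = sum(observed) + sum(e.mean for e in estimates)
    variance = sum(e.variance for e in estimates) / (n * n)
    return Estimate(total / n, variance)


# Hypothetical numbers: the failed sensor normally reads about 20, its
# neighbour about 21, both with variance 4 and covariance 3.
est = condition_on_observed(20.0, 21.0, 4.0, 4.0, 3.0, x_o=23.0)
avg = approximate_average([23.0], [est])
print(est)
print(avg)
```

The aggregate carries a variance alongside its mean, so a downstream consumer can tell how much of the result rests on estimated rather than observed data.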
This paper is based on results obtained from a project, JPNP16007, commissioned by the New Energy and Industrial Technology Development Organization (NEDO). This work was also partly supported by KAKENHI (16H01722 and 20K19804).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Takao, D., Sugiura, K., Ishikawa, Y. (2021). Approximate Fault Tolerance for Edge Stream Processing. In: Kotsis, G., et al. Database and Expert Systems Applications - DEXA 2021 Workshops. DEXA 2021. Communications in Computer and Information Science, vol 1479. Springer, Cham. https://doi.org/10.1007/978-3-030-87101-7_17
DOI: https://doi.org/10.1007/978-3-030-87101-7_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87100-0
Online ISBN: 978-3-030-87101-7
eBook Packages: Computer Science, Computer Science (R0)