Abstract
In real applications, a time series is often repaired several times with different cleaning strategies to obtain better results. The resulting versions of the repaired time series, along with the raw time series, are usually stored directly in the system for users. However, as the scale of data explodes, the high storage cost becomes a non-negligible problem. To address this problem, we propose RpDelta, a storage strategy for repaired time series under which a repaired time series is represented as the combination of the raw time series and a differential file, so that storage space is used more efficiently. We also design a sequential reading strategy based on a finite state machine to make RpDelta suitable for practical use; it introduces almost no additional time or space overhead. Taking the UCR-Suite algorithm as an example, we further present optimizations for a simultaneous-operation scenario that exploit RpDelta’s properties. Extensive experiments show the effectiveness and efficiency of our work.
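The idea of storing a repaired series as the raw series plus a differential file can be illustrated with a minimal sketch. This is not the RpDelta file format or the paper's finite-state-machine reader: the names `make_delta` and `read_repaired`, the `(index, value)` delta layout, and the assumption that a repair changes values but not the series length are all hypothetical choices made for illustration. The reader here is a simple two-pointer merge that streams the repaired series in one pass.

```python
from typing import Iterator, List, Tuple

# Hypothetical delta layout: (index, repaired value) pairs sorted by index.
Delta = List[Tuple[int, float]]


def make_delta(raw: List[float], repaired: List[float]) -> Delta:
    """Keep only the positions where the repaired value differs from the raw one."""
    if len(raw) != len(repaired):
        # Assumption of this sketch: repairs modify values, not the length.
        raise ValueError("raw and repaired series must have the same length")
    return [(i, r) for i, (x, r) in enumerate(zip(raw, repaired)) if x != r]


def read_repaired(raw: List[float], delta: Delta) -> Iterator[float]:
    """Stream the repaired series in a single sequential pass over raw and delta."""
    j = 0  # position of the next unread delta entry
    for i, x in enumerate(raw):
        if j < len(delta) and delta[j][0] == i:
            yield delta[j][1]  # this point was repaired: emit the stored value
            j += 1
        else:
            yield x  # this point was untouched: emit the raw value


raw = [1.0, 2.0, 50.0, 4.0, 5.0]
repaired = [1.0, 2.0, 3.0, 4.0, 5.0]  # one outlier repaired at index 2
delta = make_delta(raw, repaired)
restored = list(read_repaired(raw, delta))
```

In this sketch the delta holds one entry per repaired point, so when a cleaning strategy changes only a small fraction of the series, each additional version costs far less than a full copy.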
Acknowledgements
The authors would like to thank all the anonymous reviewers for their insightful comments and suggestions. This work was supported by the National Key R&D Program of China (No. 2021YFB3300502).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Han, X., Ye, F., He, Z., Wang, X.S., Song, Y., Liu, C. (2023). RpDelta: Supporting UCR-Suite on Multi-versioning Time Series Data. In: Wang, X., et al. Database Systems for Advanced Applications. DASFAA 2023. Lecture Notes in Computer Science, vol 13943. Springer, Cham. https://doi.org/10.1007/978-3-031-30637-2_14
DOI: https://doi.org/10.1007/978-3-031-30637-2_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30636-5
Online ISBN: 978-3-031-30637-2
eBook Packages: Computer Science, Computer Science (R0)