Abstract
Compression algorithms can drastically reduce the volume of spatiotemporal big data. However, many lossy compression techniques are poorly suited to this task because of their inherently random nature: they often damage scientific data in unpredictable ways, which makes them unsuitable for data analysis and visualization that require a certain precision. In this paper, we propose a tree-based indexing method that uses the Hilbert curve. The key idea is to divide the space into minimum bounding rectangles (MBRs) according to the similarity of the data. Given a maximum acceptable error, our algorithm selects appropriate MBRs and replaces the original data in each selected MBR with the average of the values it contains, thereby compressing the data. We present the corresponding tree construction algorithm and range query processing algorithm for this index structure. Experimental results show that our method outperforms the traditional quadrant-based minimum bounding rectangle tree.
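The idea of replacing each selected rectangle with its average under an error bound can be illustrated with a small sketch. This is not the construction algorithm from the paper, which the abstract does not detail. It assumes a square 2D grid whose side is a power of two, uses plain quadrant splitting as a stand-in for the similarity-based partitioning, and orders the resulting blocks by the Hilbert index of their corner cell. The names `hilbert_index`, `compress`, and `decompress` are hypothetical.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) on the Hilbert curve of an n-by-n grid (n a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the lower-order bits follow the curve.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def compress(grid, max_error):
    """Split the grid until every block deviates from its mean by at most max_error.

    Returns a list of (hilbert_key, x0, y0, size, mean) tuples sorted by key.
    Assumption: quadrant splitting stands in for the paper's partitioning rule.
    """
    n = len(grid)
    leaves = []

    def visit(x0, y0, size):
        vals = [grid[y][x]
                for y in range(y0, y0 + size)
                for x in range(x0, x0 + size)]
        mean = sum(vals) / len(vals)
        if size == 1 or max(abs(v - mean) for v in vals) <= max_error:
            leaves.append((hilbert_index(n, x0, y0), x0, y0, size, mean))
            return
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                visit(x0 + dx, y0 + dy, half)

    visit(0, 0, n)
    leaves.sort()
    return leaves


def decompress(leaves, n):
    """Rebuild an n-by-n grid in which every block is filled with its stored mean."""
    out = [[0.0] * n for _ in range(n)]
    for _, x0, y0, size, mean in leaves:
        for y in range(y0, y0 + size):
            for x in range(x0, x0 + size):
                out[y][x] = mean
    return out
```

Calling `compress(grid, max_error)` and then `decompress(leaves, len(grid))` yields a grid in which no cell differs from its original value by more than `max_error`. A looser bound lets larger blocks be accepted, so fewer means need to be stored.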
The authors extend their appreciation to the National Key Research and Development Program of China (International Technology Cooperation Project No. 2021YFE014400) and the National Science Foundation of China (No. 42175194) for funding this work.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, Z., Guan, R., Pan, X., Song, B., Zhang, X., Tian, Y. (2023). Efficient Spatiotemporal Big Data Indexing Algorithm with Loss Control. In: Tian, Y., Ma, T., Jiang, Q., Liu, Q., Khan, M.K. (eds) Big Data and Security. ICBDS 2022. Communications in Computer and Information Science, vol 1796. Springer, Singapore. https://doi.org/10.1007/978-981-99-3300-6_37
DOI: https://doi.org/10.1007/978-981-99-3300-6_37
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-3299-3
Online ISBN: 978-981-99-3300-6
eBook Packages: Computer Science, Computer Science (R0)