Abstract
The learned Bloom filter (LBF) has been proposed in recent work as a replacement for the traditional Bloom filter (BF). It can reduce memory consumption while achieving a relatively low false positive rate (FPR). However, the LBF does not provide a good solution for multi-dimensional data, such as spatial data. In this paper, we present a learned prefix Bloom filter (LPBF) for spatial data, which supports deletion and expansion and achieves a lower FPR and lower memory usage than the classical BF. To our knowledge, this is the first LBF method for spatial data. Specifically, a Z-order space-filling curve is used to map the spatial data to one-dimensional binary codes. Codes that share the same prefix are assigned to the same sub-LBF, so each sub-LBF only needs to learn the suffixes under its prefix, which reduces the learning complexity of the LBF. We further use a perfect hash table to accelerate the filter and reduce the FPR. Compared with two traditional BF methods and two state-of-the-art LBF methods on real spatial data sets, the proposed LPBF achieves the lowest FPR, indicating that the LPBF has great potential as a Bloom filter for spatial data.
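The Z-order mapping and prefix grouping described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 16-bit grid resolution, the 8-bit prefix length, and the longitude/latitude bounds are all assumptions chosen for illustration, and the learned model and perfect hash table are omitted. The sketch only shows how a 2D point becomes a one-dimensional binary code and how codes are split into a shared prefix and a per-group suffix.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def quantize(value: float, lo: float, hi: float, bits: int = 16) -> int:
    """Map a value in [lo, hi] to an integer grid cell in [0, 2**bits - 1]."""
    cells = (1 << bits) - 1
    ratio = (value - lo) / (hi - lo)
    return min(cells, max(0, int(ratio * cells)))


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into a Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return code


def split_code(code: int, total_bits: int, prefix_bits: int) -> Tuple[int, int]:
    """Split a code into (prefix, suffix); prefix_bits is an assumed parameter."""
    suffix_bits = total_bits - prefix_bits
    return code >> suffix_bits, code & ((1 << suffix_bits) - 1)


def group_by_prefix(
    points: Iterable[Tuple[float, float]], bits: int = 16, prefix_bits: int = 8
) -> Dict[int, List[int]]:
    """Group the suffixes of (lon, lat) points by their Z-order prefix.

    In a prefix-based design, each group would be handled by its own
    sub-filter; here the groups are plain lists of suffixes.
    """
    groups: Dict[int, List[int]] = defaultdict(list)
    for lon, lat in points:
        x = quantize(lon, -180.0, 180.0, bits)
        y = quantize(lat, -90.0, 90.0, bits)
        code = interleave_bits(x, y, bits)
        prefix, suffix = split_code(code, 2 * bits, prefix_bits)
        groups[prefix].append(suffix)
    return dict(groups)
```

Because the Z-order curve places the most significant bits of both coordinates at the front of the code, nearby points tend to share a prefix and therefore land in the same group, which is what lets each sub-filter work on the shorter suffixes only.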
Acknowledgements
This work is supported in part by the National Key R&D Program of China (2018AAA0102100), the Scientific and Technological Innovation Leading Plan of High-tech Industry of Hunan Province (2020GK2021), the National Natural Science Foundation of China (61902434), and the International Science and Technology Innovation Joint Base of Machine Vision and Medical Image Processing in Hunan Province (2021CB1013).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zou, B., Zeng, M., Zhu, C., Xiao, L., Chen, Z. (2022). A Learned Prefix Bloom Filter for Spatial Data. In: Strauss, C., Cuzzocrea, A., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2022. Lecture Notes in Computer Science, vol 13426. Springer, Cham. https://doi.org/10.1007/978-3-031-12423-5_26
DOI: https://doi.org/10.1007/978-3-031-12423-5_26
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-12422-8
Online ISBN: 978-3-031-12423-5
eBook Packages: Computer Science, Computer Science (R0)