Abstract
Membership queries over dynamic sets are essential for applications that generate or process a continuous stream of data items. These applications often need to cache items dynamically and answer membership queries for duplicate detection on unbounded data streams. The caching mechanism faces three key challenges: limited memory space, high precision requirements, and the different priority levels associated with items. In this paper, we propose a compact in-memory index, the Bloom Filter Ring (BFR), which is better suited to dynamic caching of items on unbounded data streams. We analyze the time complexity and precision of BFR in finite memory space, and prove theoretically that BFR has a higher expected average capacity than the Aging Bloom Filter, the current state of the art. Furthermore, we propose Priority-aware BFR (PBFR), which supports a membership query scheme that takes the priority levels of items into account. Experimental results show that our algorithms achieve better performance in terms of cache hit ratio and false negative rate.
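To make the duplicate-detection setting concrete, the following is a minimal Python sketch of a generic ring of Bloom filters with bounded memory. It is not the BFR algorithm from the paper, whose details the abstract does not give. The class names, the `capacity` threshold, the rotation rule (clear the oldest filter when the active one fills up), and the SHA-256 double hashing are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter with num_bits bits and num_hashes positions per item."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)
        self.count = 0  # items inserted since the last clear()

    def _positions(self, item: str) -> list[int]:
        # Double hashing: derive all positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)
        self.count += 1

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )

    def clear(self) -> None:
        self.bits = bytearray(len(self.bits))
        self.count = 0


class RotatingBloomFilters:
    """Hypothetical ring of filters (an assumption, not the paper's BFR).

    New items go into the active filter. When it holds `capacity` items,
    the ring advances and the next (oldest) filter is cleared and reused,
    so memory stays fixed while old items are gradually forgotten.
    """

    def __init__(self, ring_size: int, num_bits: int, num_hashes: int, capacity: int) -> None:
        self.filters = [BloomFilter(num_bits, num_hashes) for _ in range(ring_size)]
        self.capacity = capacity
        self.current = 0

    def seen_before(self, item: str) -> bool:
        """Return True if item is probably a duplicate; otherwise record it."""
        if any(item in f for f in self.filters):
            return True
        active = self.filters[self.current]
        if active.count >= self.capacity:
            # Advance the ring and evict the oldest filter's contents.
            self.current = (self.current + 1) % len(self.filters)
            active = self.filters[self.current]
            active.clear()
        active.add(item)
        return False
```

In this sketch a recently inserted item is always reported as a duplicate, while an item whose filter has been cleared may be reported as new again. That second case is the kind of false negative the abstract's evaluation metric refers to.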
Y. Wang—This work was supported by the National High Technology Research and Development Program of China (No. 013AA013205, No. 2012AA013001), the National Natural Science Foundation of China (No. 61271275, No. 61501457)
References
Lynch, C.: Big data: how do your data grow? Nature 455, 28–29 (2008)
Bhoraskar, R., Gabale, V., Kulkarni, P., et al.: Importance-aware bloom filter for managing set membership queries on streaming data. In: 5th International Conference on Communication Systems and Networks (COMSNETS), pp. 1–10. IEEE (2013)
Fan, L., Cao, P., Almeida, J., et al.: A scalable wide-area web cache sharing protocol. ACM SIGCOMM Comput. Commun. Rev. 28(4), 254–265 (1998)
Bloom, B.H.: Space/time trade-offs in hash coding with allowable errors. Commun. ACM 13(7), 422–426 (1970)
Jin, J., Ahn, S., Oh, H.: A multipath routing protocol based on bloom filter for multi-hop wireless networks. Mob. Inf. Syst. 2016, 1–10 (2016)
Yang, T., Liu, A.X., Shahzad, M., et al.: A shifting bloom filter framework for set queries. Proc. VLDB Endow. 9(5), 408–419 (2015)
Liu, W., Qu, W., Gong, J., et al.: Detection of superpoints using a vector bloom filter. IEEE Trans. Inf. Forensics Secur. 11(3), 514–527 (2016)
Deng, F., Rafiei, D.: Approximately detecting duplicates for streaming data using stable bloom filters. In: Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pp. 25–36. ACM (2006)
Shen, H., Zhang, Y.: Improved approximate detection of duplicates for data streams over sliding windows. J. Comput. Sci. Technol. 23(6), 973–987 (2008)
Guo, D., Wu, J., Chen, H., et al.: Theory and network applications of dynamic bloom filters. In: INFOCOM, pp. 1–12. IEEE (2006)
Asadi, N., Lin, J.: Fast candidate generation for real-time tweet search with bloom filter chains. ACM Trans. Inf. Syst. (TOIS) 31(3), 13 (2013)
Feng, W., Shin, K.G., Kandlur, D.D., et al.: The BLUE active queue management algorithms. IEEE/ACM Trans. Netw. (ToN) 10(4), 513–528 (2002)
Yoon, M.K.: Aging bloom filter with two active buffers for dynamic sets. IEEE Trans. Knowl. Data Eng. 22(1), 134–138 (2010)
Hu, H.S., Zhao, H.W., Mi, F.: L-priorities bloom filter: a new member of the bloom filter family. Int. J. Autom. Comput. 9(2), 171–176 (2012)
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, Y., Yun, X., Wang, S., Wang, X. (2016). A Compact In-memory Index for Managing Set Membership Queries on Streaming Data. In: Wang, Y., Yu, G., Zhang, Y., Han, Z., Wang, G. (eds) Big Data Computing and Communications. BigCom 2016. Lecture Notes in Computer Science, vol 9784. Springer, Cham. https://doi.org/10.1007/978-3-319-42553-5_8
DOI: https://doi.org/10.1007/978-3-319-42553-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42552-8
Online ISBN: 978-3-319-42553-5
eBook Packages: Computer Science, Computer Science (R0)