
A Compact In-memory Index for Managing Set Membership Queries on Streaming Data

  • Conference paper
Big Data Computing and Communications (BigCom 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9784)

Abstract

Membership queries on dynamic sets are essential for applications that generate or process a continuous stream of data items. These applications often need to cache items dynamically and answer membership queries for duplicate detection on unbounded data streams. Three key challenges for the caching mechanism are limited memory space, high precision requirements, and the different priority levels associated with items. In this paper, we propose a compact in-memory index, Bloom Filter Ring (BFR), which is better suited to dynamic caching of items on unbounded data streams. We demonstrate the time complexity and precision of BFR in finite memory space, and prove theoretically that BFR has a higher expected average capacity than Aging Bloom Filter, the current state of the art. Furthermore, we propose Priority-aware BFR (PBFR) to support a membership query scheme that takes the priority levels of items into account. Experimental results show that our algorithms achieve better performance in terms of cache hit ratio and false negative rate.
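
As background for the duplicate-detection setting described in the abstract, the following is a minimal sketch of a classic Bloom filter applied to a stream. It is not the BFR or PBFR construction, which the abstract does not specify. The class name `BloomFilter`, the helper `count_new_items`, and the SHA-256 double-hashing scheme are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter with m bits and k hash positions (not the paper's BFR)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k positions from two 64-bit halves of one SHA-256 digest
        # (double hashing). This hashing scheme is an assumption of the sketch.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly seen" (false positives can occur);
        # False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def count_new_items(stream, bloom: BloomFilter) -> int:
    """Count items the filter reports as unseen, inserting each of them."""
    new_items = 0
    for item in stream:
        if item not in bloom:
            new_items += 1
            bloom.add(item)
    return new_items
```

A plain filter like this one never forgets an item, so on an unbounded stream its bits gradually fill up and the false positive rate rises. That limitation is what motivates aging schemes for dynamic sets such as the ones the abstract compares.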

Y. Wang: This work was supported by the National High Technology Research and Development Program of China (No. 013AA013205, No. 2012AA013001) and the National Natural Science Foundation of China (No. 61271275, No. 61501457).

Author information

Corresponding author

Correspondence to Yong Wang.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, Y., Yun, X., Wang, S., Wang, X. (2016). A Compact In-memory Index for Managing Set Membership Queries on Streaming Data. In: Wang, Y., Yu, G., Zhang, Y., Han, Z., Wang, G. (eds) Big Data Computing and Communications. BigCom 2016. Lecture Notes in Computer Science, vol 9784. Springer, Cham. https://doi.org/10.1007/978-3-319-42553-5_8

  • DOI: https://doi.org/10.1007/978-3-319-42553-5_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42552-8

  • Online ISBN: 978-3-319-42553-5

  • eBook Packages: Computer Science, Computer Science (R0)
