Flexible Bloom Filters for Searching Textual Objects

  • Conference paper
Agents and Peer-to-Peer Computing (AP2PC 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5319)

Abstract

Efficient object search mechanisms are essential in large-scale networks. Distributed hash tables (DHTs), a class of peer-to-peer systems, have been studied extensively for this purpose. A DHT lookup is guaranteed to find a desired object if it exists in the network. However, multi-word searches generate a large amount of communication traffic. Many studies have tried to reduce this traffic by using Bloom filters, which are space-efficient probabilistic data structures. When such filters are used, all nodes in a DHT must share a single false positive rate parameter, even though the best false positive rate differs from one node to another. In this paper, we provide a method of determining the best false positive rate, and we introduce a new filter, called a flexible Bloom filter, for which each node can set an approximately optimal false positive rate. Experiments showed that the flexible Bloom filter greatly reduced the traffic.
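
To make the role of the false positive rate concrete, the following is a minimal Python sketch of a standard Bloom filter and of the generic way such a filter can cut traffic in a two-word DHT search. It is not the flexible Bloom filter proposed in the paper, whose construction the abstract does not describe. The names BloomFilter and two_word_search, the use of SHA-1 with double hashing, and the two-node exchange are all illustrative assumptions. The sizing uses the textbook formulas m = -n ln p / (ln 2)^2 bits and k = (m/n) ln 2 hash functions for n items and a target false positive rate p.

```python
import hashlib
import math


class BloomFilter:
    """Standard Bloom filter sized from an item count and a target false positive rate.

    Illustrative sketch only; this is NOT the paper's flexible Bloom filter.
    """

    def __init__(self, expected_items: int, fp_rate: float) -> None:
        if expected_items <= 0 or not 0.0 < fp_rate < 1.0:
            raise ValueError("need expected_items > 0 and 0 < fp_rate < 1")
        # Textbook sizing: m = -n ln p / (ln 2)^2 bits.
        self.num_bits = max(
            1, math.ceil(-expected_items * math.log(fp_rate) / math.log(2) ** 2)
        )
        # Textbook hash count: k = (m / n) ln 2.
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Assumption: derive k bit positions by double hashing one SHA-1 digest.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def two_word_search(postings_a: set, postings_b: set, fp_rate: float):
    """Hypothetical two-node exchange for a query on two keywords.

    Node A holds the object IDs for the first keyword, node B for the second.
    Returns (filter size in bits, exact intersection as a sorted list).
    """
    # Node A sends a filter of its posting list instead of the list itself.
    bloom = BloomFilter(max(1, len(postings_a)), fp_rate)
    for object_id in postings_a:
        bloom.add(object_id)
    # Node B returns only IDs that may be in A's list (false positives possible).
    candidates = [object_id for object_id in postings_b if object_id in bloom]
    # Node A discards the false positives against its real posting list.
    return bloom.num_bits, sorted(set(candidates) & postings_a)


if __name__ == "__main__":
    a = {"doc1", "doc2", "doc3"}
    b = {"doc2", "doc3", "doc4"}
    for p in (0.5, 0.01):
        bits, hits = two_word_search(a, b, p)
        print(f"fp_rate={p}: filter is {bits} bits, matches={hits}")
```

A lower false positive rate makes the filter larger but reduces the number of spurious candidates sent back, so the rate that minimizes total traffic depends on the sizes of the posting lists involved. This is consistent with the abstract's observation that the best rate differs from node to node.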


Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sei, Y., Matsuzaki, K., Honiden, S. (2010). Flexible Bloom Filters for Searching Textual Objects. In: Joseph, S.R.H., Despotovic, Z., Moro, G., Bergamaschi, S. (eds) Agents and Peer-to-Peer Computing. AP2PC 2007. Lecture Notes in Computer Science, vol 5319. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11368-0_9

  • DOI: https://doi.org/10.1007/978-3-642-11368-0_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-11367-3

  • Online ISBN: 978-3-642-11368-0

  • eBook Packages: Computer Science, Computer Science (R0)
