The MINERVA Project: Towards Collaborative Search in Digital Libraries Using Peer-to-Peer Technology

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3664)

Abstract

We consider the problem of collaborative search across a large number of digital libraries and query routing strategies in a peer-to-peer (P2P) environment. Both digital libraries and users are equally viewed as peers and, thus, as part of the P2P network. Our system provides a versatile platform for a scalable search engine combining local index structures of autonomous peers with a global directory based on a distributed hash table (DHT) as an overlay network. Experiments with the MINERVA prototype testbed study the benefits and costs of P2P search for keyword queries.
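
To make the architecture in the abstract concrete, the following is a minimal sketch of how local peer indexes and a DHT-style global directory could fit together. It is not the MINERVA implementation: the names `ToyDirectory`, `Peer`, and `route_query` are hypothetical, the "DHT" is simulated in one process by hashing each term onto a fixed set of directory nodes, and ranking peers by summed document frequency is an assumed placeholder for a real query routing strategy.

```python
import hashlib
from collections import defaultdict


class ToyDirectory:
    """In-process stand-in for a DHT: each term hashes to one directory node."""

    def __init__(self, num_nodes=4):
        # One mapping per simulated node: term -> {peer_id: document frequency}
        self.nodes = [defaultdict(dict) for _ in range(num_nodes)]

    def _node_for(self, term):
        digest = hashlib.sha1(term.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def publish(self, term, peer_id, doc_freq):
        self._node_for(term)[term][peer_id] = doc_freq

    def lookup(self, term):
        return dict(self._node_for(term).get(term, {}))


class Peer:
    """An autonomous peer that keeps its own local inverted index."""

    def __init__(self, peer_id, documents):
        self.peer_id = peer_id
        self.index = defaultdict(set)  # term -> ids of local documents
        for doc_id, text in documents.items():
            for term in text.lower().split():
                self.index[term].add(doc_id)

    def publish_to(self, directory):
        # Only per-term summaries go to the directory, never the documents.
        for term, doc_ids in self.index.items():
            directory.publish(term, self.peer_id, len(doc_ids))

    def search(self, terms):
        # Score each local document by the number of query terms it contains.
        scores = defaultdict(int)
        for term in terms:
            for doc_id in self.index.get(term, ()):
                scores[doc_id] += 1
        return dict(scores)


def route_query(directory, terms, max_peers=2):
    """Pick peers for a keyword query (assumed strategy: summed doc frequency)."""
    totals = defaultdict(int)
    for term in terms:
        for peer_id, doc_freq in directory.lookup(term).items():
            totals[peer_id] += doc_freq
    ranked = sorted(totals, key=lambda p: (-totals[p], p))
    return ranked[:max_peers]


# Hypothetical example data.
peers = {
    "A": Peer("A", {"a1": "peer to peer search", "a2": "distributed hash table"}),
    "B": Peer("B", {"b1": "digital library search", "b2": "library search engine"}),
    "C": Peer("C", {"c1": "roman goddess of wisdom"}),
}
directory = ToyDirectory()
for peer in peers.values():
    peer.publish_to(directory)

selected = route_query(directory, ["library", "search"])
results = {pid: peers[pid].search(["library", "search"]) for pid in selected}
```

In this sketch the directory holds only compact per-term statistics, so each query costs one directory lookup per keyword plus one local search per selected peer. Those are the kinds of costs a P2P search experiment would weigh against result quality.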

Minerva is the Roman goddess of science, wisdom, and learning.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bender, M., Michel, S., Zimmer, C., Weikum, G. (2005). The MINERVA Project: Towards Collaborative Search in Digital Libraries Using Peer-to-Peer Technology. In: Türker, C., Agosti, M., Schek, H.-J. (eds) Peer-to-Peer, Grid, and Service-Orientation in Digital Library Architectures. Lecture Notes in Computer Science, vol 3664. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11549819_6

  • DOI: https://doi.org/10.1007/11549819_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28711-7

  • Online ISBN: 978-3-540-28712-4

  • eBook Packages: Computer Science, Computer Science (R0)
