Abstract
We consider collaborative search across a large number of digital libraries, together with strategies for routing queries among them, in a peer-to-peer (P2P) environment. Digital libraries and users alike are viewed as peers and thus as part of the P2P network. Our system provides a versatile platform for a scalable search engine that combines the local index structures of autonomous peers with a global directory based on a distributed hash table (DHT) as an overlay network. Experiments with the MINERVA prototype testbed study the benefits and costs of P2P search for keyword queries.
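The architecture named in the abstract (local indexes at autonomous peers plus a DHT-based global directory used for query routing) can be illustrated with a small, single-process sketch. This is not the MINERVA implementation: the class names `Peer` and `Network`, the use of SHA-1 to place peers and terms on a hash ring, and the routing rule (rank peers by summed per-term document frequency) are all assumptions made for illustration only.

```python
import hashlib
from bisect import bisect_left
from collections import defaultdict


def ring_hash(key: str) -> int:
    """Map a string to a position on the hash ring (SHA-1 is an illustrative choice)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class Peer:
    """Hypothetical peer: owns a local inverted index and hosts a slice of the directory."""

    def __init__(self, name, documents):
        self.name = name
        self.ring_id = ring_hash(name)
        # Local index: term -> set of ids of local documents containing the term.
        self.local_index = defaultdict(set)
        for doc_id, text in documents.items():
            for term in text.lower().split():
                self.local_index[term].add(doc_id)
        # Directory slice: term -> {peer name: document frequency at that peer}.
        self.directory = defaultdict(dict)

    def search(self, terms):
        """Return {doc_id: number of query terms matched} over the local index."""
        hits = defaultdict(int)
        for term in terms:
            for doc_id in self.local_index.get(term, ()):
                hits[doc_id] += 1
        return dict(hits)


class Network:
    """Hypothetical overlay: assigns each term to one peer via consistent hashing."""

    def __init__(self, peers):
        self.peers = sorted(peers, key=lambda p: p.ring_id)
        self.ring = [p.ring_id for p in self.peers]
        self.by_name = {p.name: p for p in self.peers}

    def responsible_peer(self, term):
        """First peer at or after the term's hash on the ring, wrapping around."""
        i = bisect_left(self.ring, ring_hash(term)) % len(self.peers)
        return self.peers[i]

    def publish(self):
        """Each peer posts per-term summaries (here: document frequency) to the directory."""
        for peer in self.peers:
            for term, docs in peer.local_index.items():
                self.responsible_peer(term).directory[term][peer.name] = len(docs)

    def route(self, terms, k=2):
        """Pick the k peers with the highest summed document frequency for the terms."""
        scores = defaultdict(int)
        for term in terms:
            entries = self.responsible_peer(term).directory.get(term, {})
            for name, doc_freq in entries.items():
                scores[name] += doc_freq
        return sorted(scores, key=lambda n: (-scores[n], n))[:k]

    def query(self, text, k=2):
        """Route the query, run it on the selected peers only, and merge the results."""
        terms = text.lower().split()
        results = {}
        for name in self.route(terms, k):
            for doc_id, count in self.by_name[name].search(terms).items():
                results[(name, doc_id)] = count
        return sorted(results.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data, invented for this example.
net = Network([
    Peer("library-a", {"a1": "peer to peer search", "a2": "distributed hash table"}),
    Peer("library-b", {"b1": "roman history", "b2": "hash table lookup"}),
    Peer("library-c", {"c1": "cooking recipes"}),
])
net.publish()
print(net.route(["hash", "table"]))  # ['library-a', 'library-b']
```

In this sketch only compact per-term summaries travel to the directory, while documents and full index lists stay at the peer that owns them; a query is then forwarded to a few promising peers rather than broadcast to all. A real deployment would replace the in-memory ring with an actual DHT overlay and a more refined peer-selection measure.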
Minerva is the Roman goddess of science, wisdom, and learning.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bender, M., Michel, S., Zimmer, C., Weikum, G. (2005). The MINERVA Project: Towards Collaborative Search in Digital Libraries Using Peer-to-Peer Technology. In: Türker, C., Agosti, M., Schek, HJ. (eds) Peer-to-Peer, Grid, and Service-Orientation in Digital Library Architectures. Lecture Notes in Computer Science, vol 3664. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11549819_6
DOI: https://doi.org/10.1007/11549819_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28711-7
Online ISBN: 978-3-540-28712-4
eBook Packages: Computer Science, Computer Science (R0)