Abstract
Peer-to-peer (P2P) networking continues to gain popularity among computing science researchers. Information retrieval (IR) over P2P networks is an active research problem, with researchers seeking both insight into it and solutions for its successful deployment. All published studies have so far been evaluated by simulation, using well-known document collections (usually acquired from TREC). Researchers test their systems on divided collections whose documents have been distributed in advance to a number of simulated peers. This practice leads to two problems. First, there is little justification for the document distributions used by the relevant studies. Second, since different studies use different experimental testbeds, there is no common ground for comparing the proposed solutions. In this work, we contribute a number of different document testbeds for evaluating P2P IR systems. Each of these has been derived from TREC’s WT10g collection and corresponds to a different potential P2P IR application scenario. We analyse each methodology and testbed with respect to the document distributions achieved and to the location of relevant items within each setting. This work marks the beginning of an effort to provide more realistic evaluation environments for P2P IR systems and to create a common ground for comparing existing and future architectures.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Klampanos, I.A., Poznański, V., Jose, J.M., Dickman, P. (2005). A Suite of Testbeds for the Realistic Evaluation of Peer-to-Peer Information Retrieval Systems. In: Losada, D.E., Fernández-Luna, J.M. (eds) Advances in Information Retrieval. ECIR 2005. Lecture Notes in Computer Science, vol 3408. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31865-1_4
DOI: https://doi.org/10.1007/978-3-540-31865-1_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25295-5
Online ISBN: 978-3-540-31865-1
eBook Packages: Computer Science (R0)