Abstract
In recent decades, retrieval systems deployed over peer-to-peer (P2P) overlay networks have been investigated as an alternative to centralised search engines. Although modern search engines provide efficient document retrieval, they have several drawbacks. P2P Information Retrieval (P2PIR) systems aim to alleviate these drawbacks: users and creators of web content in such networks have full control over what information they share and how they share it. In the proposed semi-structured P2P architecture, similar documents within a peer are grouped together, often using clustering techniques, and willing peers are promoted to super peers (or hubs) that route queries to the peers holding relevant content. However, no systematic evaluation of such architectures has been performed. In this paper, we study the performance of three cluster-based semi-structured P2PIR models and examine the effect of several important design considerations and parameters on retrieval performance, as well as the robustness of these types of network.
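To make the routing idea concrete, the following is a minimal sketch of how a super peer might direct a query to the peer whose content best matches it. It is an illustration under stated assumptions, not the models evaluated in the paper: the `SuperPeer` class, the `centroid` and `cosine` helpers, the peer names, and the use of raw term counts with cosine similarity are all hypothetical choices, and each peer is simplified to a single cluster.

```python
import math
from collections import Counter


def centroid(docs):
    """Sum term counts over a peer's documents (an unnormalised centroid)."""
    total = Counter()
    for doc in docs:
        total.update(doc.lower().split())
    return total


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SuperPeer:
    """Hypothetical hub: stores one descriptor per peer and routes queries."""

    def __init__(self):
        self.descriptors = {}  # peer id -> centroid of that peer's documents

    def register(self, peer_id, docs):
        # Assumption: each peer publishes a single cluster descriptor.
        self.descriptors[peer_id] = centroid(docs)

    def route(self, query, k=1):
        # Rank peers by similarity between the query and their descriptor.
        q = Counter(query.lower().split())
        ranked = sorted(
            self.descriptors,
            key=lambda p: cosine(q, self.descriptors[p]),
            reverse=True,
        )
        return ranked[:k]


hub = SuperPeer()
hub.register("peer_a", ["peer to peer overlay networks",
                        "query routing in overlay networks"])
hub.register("peer_b", ["document clustering techniques",
                        "clustering text documents"])
selected = hub.route("clustering documents", k=1)
```

Here the query shares terms only with the documents held by `peer_b`, so the hub forwards it there. A real system would use a proper resource-selection method and several clusters per peer; the sketch only shows where the routing decision sits in the architecture.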
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Alkhawaldeh, R.S., Jose, J.M. (2015). Experimental Study on Semi-structured Peer-to-Peer Information Retrieval Network. In: Mothe, J., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2015. Lecture Notes in Computer Science, vol 9283. Springer, Cham. https://doi.org/10.1007/978-3-319-24027-5_1
DOI: https://doi.org/10.1007/978-3-319-24027-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24026-8
Online ISBN: 978-3-319-24027-5
eBook Packages: Computer Science, Computer Science (R0)