Abstract
Most existing peer-to-peer (P2P) systems support name-based retrieval and provide very limited support for full-text search of document contents. In this paper, we present a scheme, TRES-CORE, that supports content-based retrieval. First, we propose a tree structure that organizes data objects, represented as vectors, in the P2P system; the tree is height-balanced, which reduces the time complexity of search. Second, we give a simple strategy for placing the tree's nodes that guarantees both load balancing and fault tolerance. Third, we give an efficient query policy. In addition to a theoretical analysis that proves the correctness of the scheme, we carry out a simulation-based study to evaluate its performance under various scenarios. The study shows that TRES-CORE is more accurate and more efficient than some other schemes for P2P systems.
This work is supported by the National Science Foundation of China (NSFC) under grant No. 60433040 and by China CNGI Projects under grants No. CNGI-04-12-2A and CNGI-04-12-1D.
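The abstract describes organizing vector-format data objects in a height-balanced tree and searching it by content. The sketch below illustrates that general idea only; it is not the TRES-CORE algorithm, whose construction, node placement, and query policy are not given in the abstract. It assumes cosine similarity as the vector measure, a centroid as each internal node's summary, and a greedy single-path descent for queries. The names `Node`, `build`, `search`, and `fanout` are hypothetical and chosen for illustration.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


@dataclass
class Node:
    center: list                                   # summary vector of the subtree
    children: list = field(default_factory=list)   # empty for a leaf
    doc_id: str = ""                               # set only on leaves


def build(docs, fanout=2):
    """Bulk-build a tree bottom-up from a non-empty list of (doc_id, vector).

    Every pass wraps the whole current level in parents, so all leaves end
    up at the same depth (the tree is height-balanced).
    Assumption: documents arrive already ordered so that neighbours are similar.
    """
    level = [Node(center=vec, doc_id=doc_id) for doc_id, vec in docs]
    while len(level) > 1:
        groups = [level[i:i + fanout] for i in range(0, len(level), fanout)]
        level = [Node(center=centroid([c.center for c in g]), children=g)
                 for g in groups]
    return level[0]


def search(root, query):
    """Greedy descent: follow the child whose summary is most similar to the query."""
    node = root
    while node.children:
        node = max(node.children, key=lambda c: cosine(c.center, query))
    return node.doc_id
```

Because every leaf sits at the same depth, a query visits a number of levels that grows logarithmically with the number of documents for a fixed `fanout`. The greedy descent is approximate: it can miss the best match when a neighbouring subtree's centroid is misleading, which is one reason a real scheme needs a more careful query policy.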
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jin, H., Xu, J. (2007). TRES-CORE: Content-Based Retrieval Based on the Balanced Tree in Peer to Peer Systems. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 2007. Lecture Notes in Computer Science, vol 4671. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73940-1_23
DOI: https://doi.org/10.1007/978-3-540-73940-1_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73939-5
Online ISBN: 978-3-540-73940-1
eBook Packages: Computer Science, Computer Science (R0)