Abstract
Peer-to-peer systems constitute a promising solution for deploying novel applications, such as distributed image retrieval. Efficient search over widely distributed multimedia content requires techniques for distributed retrieval based on generic metric distance functions. In this paper, we propose a framework for distributed metric-based similarity search, where each participating peer stores its own data autonomously. To establish a scalable and efficient search mechanism, we adopt a super-peer architecture, where super-peers are responsible for query routing. We propose the construction of metric routing indices suitable for distributed similarity search in metric spaces. Furthermore, we present a query routing algorithm that exploits pruning techniques to selectively direct queries to super-peers and peers with relevant data. We study the performance of the proposed framework using both synthetic and real data, and demonstrate its scalability over a wide range of experimental setups.
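The pruning idea behind such routing can be illustrated with a minimal sketch. The abstract does not specify the layout of the metric routing indices, so the code below assumes each routing entry summarises the data behind a peer or super-peer as a ball (a representative object plus a covering radius), in the spirit of M-tree-like structures. The names `RoutingEntry`, `route_range_query`, and the peer identifiers are hypothetical. By the triangle inequality, a range query with centre q and radius r cannot match any object behind an entry with centre c and covering radius R when d(q, c) > r + R, so that entry can be skipped.

```python
import math
from dataclasses import dataclass


@dataclass
class RoutingEntry:
    # Hypothetical routing-index entry; the chapter's actual layout may differ.
    target: str     # id of a peer or neighbouring super-peer (assumed)
    center: tuple   # representative object summarising the data behind target
    radius: float   # covering radius: max distance from center to any summarised object


def euclidean(a, b):
    # Euclidean distance is one example of a metric; any metric could be passed instead.
    return math.dist(a, b)


def route_range_query(entries, query, r, dist=euclidean):
    """Return targets whose summarised ball may intersect the query ball.

    An entry is pruned when dist(query, center) > r + radius, because the
    triangle inequality then rules out any object within distance r of query.
    """
    return [e.target for e in entries if dist(query, e.center) <= r + e.radius]


entries = [
    RoutingEntry("peer-A", (0.0, 0.0), 1.0),
    RoutingEntry("peer-B", (5.0, 5.0), 1.5),
    RoutingEntry("super-peer-2", (10.0, 0.0), 2.0),
]

# Only entries that may hold results receive the query.
relevant = route_range_query(entries, (4.0, 4.0), 1.0)
print(relevant)
```

With these illustrative entries, only `peer-B` survives pruning, so the query is forwarded to a single target rather than flooded to all three.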
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Vlachou, A., Doulkeridis, C., Kotidis, Y. (2012). Metric-Based Similarity Search in Unstructured Peer-to-Peer Systems. In: Hameurlain, A., Küng, J., Wagner, R. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems V. Lecture Notes in Computer Science, vol 7100. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28148-8_2
DOI: https://doi.org/10.1007/978-3-642-28148-8_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28147-1
Online ISBN: 978-3-642-28148-8
eBook Packages: Computer Science (R0)