Abstract
This paper proposes a new distributed data structure, based on binary trees, that supports k-nearest neighbor queries over very large databases. The index is distributed across a network of “peers”: each peer hosts a part of the tree, and nodes communicate by message passing. This approach has two main advantages: it can (i) handle a larger number of nodes and points than a single-peer architecture and (ii) process multiple queries efficiently. In particular, we propose a novel version of the k-nearest neighbor algorithm that can start a query at a randomly chosen peer. Preliminary experiments show that in about 65% of cases, a query that starts at a random node does not involve the peer containing the root of the tree.
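The idea of starting a k-nearest neighbor query away from the root can be illustrated with a small single-process sketch. It is not the authors' algorithm: the abstract only says "binary trees", so the kd-tree-style splitting, the per-node cells, and all names (`Node`, `build`, `knn_from`, and so on) are assumptions made for illustration, and peers and message passing are not modelled. The query first searches the subtree under the start node, then climbs toward the root only while the ball around the query point (with radius equal to the current k-th best distance) is not fully contained in the cell of the subtree already searched.

```python
import heapq
import math
import random

class Node:
    """One tree node; in a distributed setting a peer would host a group of these."""
    def __init__(self, point, axis, lo, hi, parent):
        self.point, self.axis = point, axis
        self.lo, self.hi = lo, hi            # axis-aligned cell covered by this subtree
        self.parent, self.left, self.right = parent, None, None

def build(points, lo, hi, depth=0, parent=None):
    """Build a kd-tree-style binary tree; lo and hi are tuples bounding the cell."""
    if not points:
        return None
    axis = depth % len(lo)
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    node = Node(points[mid], axis, lo, hi, parent)
    split = points[mid][axis]
    left_hi = hi[:axis] + (split,) + hi[axis + 1:]
    right_lo = lo[:axis] + (split,) + lo[axis + 1:]
    node.left = build(points[:mid], lo, left_hi, depth + 1, node)
    node.right = build(points[mid + 1:], right_lo, hi, depth + 1, node)
    return node

def all_nodes(node):
    """Return every node of the subtree rooted at node."""
    if node is None:
        return []
    return [node] + all_nodes(node.left) + all_nodes(node.right)

def dist_to_cell(q, lo, hi):
    """Smallest Euclidean distance from q to the box [lo, hi]."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi)))

def ball_inside(q, r, lo, hi):
    """True if the ball of radius r around q lies entirely inside the box [lo, hi]."""
    return all(l <= x - r and x + r <= h for x, l, h in zip(q, lo, hi))

def offer(p, q, k, best):
    """Keep the k closest points seen so far in a max-heap of (-distance, point)."""
    d = math.dist(p, q)
    if len(best) < k:
        heapq.heappush(best, (-d, p))
    elif d < -best[0][0]:
        heapq.heapreplace(best, (-d, p))

def search_subtree(node, q, k, best):
    """Top-down search of one subtree, skipping cells that cannot improve the result."""
    if node is None:
        return
    if len(best) == k and dist_to_cell(q, node.lo, node.hi) >= -best[0][0]:
        return
    offer(node.point, q, k, best)
    near_first = q[node.axis] < node.point[node.axis]
    children = (node.left, node.right) if near_first else (node.right, node.left)
    for child in children:
        search_subtree(child, q, k, best)

def knn_from(start, q, k):
    """k-NN query (k >= 1) that begins at an arbitrary node and climbs only as needed.

    Returns the sorted (distance, point) pairs and whether the climb reached the root.
    """
    best = []
    search_subtree(start, q, k, best)
    node = start
    while node.parent is not None:
        radius = -best[0][0] if len(best) == k else math.inf
        if ball_inside(q, radius, node.lo, node.hi):
            break                            # nothing outside this cell can be closer
        parent = node.parent
        offer(parent.point, q, k, best)
        sibling = parent.right if node is parent.left else parent.left
        search_subtree(sibling, q, k, best)
        node = parent
    return sorted((-d, p) for d, p in best), node.parent is None

# Usage: random 2-D points, a random start node, and a random query point.
random.seed(0)
pts = [(random.random(), random.random()) for _ in range(200)]
root = build(pts, (-math.inf, -math.inf), (math.inf, math.inf))
neighbours, reached_root = knn_from(random.choice(all_nodes(root)), (0.5, 0.5), 3)
```

The second return value only records whether this toy query had to climb to the root node. It makes no attempt to reproduce the 65% figure reported above, which concerns peers in the authors' distributed setting.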
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Gargiulo, F., Amato, F., Moscato, V., Picariello, A., Sperli’, G. (2016). Nearest Query on Distributed Binary Trees Starting from a Random Node. In: Ngonga Ngomo, AC., Křemen, P. (eds) Knowledge Engineering and Semantic Web. KESW 2016. Communications in Computer and Information Science, vol 649. Springer, Cham. https://doi.org/10.1007/978-3-319-45880-9_20
DOI: https://doi.org/10.1007/978-3-319-45880-9_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45879-3
Online ISBN: 978-3-319-45880-9
eBook Packages: Computer Science, Computer Science (R0)