Nearest Query on Distributed Binary Trees Starting from a Random Node

Gargiulo, Francesco; Amato, Flora; Moscato, Vincenzo; Picariello, Antonio; Sperli’, Giancarlo

doi:10.1007/978-3-319-45880-9_20

Part of the book series: Communications in Computer and Information Science (CCIS, volume 649)

Included in the following conference series:

International Conference on Knowledge Engineering and the Semantic Web

Abstract

This paper proposes a new distributed data structure based on binary trees to support k-nearest neighbor queries over very large databases. The indexing structure is distributed across a network of “peers”, each of which hosts a part of the tree, and nodes communicate by message passing. This approach has two main advantages: (i) it can handle a larger number of nodes and points than a single-peer architecture, and (ii) it can process multiple queries efficiently. In particular, we propose a novel version of the k-nearest neighbor algorithm that can start a query at a randomly chosen peer. Preliminary experiments show that in about 65% of cases, a query that starts at a random node does not involve the peer containing the root of the tree.
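
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea and not the authors' method. It assumes a kd-tree as the binary tree, runs in a single process (in a distributed setting each node would live on a peer and each recursive call would be a message), and stops climbing toward the root as soon as the ball around the query with the current k-th best distance fits inside the current node's cell. All names (`Node`, `build`, `knn_from`, and the helpers) are hypothetical.

```python
import heapq

INF = float("inf")

class Node:
    """One kd-tree node; lo/hi bound the cell covered by its subtree."""
    def __init__(self, point, axis, lo, hi, parent):
        self.point, self.axis = point, axis
        self.lo, self.hi, self.parent = lo, hi, parent
        self.left = self.right = None

def build(points, lo=None, hi=None, depth=0, parent=None):
    """Build a kd-tree over equal-length tuples using median splits."""
    if not points:
        return None
    dim = len(points[0])
    lo = lo or (-INF,) * dim
    hi = hi or (INF,) * dim
    axis = depth % dim
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    node = Node(points[mid], axis, lo, hi, parent)
    split = points[mid][axis]
    left_hi = hi[:axis] + (split,) + hi[axis + 1:]
    right_lo = lo[:axis] + (split,) + lo[axis + 1:]
    node.left = build(points[:mid], lo, left_hi, depth + 1, node)
    node.right = build(points[mid + 1:], right_lo, hi, depth + 1, node)
    return node

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def box_dist2(q, lo, hi):
    """Squared distance from q to the closest point of the box [lo, hi]."""
    return sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi))

def offer(heap, k, q, point):
    """Keep the k closest points seen so far in a max-heap keyed on -dist2."""
    d = dist2(q, point)
    if len(heap) < k:
        heapq.heappush(heap, (-d, point))
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, (-d, point))

def search_down(node, q, k, heap, visited):
    """Pruned descent: skip a subtree whose cell cannot improve the result."""
    if node is None:
        return
    if len(heap) == k and box_dist2(q, node.lo, node.hi) >= -heap[0][0]:
        return
    visited.add(node)
    offer(heap, k, q, node.point)
    children = [node.left, node.right]
    if q[node.axis] > node.point[node.axis]:
        children.reverse()  # visit the side containing q first
    for child in children:
        search_down(child, q, k, heap, visited)

def ball_inside(q, r2, lo, hi):
    """True if the ball of squared radius r2 around q lies inside [lo, hi]."""
    return all(l <= x <= h and (x - l) ** 2 >= r2 and (h - x) ** 2 >= r2
               for x, l, h in zip(q, lo, hi))

def knn_from(start, q, k):
    """k-NN query that begins at an arbitrary node instead of the root."""
    heap, visited = [], set()
    search_down(start, q, k, heap, visited)
    node = start
    while node.parent is not None:
        # Nothing outside this cell can be closer: no need to climb further.
        if len(heap) == k and ball_inside(q, -heap[0][0], node.lo, node.hi):
            break
        parent = node.parent
        visited.add(parent)
        offer(heap, k, q, parent.point)
        sibling = parent.right if parent.left is node else parent.left
        search_down(sibling, q, k, heap, visited)
        node = parent
    nearest = [p for _, p in sorted((-neg, p) for neg, p in heap)]
    return nearest, visited
```

Calling `knn_from(start, q, k)` returns the k nearest points, closest first, together with the set of nodes that were touched. Checking whether the root is in that set is one rough way to see how often a query avoids it. The stopping test relies on stored cell bounds, which is a design choice of this sketch rather than something stated in the abstract.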

Author information

Authors and Affiliations

Italian Research Aerospace Centre, University of Naples Federico II, Naples, Italy
Francesco Gargiulo, Flora Amato, Vincenzo Moscato, Antonio Picariello & Giancarlo Sperli’

Corresponding author

Correspondence to Francesco Gargiulo.

Editor information

Editors and Affiliations

Leipzig University, Leipzig, Germany
Axel-Cyrille Ngonga Ngomo
Czech Technical University in Prague, Praha, Czech Republic
Petr Křemen

About this paper

Cite this paper

Gargiulo, F., Amato, F., Moscato, V., Picariello, A., Sperli’, G. (2016). Nearest Query on Distributed Binary Trees Starting from a Random Node. In: Ngonga Ngomo, AC., Křemen, P. (eds) Knowledge Engineering and Semantic Web. KESW 2016. Communications in Computer and Information Science, vol 649. Springer, Cham. https://doi.org/10.1007/978-3-319-45880-9_20

DOI: https://doi.org/10.1007/978-3-319-45880-9_20
Published: 08 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45879-3
Online ISBN: 978-3-319-45880-9
eBook Packages: Computer Science (R0)
