Nearest Query on Distributed Binary Trees Starting from a Random Node

  • Conference paper
Knowledge Engineering and Semantic Web (KESW 2016)

Abstract

This paper proposes a new distributed data structure based on binary trees to support k-nearest neighbor queries over very large databases. The indexing structure is distributed across a network of “peers”, each of which hosts a part of the tree, and nodes communicate by message passing. This approach has two main advantages: (i) it can handle a larger number of nodes and points than a single-peer architecture, and (ii) it can process multiple queries efficiently. In particular, we propose a novel version of the k-nearest neighbor algorithm that can start a query at a randomly chosen peer. Preliminary experiments show that in about 65% of cases a query that starts at a random node does not involve the peer containing the root of the tree.
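To make the idea of starting a query away from the root concrete, the following is a minimal single-process sketch, not the authors' algorithm, which the abstract does not specify. It assumes a k-d tree as the binary tree, stores a bounding cell on every node, and uses a hypothetical `peer` label in place of real message passing: the top `split_depth` levels are assigned to peer 0 and each subtree below them to its own peer. The search descends from the start node, then climbs toward the root only until the ball around the query that holds the current k best points fits inside the current node's cell. All names (`Node`, `build`, `knn_from`, `split_depth`) are illustrative assumptions.

```python
import heapq
import random

INF = float("inf")

class Node:
    def __init__(self, point, axis, lo, hi, peer):
        self.point, self.axis = point, axis
        self.lo, self.hi = lo, hi   # bounding cell of this subtree
        self.peer = peer            # hypothetical id of the hosting peer
        self.parent = self.left = self.right = None

def build(points, lo, hi, depth=0, peer=0, split_depth=2, ids=None):
    """Build a k-d tree; subtrees rooted at split_depth get their own peer id."""
    if not points:
        return None
    ids = ids if ids is not None else [0]
    if depth == split_depth:
        ids[0] += 1
        peer = ids[0]
    axis = depth % len(lo)
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    split = points[mid][axis]
    node = Node(points[mid], axis, lo, hi, peer)
    left_hi = hi[:axis] + (split,) + hi[axis + 1:]
    right_lo = lo[:axis] + (split,) + lo[axis + 1:]
    node.left = build(points[:mid], lo, left_hi, depth + 1, peer, split_depth, ids)
    node.right = build(points[mid + 1:], right_lo, hi, depth + 1, peer, split_depth, ids)
    for child in (node.left, node.right):
        if child is not None:
            child.parent = node
    return node

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cell_dist2(node, q):
    """Squared distance from q to the bounding cell of node (0 if q is inside)."""
    return sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, node.lo, node.hi))

def ball_inside_cell(node, q, r2):
    """True if the ball of squared radius r2 around q lies within node's cell."""
    return all(l <= x <= h and (x - l) ** 2 >= r2 and (h - x) ** 2 >= r2
               for x, l, h in zip(q, node.lo, node.hi))

def offer(best, k, q, point):
    """Keep the k closest points in a max-heap of (-squared distance, point)."""
    d = dist2(point, q)
    if len(best) < k:
        heapq.heappush(best, (-d, point))
    elif d < -best[0][0]:
        heapq.heapreplace(best, (-d, point))

def search_down(node, q, k, best, visited):
    """Search a subtree, skipping cells that cannot improve the current k best."""
    if node is None or (len(best) == k and cell_dist2(node, q) >= -best[0][0]):
        return
    visited.add(node.peer)
    offer(best, k, q, node.point)
    children = [c for c in (node.left, node.right) if c is not None]
    for child in sorted(children, key=lambda c: cell_dist2(c, q)):
        search_down(child, q, k, best, visited)

def knn_from(start, q, k):
    """k-NN query started at an arbitrary node; returns (results, peers touched)."""
    best, visited = [], set()
    search_down(start, q, k, best, visited)
    node = start
    while node.parent is not None and not (
            len(best) == k and ball_inside_cell(node, q, -best[0][0])):
        parent = node.parent
        visited.add(parent.peer)
        offer(best, k, q, parent.point)
        sibling = parent.right if node is parent.left else parent.left
        search_down(sibling, q, k, best, visited)
        node = parent
    return sorted((-nd, p) for nd, p in best), visited

def all_nodes(node):
    return [] if node is None else [node] + all_nodes(node.left) + all_nodes(node.right)

if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.random(), rng.random()) for _ in range(1000)]
    root = build(pts, (-INF, -INF), (INF, INF))
    nodes = all_nodes(root)
    runs = [knn_from(rng.choice(nodes), (rng.random(), rng.random()), 5)[1]
            for _ in range(200)]
    print("queries avoiding the root peer:", sum(0 not in v for v in runs) / len(runs))
```

The fraction printed by the demo depends entirely on the synthetic data and on the assumed peer assignment, so it should not be compared with the figure reported in the abstract. The stopping test is the design choice that matters: without per-node bounding cells, a query started at a random node would have to climb all the way to the root to be sure no closer point exists elsewhere.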

Author information

Corresponding author

Correspondence to Francesco Gargiulo.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Gargiulo, F., Amato, F., Moscato, V., Picariello, A., Sperli’, G. (2016). Nearest Query on Distributed Binary Trees Starting from a Random Node. In: Ngonga Ngomo, AC., Křemen, P. (eds) Knowledge Engineering and Semantic Web. KESW 2016. Communications in Computer and Information Science, vol 649. Springer, Cham. https://doi.org/10.1007/978-3-319-45880-9_20

  • DOI: https://doi.org/10.1007/978-3-319-45880-9_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45879-3

  • Online ISBN: 978-3-319-45880-9

  • eBook Packages: Computer Science, Computer Science (R0)
