A concurrent k-NN search algorithm for R-tree

Authors:
Jagat Sesh Challa, Poonam Goyal, S. Nikhil, Sundar Balasubramaniam, Navneet Goyal

Advanced Data Analytics and Parallel Technologies Laboratory, Department of Computer Science & Information Systems, Birla Institute of Technology & Science - Pilani, Pilani Campus, Pilani, India

Compute '15: Proceedings of the 8th Annual ACM India Conference, October 2015, Pages 123–128
https://doi.org/10.1145/2835043.2835050

Published: 29 October 2015

ABSTRACT

k-nearest neighbor (k-NN) search is one of the most commonly used queries in database systems. It has applications in domains such as data mining, decision support systems, information retrieval, and multimedia and spatial databases. When k-NN search is performed over large data sets, spatial indexing structures such as R-trees are commonly used to improve query efficiency. The best-first k-NN (BF-kNN) algorithm is the fastest known k-NN algorithm over R-trees. We present CBF-kNN, a concurrent BF-kNN for R-trees, which is, to our knowledge, the first concurrent k-NN algorithm for R-trees. CBF-kNN uses the mound, one of the most efficient concurrent priority queues known. It overcomes the concurrency limitations of priority queues by using a tree-parallel mode of execution. CBF-kNN has an estimated speedup of O(p/k) for p threads. Experimental results on several real datasets show that the speedup achieved in practice is close to this estimate.
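For readers unfamiliar with the sequential baseline, the following is a minimal sketch of best-first k-NN over an R-tree-like hierarchy, driven by a single priority queue ordered by minimum distance to the query. It is not the authors' CBF-kNN: it is single-threaded, uses Python's `heapq` rather than a mound, and the names `Node`, `min_dist_sq`, and `bf_knn` are hypothetical and chosen only for illustration. The tree is assumed to be built by hand, so no insertion or node-splitting logic is shown.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass
class Node:
    """Illustrative R-tree node: a leaf holds points, an internal node holds Nodes."""
    children: list
    is_leaf: bool
    lo: tuple = field(init=False)  # lower corner of the minimum bounding rectangle
    hi: tuple = field(init=False)  # upper corner of the minimum bounding rectangle

    def __post_init__(self):
        if self.is_leaf:
            lows = highs = self.children
        else:
            lows = [c.lo for c in self.children]
            highs = [c.hi for c in self.children]
        self.lo = tuple(min(col) for col in zip(*lows))
        self.hi = tuple(max(col) for col in zip(*highs))


def min_dist_sq(q, lo, hi):
    """Squared minimum distance from point q to the rectangle [lo, hi]."""
    return sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi))


def bf_knn(root, q, k):
    """Return up to k (squared_distance, point) pairs, nearest first."""
    tie = itertools.count()  # tie-breaker so heap entries never compare Node vs. point
    heap = [(min_dist_sq(q, root.lo, root.hi), next(tie), root)]
    result = []
    while heap and len(result) < k:
        dist, _, item = heapq.heappop(heap)
        if isinstance(item, Node):
            # Expand the closest unexplored node and enqueue its entries.
            for child in item.children:
                if item.is_leaf:
                    # A point is treated as a degenerate rectangle.
                    d = min_dist_sq(q, child, child)
                else:
                    d = min_dist_sq(q, child.lo, child.hi)
                heapq.heappush(heap, (d, next(tie), child))
        else:
            # A point popped before every remaining entry is the next nearest.
            result.append((dist, item))
    return result
```

Calling `bf_knn(root, (1.2, 0.9), 3)` on a hand-built tree returns the three nearest points in increasing order of distance. Every pop and push goes through the one shared queue, which is the structure the abstract identifies as limiting concurrency.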

Published in

Compute '15: Proceedings of the 8th Annual ACM India Conference
October 2015
142 pages
ISBN:9781450336505
DOI:10.1145/2835043

Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Data mining
R-tree
best first search
concurrent data structures
k-nearest neighbor search
mounds
priority queues
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 114 of 622 submissions, 18%