Exploiting Sample-Data Distributions to Reduce the Cost of Nearest-Neighbor Searches with Kd-Trees

Talbert, Doug; Fisher, Doug

doi:10.1007/3-540-48412-4_34

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1642))

Included in the following conference series:

International Symposium on Intelligent Data Analysis

Abstract

We present KD-DT, an algorithm that uses a decision-tree-inspired measure to build a kd-tree for low-cost nearest-neighbor searches. The algorithm starts with a “standard” kd-tree and uses searches over a training set to evaluate and improve the structure of the kd-tree. In particular, the algorithm builds a tree that better ensures that a query and its nearest neighbors will be in the same subtree(s), thus reducing the cost of subsequent searches.
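
To make the notion of search cost concrete, the following is a minimal sketch of the "standard" starting point only: a median-split kd-tree with a nearest-neighbor search that counts how many nodes it visits. It is not the KD-DT algorithm, whose decision-tree-inspired measure is not specified in the abstract. The names `Node`, `build`, `nearest`, and `mean_cost`, and the use of visited-node count as the cost, are assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                     # point stored at this node
    axis: int                        # splitting dimension
    left: Optional["Node"] = None    # points with smaller coordinate on axis
    right: Optional["Node"] = None   # points with larger coordinate on axis


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Standard kd-tree: cycle through the axes and split at the median."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(points[mid], axis,
                build(points[:mid], depth + 1),
                build(points[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(root: Optional[Node], query: Point):
    """Return (nearest point, squared distance, number of nodes visited)."""
    best = [None, float("inf")]
    visited = 0

    def search(node: Optional[Node]) -> None:
        nonlocal visited
        if node is None:
            return
        visited += 1
        d = sq_dist(node.point, query)
        if d < best[1]:
            best[0], best[1] = node.point, d
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        search(near)
        # The far subtree is searched only if the splitting plane is closer
        # than the best match so far; this is where the extra cost arises
        # when a query and its neighbor lie in different subtrees.
        if diff * diff < best[1]:
            search(far)

    search(root)
    return best[0], best[1], visited


def mean_cost(root: Optional[Node], queries: List[Point]) -> float:
    """Average nodes visited per query (assumed cost measure)."""
    return sum(nearest(root, q)[2] for q in queries) / len(queries)


if __name__ == "__main__":
    random.seed(0)
    data = [(random.random(), random.random()) for _ in range(500)]
    training = [(random.random(), random.random()) for _ in range(100)]
    print("mean nodes visited:", mean_cost(build(data), training))
```

Averaging the visited-node count over a set of training queries, as `mean_cost` does, gives one plausible way to evaluate a tree structure against a sample-data distribution; a method in the spirit of the abstract would then alter the splits to lower that figure.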

Author information

Authors and Affiliations

Department of Computer Science, Vanderbilt University, Nashville, TN, 37235, USA
Doug Talbert & Doug Fisher

Editor information

Editors and Affiliations

Department of Mathematics, Imperial College, Huxley Building 180 Queen’s Gate, London, SW7 2BZ, UK
David J. Hand
Leiden Institute for Advanced Computer Science, Leiden University, 2300 RA, Leiden, The Netherlands
Joost N. Kok
Berkeley Initiative in Soft Computing, University of California at Berkeley, 329 Soda Hall, Berkeley, CA, 94720, USA
Michael R. Berthold

About this paper

Cite this paper

Talbert, D., Fisher, D. (1999). Exploiting Sample-Data Distributions to Reduce the Cost of Nearest-Neighbor Searches with Kd-Trees. In: Hand, D.J., Kok, J.N., Berthold, M.R. (eds) Advances in Intelligent Data Analysis. IDA 1999. Lecture Notes in Computer Science, vol 1642. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48412-4_34

DOI: https://doi.org/10.1007/3-540-48412-4_34
Published: 08 July 1999
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66332-4
Online ISBN: 978-3-540-48412-7
eBook Packages: Springer Book Archive
