Composite Distance Transformation for Indexing and k-Nearest-Neighbor Searching in High-Dimensional Spaces

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Because of the well-known curse of dimensionality, search in a high-dimensional space is considered a “hard” problem. In this paper, a novel composite distance transformation method, called CDT, is proposed to support fast k-nearest-neighbor (k-NN) search in high-dimensional spaces. In CDT, all n data points are first grouped into clusters by a k-Means clustering algorithm. Then a composite distance key is computed for each data point. Finally, the index keys of the n data points are inserted into a partition-based B+-tree. Thus, given a query point, its k-NN search in the high-dimensional space is transformed into a search in a single-dimensional space with the aid of the CDT index. Extensive performance studies are conducted to evaluate the effectiveness and efficiency of the proposed scheme. Our results show that this method outperforms state-of-the-art high-dimensional search techniques such as the X-Tree, VA-file, iDistance and NB-Tree.
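The pipeline outlined in the abstract (cluster, map each point to a one-dimensional key, index the keys, answer k-NN queries by one-dimensional range scans) can be illustrated with a minimal Python sketch. The abstract does not give the formula of the composite distance key, so the sketch substitutes the simpler iDistance-style key (cluster id times a stride, plus the distance to the cluster centroid); it is not the CDT key itself. A sorted list searched with `bisect` stands in for the partition-based B+-tree, and all function names and parameters (`kmeans`, `build_index`, `knn`, `r0`) are hypothetical.

```python
import bisect
import math
import random


def kmeans(points, num_clusters, iters=10, seed=0):
    """Plain Lloyd's k-Means; returns the list of centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, num_clusters)]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            j = min(range(num_clusters), key=lambda c: math.dist(p, centroids[c]))
            groups[j].append(p)
        for j, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def build_index(points, centroids):
    """Map each point to a 1-D key; the sorted key list stands in for a B+-tree."""
    tagged = []
    for pid, p in enumerate(points):
        j = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        tagged.append((j, math.dist(p, centroids[j]), pid))
    # Assumed iDistance-style key: cluster id * stride + distance to centroid.
    stride = max(d for _, d, _ in tagged) + 1.0  # keeps cluster key ranges disjoint
    entries = sorted((j * stride + d, pid) for j, d, pid in tagged)
    keys = [key for key, _ in entries]
    ids = [pid for _, pid in entries]
    return keys, ids, stride


def knn(query, k, points, centroids, index, r0=0.05):
    """Exact k-NN via 1-D range scans with a doubling search radius."""
    keys, ids, stride = index
    dq = [math.dist(query, c) for c in centroids]
    r, r_max = r0, max(dq) + stride
    while True:
        found = []
        for j, d in enumerate(dq):
            # Triangle inequality: a point within r of the query has a centroid
            # distance in [d - r, d + r], so only that key range is scanned.
            lo = j * stride + max(0.0, d - r) - 1e-9
            hi = j * stride + min(d + r, stride - 0.5) + 1e-9
            for pos in range(bisect.bisect_left(keys, lo), bisect.bisect_right(keys, hi)):
                pid = ids[pos]
                found.append((math.dist(query, points[pid]), pid))
        found.sort()
        best = found[:k]
        # Stop once the k-th candidate lies inside the searched radius,
        # or once the radius covers every key range.
        if (len(best) == k and best[-1][0] <= r) or r >= r_max:
            return best
        r *= 2
```

A call such as `knn(q, 4, pts, kmeans(pts, 5), build_index(pts, centroids))`, with `centroids` being the same list passed as the fourth argument, returns `(distance, point_id)` pairs sorted by distance. Rescanning from scratch after each radius increase keeps the sketch short; a real index would extend the previous scan ranges instead.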

References

  1. Böhm C, Berchtold S, Keim D A. Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases. ACM Computing Surveys, 2001, 33(3): 322–373.

  2. Guttman A. R-trees: A dynamic index structure for spatial searching. In Proc. the ACM SIGMOD Int. Conf. Management Data, Boston, USA, 1984, pp. 47–54.

  3. Beckmann N, Kriegel H-P, Schneider R, Seeger B. The R*-tree: An efficient and robust access method for points and rectangles. In Proc. ACM SIGMOD Int. Conf. Management Data, Atlantic, USA, 1990, pp. 322–331.

  4. Berchtold S, Keim D A, Kriegel H P. The X-tree: An index structure for high-dimensional data. In Proc. 22nd Int. Conf. Very Large Data Bases, India, 1996, pp. 28–37.

  5. Katayama N, Satoh S. The SR-tree: An index structure for high-dimensional nearest neighbor queries. In Proc. ACM SIGMOD Int. Conf. Management of Data, Arizona, USA, 1997, pp. 32–42.

  6. Weber R, Schek H, Blott S. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In Proc. 24th Int. Conf. Very Large Data Bases, New York, USA, 1998, pp. 194–205.

  7. Berchtold S, Böhm C, Kriegel H P et al. Independent quantization: An index compression technique for high-dimensional data spaces. In Proc. 16th Int. Conf. Data Engineering, San Diego, USA, 2000, pp. 577–588.

  8. Fonseca M J, Jorge J A. Indexing high-dimensional data for content-based retrieval in large databases. In Proc. the 8th Int. Conf. Database Systems for Advanced Applications, Kyoto, Japan, 2003, pp. 267–274.

  9. Jagadish H V, Ooi B C, Tan K L et al. iDistance: An adaptive B+-tree based indexing method for nearest neighbor search. ACM Trans. Data Base Systems, 2005, 30(2): 364–397.

  10. The UCI KDD Archive. http://www.kdd.ics.uci.edu, 2002.

Author information

Corresponding author

Correspondence to Yi Zhuang.

Additional information

Partially supported by the National Natural Science Foundation of China (Grant No. 60533090), National Science Fund for Distinguished Young Scholars (Grant No. 60525108), the National Grand Fundamental Research 973 Program of China (Grant No. 2002CB312101), Science and Technology Project of Zhejiang Province (Grant Nos. 2005C13032, 2005C11001-05) and China-America Academic Digital Library Project (see www.cadal.zju.edu.cn).

About this article

Cite this article

Zhuang, Y., Zhuang, YT. & Wu, F. Composite Distance Transformation for Indexing and k-Nearest-Neighbor Searching in High-Dimensional Spaces. J Comput Sci Technol 22, 208–217 (2007). https://doi.org/10.1007/s11390-007-9027-5

  • DOI: https://doi.org/10.1007/s11390-007-9027-5
