An encoding-based dual distance tree high-dimensional index

Zhuang, Yi; Zhuang, YueTing; Wu, Fei

doi:10.1007/s11432-008-0104-3

Published: 08 August 2008

Science in China Series F: Information Sciences, volume 51, pages 1401–1414 (2008)

Yi Zhuang¹,
YueTing Zhuang² &
Fei Wu²

Abstract

The paper proposes a novel symmetrical encoding-based index structure, called the EDD-tree (encoding-based dual distance tree), to support fast k-nearest neighbor (k-NN) search in high-dimensional spaces. In the EDD-tree, all data points are first grouped into clusters by a k-means clustering algorithm. The uniform ID number of each data point is then obtained by a dual-distance-driven encoding scheme, in which each cluster sphere is partitioned twice, once according to the start-distance and once according to the centroid-distance. Finally, the uniform ID number and the centroid-distance of each data point are combined into a uniform index key, which is indexed by a partition-based B⁺-tree. Thus, given a query point, its k-NN search in a high-dimensional space can be transformed into a search in a one-dimensional space with the aid of the EDD-tree index. Extensive performance studies are conducted to evaluate the effectiveness and efficiency of the proposed scheme, and the results demonstrate that the method outperforms state-of-the-art high-dimensional search techniques such as the X-tree, VA-file, iDistance and NB-tree, especially when the query radius is not very large.
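The key-construction pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the start-distance is taken to be the distance to a fixed start point (the origin by default), each distance is cut into `SLICES` equal-width rings, the uniform ID is a simple positional code over (cluster, start ring, centroid ring), and the key is `uid * SCALE + centroid_distance`. The names `SLICES`, `SCALE`, `build_index` and `range_scan` are hypothetical, centroids are passed in rather than computed by k-means, and a sorted list stands in for the partition-based B⁺-tree.

```python
import bisect
import math

SLICES = 4    # hypothetical: number of rings per distance
SCALE = 10.0  # hypothetical: must exceed every centroid-distance


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def slice_of(d, d_max):
    """Map a distance in [0, d_max] to a ring number in [0, SLICES - 1]."""
    if d_max == 0:
        return 0
    return min(int(d / d_max * SLICES), SLICES - 1)


def build_index(points, centroids, start=None):
    """Return a list of (key, point) pairs sorted by key."""
    if start is None:
        start = tuple(0.0 for _ in centroids[0])  # assumed start point
    # Assign each point to its nearest centroid (the final step of k-means).
    assigned = [
        min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        for p in points
    ]
    # Maxima used to normalise the ring widths.
    max_sd = max(dist(p, start) for p in points)
    max_cd = [0.0] * len(centroids)
    for p, c in zip(points, assigned):
        max_cd[c] = max(max_cd[c], dist(p, centroids[c]))
    entries = []
    for p, c in zip(points, assigned):
        sd = dist(p, start)
        cd = dist(p, centroids[c])
        # Assumed encoding: cluster id, then start ring, then centroid ring.
        uid = (c * SLICES + slice_of(sd, max_sd)) * SLICES + slice_of(cd, max_cd[c])
        # One-dimensional key: ID in the high part, centroid-distance in the low part.
        entries.append((uid * SCALE + cd, p))
    entries.sort()
    return entries


def range_scan(entries, lo, hi):
    """Return all points whose key lies in [lo, hi] (a 1-D range lookup)."""
    keys = [k for k, _ in entries]
    i = bisect.bisect_left(keys, lo)
    j = bisect.bisect_right(keys, hi)
    return [p for _, p in entries[i:j]]
```

With this layout, every key belonging to cluster `c` falls in the interval `[c * SLICES * SLICES * SCALE, (c + 1) * SLICES * SLICES * SCALE)`, so a one-dimensional range scan can be restricted to a single cluster or to a single ring within it. The sketch covers only key construction and range lookup; it does not implement the k-NN search procedure itself.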

References

Bohm C, Berchtold S, Keim D. Searching in high-dimensional spaces: index structures for improving the performance of multimedia databases. ACM Comput Surv, 2001, 33(3): 322–373
Guttman A. R-tree: a dynamic index structure for spatial searching. In: Proceedings of the ACM SIGMOD International Conference on Management of Data. Boston: ACM Press, 1984. 47–54
Beckmann N, Kriegel H P, Schneider R, et al. The R*-tree: an efficient and robust access method for points and rectangles. In: Proceedings of ACM SIGMOD International Conference on Management of Data. Atlantic City: SIGMOD Record, 1990, 19(2). 322–331
Berchtold S, Keim D A, Kriegel H P. The X-tree: an index structure for high-dimensional data. In: Proceedings of the 22th International Conference on Very Large Data Bases. India: Morgan Kaufmann, 1996. 28–37
Weber R, Schek H, Blott S. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Proceedings of the 24th International Conference on Very Large Data Bases. New York: Morgan Kaufmann Publishers, 1998. 194–205
Berchtold S, Bohm C, Kriegel H P, et al. Independent quantization: an index compression technique for high-dimensional data spaces. In: Proceedings of the 16th International Conference on Data Engineering. USA: IEEE Computer Society, 2000. 577–588
Fonseca M J, Jorge J A. NB-Tree: an indexing structure for content-based retrieval in large databases. In: Proceedings of the 8th International Conference on Database Systems for Advanced Applications. Kyoto: IEEE Computer Society, 2003. 267–274
Jagadish H V, Ooi B C, Tan K L, et al. iDistance: an adaptive B⁺-tree based indexing method for nearest neighbor search. ACM Trans Database Syst, 2005, 30(2): 364–397

Author information

Authors and Affiliations

College of Computer Science & Information Engineering, Zhejiang Gongshang University, Hangzhou, 310018, China
Yi Zhuang
College of Computer Science, Zhejiang University, Hangzhou, 310026, China
YueTing Zhuang & Fei Wu

Corresponding author

Correspondence to YueTing Zhuang.

Additional information

Supported by the key program of the National Natural Science Foundation of China (Grant No. 60533090), the National Natural Science Fund for Distinguished Young Scholars (Grant No. 60525108), and the China-America Academic Digital Library Project.

About this article

Cite this article

Zhuang, Y., Zhuang, Y. & Wu, F. An encoding-based dual distance tree high-dimensional index. Sci. China Ser. F-Inf. Sci. 51, 1401–1414 (2008). https://doi.org/10.1007/s11432-008-0104-3

Received: 24 October 2006
Accepted: 30 November 2007
Published: 08 August 2008
Issue Date: October 2008
DOI: https://doi.org/10.1007/s11432-008-0104-3
