C²VA: Trim High Dimensional Indexes

Chen, Hanxiong; An, Jiyuan; Furuse, Kazutaka; Ohbo, Nobuo

doi:10.1007/3-540-45703-8_28


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2419)

Included in the following conference series:

International Conference on Web-Age Information Management


Abstract

Classical multi-dimensional indexes are based on data space partitioning. Their effectiveness declines as dimensionality increases, because the number of indexing units grows exponentially with the number of dimensions. In high-dimensional spaces, such index structures become less efficient than a linear scan of all the data. The VA-file, observing that nearest neighbor search approaches linear complexity in high-dimensional spaces, proposed a method of coordinate approximation instead.

In this paper we propose C²VA (Clustered Compact VA) for dimensionality reduction. We find that real datasets are rarely uniformly distributed, although uniformity is the main assumption of the VA-file. Instead of approximating all dimensions, we determine the condition under which less important dimensions can be skipped. This avoids generating a huge index file for a large, high-dimensional dataset and hence saves many I/O accesses during scanning. Moreover, we guarantee that C²VA preserves the precision of the bounds as in the VA-file, which maximizes the efficiency gain. Our experimental results support this claim.
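To make the coordinate approximation mentioned in the abstract concrete, here is a minimal Python sketch of a VA-file-style filter: each coordinate is quantized into a few bits, and the cell boundaries give lower and upper bounds on the distance to a query. It is only an illustration under stated assumptions, not the authors' implementation. The function names, the unit-cube data range, the equal-width cells, and the Euclidean metric are all assumptions, and the condition C²VA uses to skip less important dimensions is not given in the abstract, so it is not modelled here.

```python
import math
import random


def approximate(vector, bits):
    """Quantize each coordinate (assumed to lie in [0, 1]) into 2**bits equal-width cells."""
    cells = 2 ** bits
    return [min(int(x * cells), cells - 1) for x in vector]


def bounds(query, approx, bits):
    """Lower and upper bounds on the Euclidean distance from query to any
    vector whose approximation is `approx`."""
    cells = 2 ** bits
    lower_sq = 0.0
    upper_sq = 0.0
    for q, c in zip(query, approx):
        lo_edge, hi_edge = c / cells, (c + 1) / cells
        # Smallest possible gap in this dimension: zero if the query is inside the cell.
        nearest_gap = max(0.0, lo_edge - q, q - hi_edge)
        # Largest possible gap: distance to the farther cell edge.
        farthest_gap = max(abs(q - lo_edge), abs(q - hi_edge))
        lower_sq += nearest_gap ** 2
        upper_sq += farthest_gap ** 2
    return math.sqrt(lower_sq), math.sqrt(upper_sq)


def nearest_neighbor(query, data, approximations, bits):
    """Two-phase search: filter on the approximations, then refine on exact vectors."""
    all_bounds = [bounds(query, a, bits) for a in approximations]
    # A vector whose lower bound exceeds the smallest upper bound cannot be the nearest.
    best_upper = min(upper for _, upper in all_bounds)
    candidates = [i for i, (lower, _) in enumerate(all_bounds) if lower <= best_upper]
    # Only the surviving candidates need their exact coordinates read.
    best = min(candidates, key=lambda i: math.dist(query, data[i]))
    return best, len(candidates)


if __name__ == "__main__":
    random.seed(1)
    data = [[random.random() for _ in range(32)] for _ in range(1000)]
    query = [random.random() for _ in range(32)]
    approximations = [approximate(v, 4) for v in data]
    index, visited = nearest_neighbor(query, data, approximations, 4)
    print(f"nearest vector: {index}, exact vectors read: {visited} of {len(data)}")
```

In this sketch every dimension costs `bits` bits per vector, so the approximation file grows linearly with the number of dimensions. That growth is the cost the abstract says C²VA reduces by leaving some dimensions out of the approximation.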



Author information

Authors and Affiliations

Institute of Information Sciences and Electronics, University of Tsukuba, Tennoudai, Tsukuba-shi, Ibaraki, 305-8577, Japan
Hanxiong Chen, Jiyuan An, Kazutaka Furuse & Nobuo Ohbo


Editor information

Editors and Affiliations

Information School, Renmin University of China, Beijing, 100872, China
Xiaofeng Meng
Department of Computer Science, University of California, Santa Barbara, CA, 93106-5110, USA
Jianwen Su & Yujun Wang


About this paper

Cite this paper

Chen, H., An, J., Furuse, K., Ohbo, N. (2002). C²VA: Trim High Dimensional Indexes. In: Meng, X., Su, J., Wang, Y. (eds) Advances in Web-Age Information Management. WAIM 2002. Lecture Notes in Computer Science, vol 2419. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45703-8_28


DOI: https://doi.org/10.1007/3-540-45703-8_28
Published: 21 August 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44045-1
Online ISBN: 978-3-540-45703-9
eBook Packages: Springer Book Archive
