A New Indexing Method for High Dimensional Dataset

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3453)

Abstract

Indexing high-dimensional datasets has attracted extensive attention from researchers over the last decade. Because R-tree-type index structures are known to suffer from the “curse of dimensionality”, Pyramid-tree-type index structures, which are based on the B-tree, have been proposed to break the curse. However, when the number of dimensions is high, the number of pyramids is often insufficient to discriminate between data points, and the effectiveness of the Pyramid-tree degrades dramatically as dimensionality increases. In this paper, we focus on one particular aspect of the “curse of dimensionality”: the volume near the surface of a hypercube in a high-dimensional space approaches 100% of the total hypercube volume as the number of dimensions approaches infinity. We propose a new indexing method based on the surface of dimensionality. We prove that the Pyramid-tree technique is a special case of our method. The results of our experiments demonstrate the clear superiority of our novel method.
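The surface effect mentioned in the abstract can be checked with a few lines of arithmetic. The sketch below is not the authors' method. It assumes a unit hypercube [0,1]^d and a hypothetical shell thickness `eps`, and computes the fraction of the volume lying within `eps` of the boundary, which is 1 - (1 - 2·eps)^d. The helper `pyramid_count` reflects the 2d pyramids used by the original Pyramid-Technique of Berchtold et al., a figure taken from that work rather than from this abstract.

```python
def shell_fraction(d: int, eps: float) -> float:
    """Fraction of the unit hypercube [0,1]^d within distance eps of its boundary.

    The inner cube [eps, 1-eps]^d has volume (1 - 2*eps)**d, so the
    shell near the surface holds the remainder.
    """
    if d < 1:
        raise ValueError("d must be a positive integer")
    if not 0.0 <= eps <= 0.5:
        raise ValueError("eps must lie in [0, 0.5]")
    return 1.0 - (1.0 - 2.0 * eps) ** d


def pyramid_count(d: int) -> int:
    """Number of pyramids in the original Pyramid-Technique (assumed: 2 per dimension)."""
    return 2 * d


if __name__ == "__main__":
    # eps = 0.01 is an illustrative choice, not a value from the paper.
    for d in (2, 10, 100, 1000):
        print(f"d={d:5d}  shell fraction={shell_fraction(d, 0.01):.4f}  "
              f"pyramids={pyramid_count(d)}")
```

Even with a shell only 1% thick, nearly all of the volume sits near the surface once d reaches the hundreds, while the number of pyramids grows only linearly in d.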

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

An, J., Chen, YP.P., Xu, Q., Zhou, X. (2005). A New Indexing Method for High Dimensional Dataset. In: Zhou, L., Ooi, B.C., Meng, X. (eds) Database Systems for Advanced Applications. DASFAA 2005. Lecture Notes in Computer Science, vol 3453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11408079_35

  • DOI: https://doi.org/10.1007/11408079_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25334-1

  • Online ISBN: 978-3-540-32005-0

  • eBook Packages: Computer Science, Computer Science (R0)
