Distance approximation techniques to reduce the dimensionality for multimedia databases

Kim, Yongkwon; Chung, Chin-Wan; Lee, Seok-Lyong; Kim, Deok-Hwan

doi:10.1007/s10115-010-0322-z

Regular Paper
Published: 09 July 2010

Volume 28, pages 227–248 (2011)
Knowledge and Information Systems

Abstract

Databases are increasingly used to store multimedia data such as images, maps, video clips, and music clips. To make such data searchable, each object is represented by features in the form of high-dimensional vectors. The resulting growth in dimensionality leads to ‘the curse of dimensionality’, under which index structures perform poorly. Dimensionality reduction has been studied as a remedy, but some reduction methods do not guarantee the absence of false dismissals, while others incur high computational cost. This paper proposes dimensionality reduction techniques that guarantee no false dismissal while providing considerable efficiency by approximating distances with a few values. To provide the no-false-dismissal property, an approximated distance must never exceed the corresponding original distance. The Cauchy–Schwarz inequality and two trigonometric equations are used, together with a dimension partitioning technique, to approximate distances so that the difference between the approximated distance and the original distance is reduced. As a result, the proposed techniques reduce the candidate set of a query result, which makes query processing more efficient.
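
The lower-bounding idea in the abstract can be illustrated with a minimal sketch. This is not the paper's technique, whose details (including the two trigonometric equations) are not given here. It assumes the simplest bound that follows from the Cauchy–Schwarz inequality: for any block of dimensions, the Euclidean distance between two sub-vectors is at least the absolute difference of their norms. Summing the squared differences over a partition of the dimensions gives a lower bound on the full distance using only one stored value per block. The function names, block count, data, and radius below are all hypothetical choices made for illustration.

```python
import math
import random


def block_norms(vec, num_blocks):
    """Reduce a vector to the Euclidean norms of at most num_blocks contiguous blocks."""
    size = math.ceil(len(vec) / num_blocks)
    return [
        math.sqrt(sum(v * v for v in vec[i:i + size]))
        for i in range(0, len(vec), size)
    ]


def lower_bound(norms_a, norms_b):
    """Lower bound on the Euclidean distance, computed from block norms only.

    Per block, Cauchy-Schwarz gives ||a - b||^2 >= (||a|| - ||b||)^2,
    so the sum over blocks never exceeds the squared original distance.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(norms_a, norms_b)))


def euclidean(a, b):
    """Exact Euclidean distance in the original space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_query(data, reduced, query, radius, num_blocks):
    """Filter with the lower bound, then refine with exact distances."""
    q_norms = block_norms(query, num_blocks)
    # Filter step: an object whose lower bound exceeds the radius cannot qualify.
    candidates = [
        i for i, norms in enumerate(reduced)
        if lower_bound(norms, q_norms) <= radius
    ]
    # Refine step: only the candidates are checked against the original vectors.
    hits = [i for i in candidates if euclidean(data[i], query) <= radius]
    return candidates, hits


# Illustrative data: 200 random 64-dimensional vectors reduced to 8 values each.
random.seed(0)
data = [[random.random() for _ in range(64)] for _ in range(200)]
reduced = [block_norms(v, 8) for v in data]
query = [random.random() for _ in range(64)]
candidates, hits = range_query(data, reduced, query, 3.0, 8)
print(len(data), len(candidates), len(hits))
```

Because the bound never exceeds the original distance, every true answer survives the filter step, which is the no-false-dismissal property. The tighter the bound, the smaller the candidate set that has to be refined; using more blocks tightens this particular bound at the cost of storing more values per object.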

Author information

Authors and Affiliations

Division of Computer Science, KAIST, Daejeon, 305-701, Korea
Yongkwon Kim & Chin-Wan Chung
School of Industrial and Information Engineering, Hankuk University of Foreign Studies, Yongin-si, Gyeonggi-do, 449-701, Korea
Seok-Lyong Lee
School of Electronics and Electrical Engineering, Inha University, Incheon, 402-751, Korea
Deok-Hwan Kim

Corresponding author

Correspondence to Chin-Wan Chung.

About this article

Cite this article

Kim, Y., Chung, CW., Lee, SL. et al. Distance approximation techniques to reduce the dimensionality for multimedia databases. Knowl Inf Syst 28, 227–248 (2011). https://doi.org/10.1007/s10115-010-0322-z

Received: 14 August 2009
Revised: 17 April 2010
Accepted: 25 June 2010
Published: 09 July 2010
Issue Date: July 2011
DOI: https://doi.org/10.1007/s10115-010-0322-z
