Abstract
We show that the hierarchical linear subspace method is a tree that divides the distances between the subspaces. In doing so, it divides the data into disjoint entities. The asymptotic upper bound on the maximum applicable number of subspaces is logarithmically constrained by the number of represented elements and their dimension. The search in such a tree starts at the subspace with the lowest dimension, where the set of all possibly similar images is determined. In each subsequent subspace, additional metric information corresponding to a higher dimension is used to reduce this set. The distances between the subspaces correspond to the difference between the mean distance of all the points in one space and the corresponding mean distance of the objects in a subspace. The theoretical estimate of the temporal complexity of the algorithm is logarithmic. The costs are equivalent to the search costs in a tree plus the additional costs of the dimension of the data space.
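The coarse-to-fine search described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each subspace is obtained by averaging consecutive blocks of components, that similarity is an ε-range query under the Euclidean distance, and that the vector dimension is divisible by every block size. The function names (`block_means`, `build_levels`, `subspace_range_search`) are hypothetical. Under these assumptions, the distance between two block-mean vectors, scaled by the square root of the block size, never exceeds the distance in the original space, so discarding candidates in a subspace cannot lose a true match.

```python
import math


def block_means(vec, factor):
    """Project a vector onto a lower-dimensional subspace by averaging
    consecutive blocks of `factor` components (an assumed projection)."""
    return [sum(vec[i:i + factor]) / factor for i in range(0, len(vec), factor)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_levels(data, factors):
    """Precompute the projections of all vectors for each block size.
    `factors` should be ordered from the coarsest (lowest-dimensional)
    subspace to the finest."""
    return [(f, [block_means(v, f) for v in data]) for f in factors]


def subspace_range_search(data, levels, query, eps):
    """Return the indices of all vectors within distance `eps` of `query`.

    The candidate set is determined in the lowest-dimensional subspace and
    reduced in each higher-dimensional one; the survivors are verified in
    the original space."""
    candidates = list(range(len(data)))
    for factor, projected in levels:
        q = block_means(query, factor)
        scale = math.sqrt(factor)  # makes the subspace distance a lower bound
        candidates = [
            i for i in candidates
            if scale * euclidean(projected[i], q) <= eps
        ]
    # Final check with the full metric information.
    return [i for i in candidates if euclidean(data[i], query) <= eps]
```

In this sketch, a call such as `subspace_range_search(data, build_levels(data, [4, 2]), query, eps)` filters first in the one-dimensional block-mean subspace of four-dimensional vectors, then in the two-dimensional one, and only then computes full distances. It scans the candidate list linearly at each level and therefore does not by itself exhibit the logarithmic complexity stated above.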
Acknowledgements
The author would like to thank TM Deserno, Dept. of Medical Informatics, RWTH Aachen, Germany, for permission to use the data set for experimental tests. The author gratefully acknowledges two anonymous reviewers for their valuable suggestions and thanks Patricia Lima for help during the preparation of the manuscript.
Cite this article
Wichert, A., Teixeira, P., Santos, P. et al. Subspace tree: high dimensional multimedia indexing with logarithmic temporal complexity. J Intell Inf Syst 35, 495–516 (2010). https://doi.org/10.1007/s10844-009-0104-9
DOI: https://doi.org/10.1007/s10844-009-0104-9