Abstract
Dimensionality reduction techniques such as principal component analysis and factor analysis are used to discover a linear mapping between high-dimensional data samples and points in a lower-dimensional subspace. Previously, Frey and Jojic introduced transformation-invariant component analysis (TCA) to learn a linear mapping that is invariant to a set of global transformations of known form. However, parameter estimation in that model using the previously proposed expectation maximization (EM) algorithm required on the order of N² scalar operations, where N is the dimensionality of each training example. This is prohibitive for many applications of interest, such as modeling mid- to large-size images, where, for instance, N may be as high as 786432 (a 512×512 RGB image). In this paper, we present an efficient algorithm that reduces the computational requirements to the order of N log N. With this speedup, we show the effectiveness of transformation-invariant component analysis in various applications including tracking, learning video textures, clustering, object recognition and object detection in images. Software for TCA can be downloaded from http://www.psi.toronto.edu/fastTCA.htm.
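To give a rough sense of the gap between the two complexity orders quoted above, the following sketch evaluates both for the 512×512 RGB example. It is only back-of-the-envelope arithmetic, not the algorithm itself: constant factors are ignored, and the base-2 logarithm is an assumption, since the abstract does not state a base.

```python
import math

# Dimensionality from the abstract's example: a 512x512 RGB image.
height, width, channels = 512, 512, 3
n = height * width * channels

# Leading-order operation counts, ignoring constant factors.
# Assumption: the logarithm is taken base 2; the abstract does not state a base.
quadratic_ops = n ** 2
loglinear_ops = n * math.log2(n)

# Ratio of the two leading-order counts.
speedup = quadratic_ops / loglinear_ops

print(f"N             = {n}")
print(f"N^2           = {quadratic_ops:.3e}")
print(f"N log2 N      = {loglinear_ops:.3e}")
print(f"N^2 / N log N = {speedup:,.0f}")
```

Under these assumptions the ratio is in the tens of thousands. It indicates how the two orders scale, not measured running times, which also depend on constants that the abstract does not give.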
References
Black, M., Fleet, D., & Yacoob, Y. (1998). A framework for modeling appearance change in image sequences. In Proceedings IEEE international conference on computer vision.
Black, M., & Jepson, A. (1998). Eigentracking: robust matching and tracking of articulated objects using a view-based representation. International Journal of Computer Vision 26(1), 63–84.
Chudova, D., Gaffney, S., & Smyth, P. (2003). Probabilistic models for joint clustering and time-warping of multidimensional curves. In Proceedings of the 19th conference on uncertainty in artificial intelligence.
Everitt, B. (1984). An introduction to latent variable models. New York: Chapman and Hall.
Felzenszwalb, P., Huttenlocher, D., & Kleinberg, J. (2003). Fast algorithms for large-state-space HMMs with applications to web usage analysis. In Advances in neural information processing systems.
Frey, B., Colmenarez, A., & Huang, T. (1998). Mixtures of local linear subspaces for face recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition.
Frey, B., & Jojic, N. (1999). Transformed component analysis: Joint estimation of spatial transformations and image components. In International conference on computer vision.
Frey, B., & Jojic, N. (2002). Fast, large-scale transformation-invariant clustering. In Advances in neural information processing systems. Cambridge: MIT Press.
Frey, B., & Jojic, N. (2003). Transformation-invariant clustering using the EM algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence 25(1), 1–17.
Frey, B., Jojic, N., & Kannan, A. (2003). Learning appearance and transparency manifolds of occluded objects in layers. In IEEE conference on computer vision and pattern recognition.
Ghahramani, Z., & Hinton, G. (1996). The EM algorithm for mixtures of factor analyzers (Technical Report CRG-TR-96-1). University of Toronto.
Gray, A., & Moore, A. (2000). ‘n-body’ problems in statistical learning. In Advances in neural information processing systems. Cambridge: MIT Press.
Hinton, G., Dayan, P., & Revow, M. (1997). Modeling the manifolds of images of handwritten digits. IEEE Transactions on Neural Networks 8(1), 65–74.
Jolliffe, I. (1986). Principal component analysis. New York: Springer.
Kannan, A., Jojic, N., & Frey, B. (2003). Fast transformation-invariant factor analysis. In Advances in neural information processing systems. Cambridge: MIT Press.
Learned-Miller, E. (2006). Data driven image models through continuous joint alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(2), 236–250.
Li, S., Zhu, L., Zhang, Z., & Zhang, H. (2002). Learning to detect multi-view faces in real-time. In Proceedings of the 2nd international conference on development and learning.
Neumaier, A., & Schneider, T. (2001). Estimation of parameters and eigenmodes of multivariate autoregressive models. ACM Transactions on Mathematical Software 27(1), 27–57.
Press, W., Teukolsky, S., Vetterling, W., & Flannery, B. (1992). Numerical recipes in C. Cambridge: Cambridge University Press.
Schödl, A., Szeliski, R., Salesin, D., & Essa, I. (2000). Video textures. In Proceedings of SIGGRAPH.
Turk, M., & Pentland, A. (1991). Face recognition using eigenfaces. In Proceedings of IEEE conference on computer vision and pattern recognition.
Wolberg, G., & Zokai, S. (2000). Robust image registration using log-polar transform. In Proceedings of IEEE international conference on image processing.
Yang, C., Duraiswami, R., Gumerov, N., & Davis, L. (2003). Improved fast Gauss transform and efficient kernel density estimation. In Proceedings of international conference on computer vision.
About this article
Cite this article
Kannan, A., Jojic, N. & Frey, B.J. Fast Transformation-Invariant Component Analysis. Int J Comput Vis 77, 87–101 (2008). https://doi.org/10.1007/s11263-007-0094-4
DOI: https://doi.org/10.1007/s11263-007-0094-4