Abstract
This paper presents a novel topic model, the Affine Invariant Topic Model (AITM), for generic object recognition. Abandoning the “bag of words” assumption of traditional topic models, AITM incorporates spatial structure into latent Dirichlet allocation (LDA). It extends LDA by modeling visual words with latent affine transformations as well as latent topics: topics are treated as different parts of objects, and all visual words assigned to a given topic are assumed to share a common affine transformation. MCMC is employed for inference over the latent variables, an MCMC-EM algorithm is used for parameter estimation, and the Bayesian decision rule is used for classification. Experiments on two challenging data sets demonstrate the efficiency of AITM.
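The generative idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the model's distributions, so the Gaussian part layout, the parameter names (`word_dists`, `part_means`, `part_covs`, `affines`) and the function names are all assumptions made for illustration. The sketch draws LDA-style topic assignments, places each visual word in a canonical part frame, and maps it through the affine transformation shared by its topic; a second function applies the Bayesian decision rule to per-class log-likelihoods that are assumed to have been computed elsewhere.

```python
import numpy as np


def sample_image(rng, alpha, word_dists, part_means, part_covs, affines, n_words):
    """Draw one synthetic image as visual word ids, 2-D positions and topics.

    Hypothetical parameterization (not taken from the paper):
      alpha       (K,)      Dirichlet prior over topics
      word_dists  (K, V)    per-topic distribution over visual words
      part_means  (K, 2)    canonical position of each part (assumed Gaussian)
      part_covs   (K, 2, 2) canonical position covariance of each part
      affines     list of (A_k, t_k), one affine map shared by all words of topic k
    """
    n_topics = len(alpha)
    theta = rng.dirichlet(alpha)  # per-image topic proportions, as in LDA
    topics = rng.choice(n_topics, size=n_words, p=theta)
    words = np.empty(n_words, dtype=int)
    positions = np.empty((n_words, 2))
    for i, k in enumerate(topics):
        # Appearance: visual word drawn from the topic's word distribution.
        words[i] = rng.choice(word_dists.shape[1], p=word_dists[k])
        # Geometry: canonical position, then the topic's common affine map.
        canonical = rng.multivariate_normal(part_means[k], part_covs[k])
        A, t = affines[k]
        positions[i] = A @ canonical + t
    return words, positions, topics


def classify(log_likelihoods, log_priors):
    """Bayesian decision rule: pick the class with the largest log posterior."""
    scores = np.asarray(log_likelihoods) + np.asarray(log_priors)
    return int(np.argmax(scores))
```

In this sketch affine invariance would come from treating each `(A_k, t_k)` as a latent variable to be inferred per image, for example by MCMC, rather than as a fixed input as it is here.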
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, Z., Zhang, L. (2010). Affine Invariant Topic Model for Generic Object Recognition. In: Zhang, L., Lu, BL., Kwok, J. (eds) Advances in Neural Networks - ISNN 2010. ISNN 2010. Lecture Notes in Computer Science, vol 6064. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13318-3_20
DOI: https://doi.org/10.1007/978-3-642-13318-3_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13317-6
Online ISBN: 978-3-642-13318-3
eBook Packages: Computer Science, Computer Science (R0)