Constructing Visual Models with a Latent Space Approach

Monay, Florent; Quelhas, Pedro; Gatica-Perez, Daniel; Odobez, Jean-Marc

doi:10.1007/11752790_7

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3940)

Included in the following conference series:

International Statistical and Optimization Perspectives Workshop "Subspace, Latent Structure and Feature Selection"

Abstract

We propose applying latent space models to local invariant features for object classification. We investigate whether latent space models make it possible to learn patterns of visual co-occurrence, and whether the learned visual models improve performance when less labeled data are available. We present and discuss results that support both hypotheses. Probabilistic Latent Semantic Analysis (PLSA) automatically identifies semantically meaningful aspects from the data, producing an unsupervised soft clustering. The resulting compact representation retains sufficient discriminative information for accurate object classification, and the use of unlabeled data improves classification accuracy when less labeled training data are available. We perform experiments on a 7-class object database containing 1776 images.
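
To illustrate the soft clustering that PLSA produces, the following is a minimal sketch of the standard PLSA EM updates, not the authors' implementation. It assumes each image is summarized as a vector of counts over quantized local features. The function name `plsa`, its parameters, and the toy count matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def plsa(counts, n_aspects, n_iter=50, seed=0):
    """Fit PLSA by EM on an (images x features) count matrix.

    Returns P(z|d) with shape (D, K) and P(w|z) with shape (K, W).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random initialization; each row is normalized to a distribution.
    p_z_d = rng.random((n_docs, n_aspects))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_aspects, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z); shape (D, K, W).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

# Made-up toy data: four "images" over six quantized features.
toy = np.array([
    [5, 4, 6, 0, 0, 0],
    [6, 5, 4, 0, 0, 0],
    [0, 0, 0, 4, 6, 5],
    [0, 0, 0, 5, 5, 6],
], dtype=float)

p_z_d, p_w_z = plsa(toy, n_aspects=2)
print(np.round(p_z_d, 2))  # one row of aspect weights per image
```

Each row of `p_z_d` is a distribution over aspects for one image. This is the kind of compact representation the abstract refers to, and it could be passed to any standard classifier in place of the raw counts.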

Author information

Authors and Affiliations

IDIAP Research Institute, 1920, Martigny, Switzerland
Florent Monay, Pedro Quelhas, Daniel Gatica-Perez & Jean-Marc Odobez

Editor information

Editors and Affiliations

ISIS Research Group, University of Southampton, Southampton, U.K.
Craig Saunders
Dept. of Knowledge Technologies, Jozef Stefan Institute, Jamova 39, 1000, Ljubljana, Slovenia
Marko Grobelnik
School of Electronics and Computer Science, University of Southampton, Building 1, Highfield Campus, SO17 1BJ, Southampton, UK
Steve Gunn
The Centre for Computational Statistics and Machine Learning Department of Computer Science, University College London, Gower St., WC1E 6BT, London, UK
John Shawe-Taylor

About this paper

Cite this paper

Monay, F., Quelhas, P., Gatica-Perez, D., Odobez, J.-M. (2006). Constructing Visual Models with a Latent Space Approach. In: Saunders, C., Grobelnik, M., Gunn, S., Shawe-Taylor, J. (eds.) Subspace, Latent Structure and Feature Selection. SLSFS 2005. Lecture Notes in Computer Science, vol. 3940. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11752790_7

DOI: https://doi.org/10.1007/11752790_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34137-6
Online ISBN: 978-3-540-34138-3
eBook Packages: Computer Science, Computer Science (R0)
