Model Selection for Unsupervised Learning of Visual Context

Xiang, Tao; Gong, Shaogang

doi:10.1007/s11263-005-5024-8

Published: 01 May 2006

Volume 69, pages 181–201, (2006)
International Journal of Computer Vision

Tao Xiang¹ &
Shaogang Gong¹

Abstract

This study addresses the problem of choosing the most suitable probabilistic model selection criterion for unsupervised learning of visual context of a dynamic scene using mixture models. A rectified Bayesian Information Criterion (BICr) and a Completed Likelihood Akaike’s Information Criterion (CL-AIC) are formulated to estimate the optimal model order (complexity) for a given visual scene. Both criteria are designed to overcome poor model selection by existing popular criteria when the data sample size varies from small to large and the true mixture distribution kernel functions differ from the assumed ones. Extensive experiments on learning visual context for dynamic scene modelling are carried out to demonstrate the effectiveness of BICr and CL-AIC, compared to that of existing popular model selection criteria including BIC, AIC and Integrated Completed Likelihood (ICL). Our study suggests that for learning visual context using a mixture model, BICr is the most appropriate criterion given sparse data, while CL-AIC should be chosen given moderate or large data sample sizes.
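As a rough illustration of what selecting a model order with an information criterion involves, the sketch below fits one-dimensional Gaussian mixtures of increasing order by EM and scores each with the standard BIC and AIC. It does not implement BICr or CL-AIC, whose definitions are given in the full article and are not reproduced on this page. The function names (`fit_gmm_1d`, `score_orders`), the quantile-based initialisation, the synthetic data, and the "lower is better" sign convention are all assumptions made for this example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, iters=100):
    """Fit a k-component 1-D Gaussian mixture by EM; return the log-likelihood."""
    n = len(data)
    ordered = sorted(data)
    # Deterministic initialisation (an assumption): means at evenly spaced quantiles.
    means = [ordered[int((2 * j + 1) * n / (2 * k))] for j in range(k)]
    grand_mean = sum(data) / n
    overall_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = max(sum(p), 1e-300)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor guards against collapse
    return sum(
        math.log(max(sum(w * normal_pdf(x, m, v)
                         for w, m, v in zip(weights, means, variances)), 1e-300))
        for x in data
    )


def score_orders(data, max_k):
    """Return {k: (BIC, AIC)} for k = 1..max_k; lower scores are better here."""
    n = len(data)
    scores = {}
    for k in range(1, max_k + 1):
        loglik = fit_gmm_1d(data, k)
        n_params = 3 * k - 1  # (k - 1) weights + k means + k variances
        bic = -2.0 * loglik + n_params * math.log(n)
        aic = -2.0 * loglik + 2.0 * n_params
        scores[k] = (bic, aic)
    return scores


# Synthetic data (an assumption): two well-separated clusters of 150 points each.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(150)] + [rng.gauss(8.0, 1.0) for _ in range(150)]
scores = score_orders(data, 4)
best_bic = min(scores, key=lambda k: scores[k][0])
best_aic = min(scores, key=lambda k: scores[k][1])
print("order chosen by BIC:", best_bic, "| order chosen by AIC:", best_aic)
```

Because the BIC penalty grows with log n while the AIC penalty does not, the two criteria can disagree as the sample size changes, which is the kind of behaviour the abstract refers to when it contrasts sparse data with moderate or large samples.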

Author information

Authors and Affiliations

Department of Computer Science, Queen Mary, University of London, London, E1 4NS, UK
Tao Xiang & Shaogang Gong

Corresponding author

Correspondence to Tao Xiang.

About this article

Cite this article

Xiang, T., Gong, S. Model Selection for Unsupervised Learning of Visual Context. Int J Comput Vision 69, 181–201 (2006). https://doi.org/10.1007/s11263-005-5024-8

Received: 26 February 2005
Revised: 14 July 2005
Accepted: 14 September 2005
Published: 01 May 2006
Issue Date: August 2006
DOI: https://doi.org/10.1007/s11263-005-5024-8
