Abstract
We consider the task of multiple-output regression where both the input and the output are high-dimensional. Because the number of training samples is limited relative to the data dimensions, properly imposing a sparse statistical dependency structure when learning a regression model is crucial for reliable prediction accuracy. Sparse inverse covariance learning of conditional Gaussian random fields has recently emerged as a way to achieve this goal, and has been shown to outperform non-sparse approaches. However, one of its main drawbacks is the strong assumption of linear Gaussianity in modeling the input-output relationship. For certain application domains this assumption can be too restrictive and representationally weak, and prediction based on a misspecified model can lead to suboptimal performance. In this paper, we extend the idea of sparse learning to a non-Gaussian model, namely the more powerful conditional Gaussian mixture. For this latent-variable model, we propose a novel sparse inverse covariance learning algorithm based on the expectation-maximization (EM) lower-bound optimization technique. We show that each M-step reduces to the regular sparse inverse covariance estimation of a linear Gaussian model, in conjunction with sparse logistic regression estimation. We demonstrate the improved prediction performance of the proposed algorithm over existing methods on several datasets.
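To make the algorithmic recipe in the abstract concrete, the following minimal Python sketch alternates an E-step (soft component assignments) with an M-step that pairs sparse (L1) logistic regression for the gating with a sparse inverse covariance step per component (assuming NumPy, SciPy, and scikit-learn). It is an illustration, not the paper's implementation: the per-component M-step below substitutes a weighted ridge fit for the regression means followed by a graphical-lasso step on the weighted residual covariance, whereas the paper estimates the sparse Gaussian CRF parameters jointly; the function and parameter names (fit_sparse_cgm, alpha, c_gate) are ours.

import numpy as np
from scipy.special import logsumexp
from sklearn.covariance import graphical_lasso
from sklearn.linear_model import LogisticRegression

def fit_sparse_cgm(X, Y, K=2, alpha=0.05, c_gate=1.0, ridge=1e-3,
                   n_iter=25, seed=0):
    """EM for p(y|x) = sum_k pi_k(x) N(y; W_k^T [x,1], Lambda_k^{-1})."""
    N, d = Y.shape
    Xb = np.hstack([X, np.ones((N, 1))])            # inputs with bias column
    R = np.random.default_rng(seed).dirichlet(np.ones(K), size=N)  # soft init
    gate = LogisticRegression(penalty="l1", solver="saga", C=c_gate,
                              max_iter=5000)
    for _ in range(n_iter):
        # M-step (gating): sparse multinomial logistic regression; soft
        # labels are handled by replicating each sample once per component
        # with weight r_nk.
        gate.fit(np.tile(X, (K, 1)), np.repeat(np.arange(K), N),
                 sample_weight=R.T.ravel())

        # M-step (experts): weighted ridge for the means, then a graphical
        # lasso on the weighted residual covariance for a sparse precision.
        Ws, Lams = [], []
        for k in range(K):
            r = R[:, k]
            G = Xb * r[:, None]
            W = np.linalg.solve(Xb.T @ G + ridge * np.eye(Xb.shape[1]),
                                G.T @ Y)
            E = Y - Xb @ W                           # residuals per sample
            S = (E * r[:, None]).T @ E / r.sum() + ridge * np.eye(d)
            _, Lam = graphical_lasso(S, alpha=alpha)  # sparse precision
            Ws.append(W)
            Lams.append(Lam)

        # E-step: responsibilities r_nk ∝ pi_k(x_n) N(y_n; W_k^T [x_n,1], .)
        log_post = np.log(gate.predict_proba(X) + 1e-300)
        for k in range(K):
            E = Y - Xb @ Ws[k]
            _, logdet = np.linalg.slogdet(Lams[k])
            quad = np.einsum("ni,ij,nj->n", E, Lams[k], E)
            log_post[:, k] += 0.5 * (logdet - d * np.log(2 * np.pi) - quad)
        R = np.exp(log_post - logsumexp(log_post, axis=1, keepdims=True))
    return gate, Ws, Lams

Given training data (X, Y), one would call fit_sparse_cgm(X, Y, K=3) and predict for a new x by mixing the component means under the gating probabilities, i.e. sum_k pi_k(x) W_k^T [x, 1].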
Notes
This refers to the global Markov property: two random variables $z_i$ and $z_j$ are conditionally independent given a set of other variables $z_C$ ($C \subseteq \{1,\dots,k\} \setminus \{i,j\}$) if and only if nodes $i$ and $j$ are separated by the node set $C$ in the graph.
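In formulas (a standard restatement, assuming a Gaussian Markov random field over $z = (z_1,\dots,z_k)$ with covariance $\Sigma$ and precision matrix $\Lambda = \Sigma^{-1}$):
\[
z_i \perp\!\!\!\perp z_j \mid z_C
\quad\Longleftrightarrow\quad
\text{$C$ separates nodes $i$ and $j$ in the graph,}
\]
and, taking $C = \{1,\dots,k\} \setminus \{i,j\}$, this specializes in the Gaussian case to
\[
z_i \perp\!\!\!\perp z_j \mid z_{\{1,\dots,k\} \setminus \{i,j\}}
\quad\Longleftrightarrow\quad
\Lambda_{ij} = 0,
\]
which is why a sparse inverse covariance (precision) matrix directly encodes a sparse conditional-independence structure.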
Compliance with Ethical Standards
This work was supported by the National Research Foundation of Korea (NRF-2013R1A1A1076101). The author declares no conflict of interest. This research involved neither human participants nor animals. Consent to submit this manuscript was received tacitly from the author's institution, Seoul National University of Science & Technology.
Cite this article
Kim, M. Sparse inverse covariance learning of conditional Gaussian mixtures for multiple-output regression. Appl Intell 44, 17–29 (2016). https://doi.org/10.1007/s10489-015-0691-9