Joint Learning of Unsupervised Dimensionality Reduction and Gaussian Mixture Model

Abstract

Dimensionality reduction (DR) has been a central research topic in information theory, pattern recognition, and machine learning. The performance of many learning models relies significantly on it: successful DR can substantially improve clustering and classification approaches, while inappropriate DR may degrade them. When faced with high-dimensional data, existing approaches often reduce the dimensionality first and then feed the reduced features into another model, e.g., a Gaussian mixture model (GMM). Such independent learning can significantly limit performance, however, since the subspace found by a particular DR approach may not suit the subsequent model. In this paper, we investigate how unsupervised dimensionality reduction can be performed jointly with a GMM, and whether such joint learning improves on the traditional two-stage method. In particular, we adopt the mixture of factor analyzers under the assumption that a common factor loading is shared by all components. Based on this model, we present an EM algorithm that converges to a locally optimal solution and optimizes the dimensionality reduction together with the GMM parameters. We describe the framework, detail the algorithm, and conduct a series of experiments to validate the effectiveness of the proposed approach. Specifically, we compare the proposed joint learning approach with two competitive algorithms on one synthetic and six real data sets. Experimental results show that joint learning significantly outperforms the comparison methods in terms of three evaluation criteria.
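To make the joint-learning idea concrete, below is a minimal NumPy sketch of EM for a mixture of factor analyzers that shares a single factor loading matrix across all components, the model family the paper builds on. It follows the standard update equations for this common-loading model rather than the authors' own code; all function and variable names (fit_joint_dr_gmm, xi, Omega, D) are illustrative assumptions.

```python
# Minimal, illustrative sketch (not the authors' implementation) of EM for a
# mixture of factor analyzers with a COMMON loading matrix A:
#   x | component k  ~  N(A @ xi[k],  A @ Omega[k] @ A.T + diag(D)).
import numpy as np
from scipy.stats import multivariate_normal

def fit_joint_dr_gmm(X, K, q, n_iter=100, seed=0, reg=1e-6):
    """X: (n, p) data, K: number of mixture components, q: reduced dimension."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Initialization: PCA-like loading, random latent means, isotropic noise.
    A = np.linalg.svd(X - X.mean(0), full_matrices=False)[2][:q].T   # (p, q)
    pi = np.full(K, 1.0 / K)
    xi = rng.standard_normal((K, q))          # component means, latent space
    Omega = np.stack([np.eye(q)] * K)         # component covariances, latent
    D = np.full(p, X.var(axis=0).mean())      # diagonal noise variances

    for _ in range(n_iter):
        # E-step: responsibilities and latent posterior moments per component.
        logr = np.empty((n, K))
        Ez = np.empty((K, n, q))              # posterior means of z
        Cz = np.empty((K, q, q))              # posterior covariances of z
        for k in range(K):
            Sigma = A @ Omega[k] @ A.T + np.diag(D)
            logr[:, k] = np.log(pi[k]) + multivariate_normal.logpdf(
                X, A @ xi[k], Sigma)
            gamma = np.linalg.solve(Sigma, A @ Omega[k])          # (p, q)
            Ez[k] = xi[k] + (X - xi[k] @ A.T) @ gamma
            Cz[k] = Omega[k] - gamma.T @ A @ Omega[k]
        tau = np.exp(logr - logr.max(axis=1, keepdims=True))
        tau /= tau.sum(axis=1, keepdims=True)                     # (n, K)

        # M-step: closed-form updates; A and D are shared across components,
        # so the subspace is optimized jointly with the mixture parameters.
        nk = tau.sum(axis=0)
        pi = nk / n
        Sxz = np.zeros((p, q)); Szz = np.zeros((q, q))
        for k in range(K):
            xi[k] = tau[:, k] @ Ez[k] / nk[k]
            dz = Ez[k] - xi[k]
            Omega[k] = (dz * tau[:, k, None]).T @ dz / nk[k] + Cz[k]
            Sxz += (X * tau[:, k, None]).T @ Ez[k]
            Szz += (Ez[k] * tau[:, k, None]).T @ Ez[k] + nk[k] * Cz[k]
        A = Sxz @ np.linalg.inv(Szz + reg * np.eye(q))
        D = np.maximum(((X ** 2).sum(axis=0) - (A * Sxz).sum(axis=1)) / n, reg)

    return {"pi": pi, "A": A, "xi": xi, "Omega": Omega, "D": D, "tau": tau}
```

A call like fit_joint_dr_gmm(X, K=3, q=2) returns the shared loading A (the learned subspace) together with the GMM parameters, and cluster labels can be read off as tau.argmax(axis=1). Model selection for K and q (e.g., by BIC, as is common for such models) is omitted from the sketch.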

Acknowledgments

The research was supported by the National Basic Research Program of China (2012CB316301) and the National Natural Science Foundation of China (No. 61473236).

Author information

Corresponding author

Correspondence to Kaizhu Huang.

About this article

Cite this article

Yang, X., Huang, K., Goulermas, J.Y. et al. Joint Learning of Unsupervised Dimensionality Reduction and Gaussian Mixture Model. Neural Process Lett 45, 791–806 (2017). https://doi.org/10.1007/s11063-016-9508-z
