Mixtures of factor analyzers with scale mixtures of fundamental skew normal distributions

Lee, Sharon X.; Lin, Tsung-I; McLachlan, Geoffrey J.

doi:10.1007/s11634-020-00420-9

Regular Article
Published: 02 September 2020

Volume 15, pages 481–512, (2021)
Advances in Data Analysis and Classification

Abstract

Mixtures of factor analyzers (MFA) provide a powerful tool for modelling high-dimensional datasets. In recent years, several generalizations of MFA have been developed where the normality assumption of the factors and/or of the errors was relaxed to allow for skewness in the data. However, due to the form of the adopted component densities, the distribution of the factors/errors in most of these models is typically limited to modelling skewness concentrated in a single direction. Here, we introduce a more flexible finite mixture of factor analyzers based on the class of scale mixtures of canonical fundamental skew normal (SMCFUSN) distributions. This very general class of skew distributions can capture various types of skewness and asymmetry in the data. In particular, the proposed mixtures of SMCFUSN factor analyzers (SMCFUSNFA) can simultaneously accommodate multiple directions of skewness. As such, it encapsulates many commonly used models as special and/or limiting cases, such as models of some versions of skew normal and skew t-factor analyzers, and skew hyperbolic factor analyzers. For illustration, we focus on the t-distribution member of the class of SMCFUSN distributions, leading to mixtures of canonical fundamental skew t-factor analyzers (CFUSTFA). Parameter estimation can be carried out by maximum likelihood via an EM-type algorithm. The usefulness and potential of the proposed model are demonstrated using four real datasets.

References

Arellano-Valle RB, Azzalini A (2006) On the unification of families of skew-normal distributions. Scand J Stat 33:561–574
Arellano-Valle RB, Genton MG (2005) On fundamental skew distributions. J Multivar Anal 96:93–116
Azzalini A, Capitanio A (2014) The Skew-Normal and Related Families. Cambridge University Press, Cambridge
Azzalini A, Dalla Valle A (1996) The multivariate skew-normal distribution. Biometrika 83:715–726
Biernacki C, Celeux G, Govaert G (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans Pattern Anal Mach Intell 22:719–725
Browne RP, McNicholas PD (2015) A mixture of generalized hyperbolic distributions. Can J Stat 43:176–198
Cabral CRB, Lachos VH, Prates MO (2012) Multivariate mixture modeling using skew-normal independent distributions. Comput Stat Data Anal 56:126–142
Codella N, Gutman D, Celebi ME, Helba B, Marchetti MA, Dusza S, Kalloo A, Liopyris K, Mishra N, Kittler H, Halpern A (2017) Skin lesion analysis toward melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC). arXiv:1710.05006
Cook RD, Weisberg S (1994) An Introduction to Regression Graphics. Wiley, New York
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J Royal Stat Soc B 39:1–38
Ferris LK, Harkes JA, Gilbert B, Winger DG, Golubets K, Akilov O, Satyanarayanan M (2015) Computer-aided classification of melanocytic lesions using dermoscopic images. J Am Acad Dermatol 73:769–776
Forina M, Tiscornia E (1982) Pattern recognition methods in the prediction of italian olive oil origin by their fatty acid content. Annali di Chimica 72:143–155
Genton MG (ed) (2004) Skew-Elliptical Distributions and Their Applications: A Journey Beyond Normality. Chapman & Hall, CRC, Boca Raton, Florida
Ghahramani Z, Hinton G (1997) The EM algorithm for factor analyzers. Technical Report No. CRG-TR-96-1, The University of Toronto, Toronto
Ho HJ, Lin TI, Chen HY, Wang WL (2012) Some results on the truncated multivariate $t$ distribution. J Stat Plan Inference 142:25–40
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2:193–218
Karlis D, Santourian A (2009) Model-based clustering with non-elliptically contoured distributions. Stat Comput 19:73–83
Kim HM, Maadooliat M, Arellano-Valle RB, Genton MG (2016) Skewed factor models using selection mechanisms. J Multivar Anal 145:162–177
Kim SG (2016) An approximate fitting for mixture of multivariate skew normal distribution via EM algorithm. Korean J Appl Stat 29:513–523
Lee S, McLachlan GJ (2014) Finite mixtures of multivariate skew $t$-distributions: Some recent and new results. Stat Comput 24:181–202
Lee SX, McLachlan GJ (2013) On mixtures of skew-normal and skew $t$-distributions. Adv Data Anal Classif 7:241–266
Lee SX, McLachlan GJ (2016) Finite mixtures of canonical fundamental skew $t$-distributions: The unification of the restricted and unrestricted skew $t$-mixture models. Stat Comput 26:573–589
Lichman M (2013) UCI machine learning repository. http://archive.ics.uci.edu/ml
Lin TI (2009) Maximum likelihood estimation for multivariate skew normal mixture models. J Multivar Anal 100:257–265
Lin TI (2010) Robust mixture modeling using multivariate skew-$t$ distribution. Stat Comput 20:343–356
Lin TI, Wu PH, McLachlan GJ, Lee SX (2015) A robust factor analysis model using the restricted skew $t$-distribution. TEST 24:510–531
Lin TI, McLachlan GJ, Lee SX (2016) Extending mixtures of factor models using the restricted multivariate skew-normal distribution. J Multivar Anal 143:398–413
Lin TI, Wang WL, McLachlan GJ, Lee SX (2018) Robust mixtures of factor analysis models using the restricted multivariate skew-$t$ distribution. Stat Modell 18:50–72
Maleki M, Wraith D, Arellano-Valle RB (2019) Robust finite mixture modeling of multivariate unrestricted skew-normal generalized hyperbolic distributions. Stat Comput 29:425–428
Maruotti A, Bulla J, Lagona F, Picone M, Martella F (2017) Dynamic mixtures of factor analyzers to characterize multivariate air pollutant exposures. Ann Appl Stat 3:1617–1648
McLachlan GJ, Krishnan T (2008) The EM Algorithm and Extensions, 2nd edn. Wiley, Hoboken, New Jersey
McLachlan GJ, Lee SX (2016) Comment on “On nomenclature for, and the relative merits of, two formulations of skew distributions” by A. Azzalini, R. Browne, M. Genton, and P. McNicholas. Stat Probab Lett 116:1–5
McLachlan GJ, Peel D (2000) Finite Mixture Models. Wiley, New York
McLachlan GJ, Peel D, Bean RW (2003) Modelling high-dimensional data by mixtures of factor analyzers. Comput Stat Data Anal 41:379–388
McLachlan GJ, Bean RW, Jones BT (2007) Extension of the mixture of factor analyzers model to incorporate the multivariate $t$-distribution. Comput Stat Data Anal 51:5327–5338
Meng X, Rubin D (1993) Maximum likelihood estimation via the ECM algorithm: a general framework. Biometrika 80:267–278
Montanari A, Viroli C (2010) A skew-normal factor model for the analysis of student satisfaction towards university courses. J Appl Stat 37:463–487
Murray P, Browne R, McNicholas P (2014a) Mixtures of skew-$t$ factor analyzers. Comput Stat Data Anal 77:326–335
Murray P, McNicholas P, Browne R (2014b) Mixtures of common skew-$t$ factor analyzers. Statistics 3:68–82
Murray PM (2016) Detecting non-elliptical clusters. PhD thesis, Department of Mathematics & Statistics, McMaster University, Canada
Murray PM, Browne RP, McNicholas PD (2017a) Hidden truncation hyperbolic distributions, finite mixtures thereof, and their application for clustering. J Multivar Anal 161:141–156
Murray PM, Browne RP, McNicholas PD (2017b) A mixture of SDB skew-$t$ factor analyzers. Econom Stat 3:160–168
Murray PM, Browne RP, McNicholas PD (2017c) Mixtures of hidden truncation hyperbolic factor analyzers. arXiv:1711.01504
O’Hagan A (1976) Moments of the truncated multivariate-$t$ distribution. http://www.tonyohagan.co.uk/academic/pdf/trunc_multi_t.PDF
Pyne S, Hu X, Wang K, Rossin E, Lin TI, Maier LM, Baecher-Allan C, McLachlan GJ, Tamayo P, Hafler DA, De Jager PL, Mesirow JP (2009) Automated high-dimensional flow cytometric data analysis. Proc National Acad Sci USA 106:8519–8524
R Core Team (2016) R: A Language and Environment for Statistical Computing. http://www.R-project.org/, R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0
Rand WM (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66:846–850
Sahu SK, Dey DK, Branco MD (2003) A new class of multivariate skew distributions with applications to Bayesian regression models. Can J Stat 31:129–150
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Seshadri V (1997) Halphen’s laws. In: Kotz S, Read CB, Banks DL (eds) Encyclopedia of Statistical Sciences. Wiley, New York, pp 302–306
Tortora C, Browne RP, Franczak BC, McNicholas PD (2015) MixGHD: Model Based Clustering, Classification and Discriminant Analysis Using the Mixture of Generalized Hyperbolic Distributions. http://cran.r-project.org/web/packages/MixGHD, r package version 1.7
Tortora C, McNicholas P, Browne R (2016) A mixture of generalized hyperbolic factor analyzers. Adv Data Anal Classif 10:423–440
Vinh NX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. J Mach Learn Res 11:2227–2240
Wall MM, Guo J, Amemiya Y (2012) Mixture factor analysis for approximating a non-normally distributed continuous latent factor with continuous and dichotomous observed variables. Multivar Behav Res 47:276–313
Yamamoto H, Nankaku Y, Miyajima C, Tokuda K, Kitamura T (2005) Parameter sharing in mixture of factor analyzers for speaker identification. IEICE Trans Inf Syst 88:418–424
Zhou YK, Mobasher B (2006) Web user segmentation based on a mixture of factor analyzers. Lect Notes Comput Sci 4082:11–20

Author information

Authors and Affiliations

School of Mathematical Science, University of Adelaide, Adelaide, South Australia, 5005, Australia
Sharon X. Lee
Institute of Statistics, National Chung Hsing University, Taichung, Taiwan
Tsung-I Lin
Department of Public Health, China Medical University, Taichung, Taiwan
Tsung-I Lin
School of Mathematics and Physics, University of Queensland, Saint Lucia, 4072, Australia
Geoffrey J. McLachlan

Corresponding author

Correspondence to Geoffrey J. McLachlan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: The class of CFUSS distributions

The class of canonical fundamental skew symmetric (CFUSS) distributions (Arellano-Valle and Genton 2005) is one of the more general formulations of skew distributions. We begin by examining the fundamental skew distribution. Its density can be expressed as the product of a symmetric density and a skewing function. Formally, the density of ${\varvec{Y}}$, a p-dimensional random vector following a CFUSS distribution, is given by

$$\begin{aligned} f({\varvec{y}}; \varvec{\theta })= & {} 2^{r} f_p({\varvec{y}}; \varvec{\theta }) \, Q_r({\varvec{y}}; \varvec{\theta }), \end{aligned}$$

(33)

where $f_p({\varvec{y}}; \varvec{\theta })$ is a symmetric density on $\mathbb {R}^p$, $Q_r({\varvec{y}}; \varvec{\theta })$ is a skewing function that maps ${\varvec{y}}$ into the unit interval, and $\varvec{\theta }$ is the vector containing the parameters of ${\varvec{Y}}$. Let ${\varvec{U}}$ be an $r\times 1$ random vector, where ${\varvec{Y}}$ and ${\varvec{U}}$ follow a joint distribution such that ${\varvec{Y}}$ has marginal density $f_p({\varvec{y}}; \varvec{\theta })$ and $Q_r({\varvec{y}}; \varvec{\theta }) = P({\varvec{U}}> \mathbf{0}\mid {\varvec{Y}}= {\varvec{y}})$. If the latent random vector ${\varvec{U}}$ has its canonical distribution (that is, with location vector $\mathbf{0}$ and scale matrix ${\varvec{I}}_r$), we obtain the canonical form of (33), namely the CFUSS distribution. The class of CFUSS distributions encapsulates many existing distributions, including most of those mentioned earlier in this paper. We shall consider some particular cases of the class of CFUSS distributions here.

A.1 The CFUSN distribution

The skew normal member of the class of CFUSS distributions is the canonical fundamental skew normal (CFUSN) distribution. This can be obtained by taking $f_p$ to be a normal density, leading to $Q_r$ being a normal cdf. It follows that the density of the CFUSN distribution is given by

$$\begin{aligned} f_{{\mathrm{CFUSN}}}({\varvec{y}}; \varvec{\mu }, \varvec{\Sigma }, \varvec{\Delta })= & {} 2^{r} \phi _p({\varvec{y}}; \varvec{\mu }, \varvec{\Omega }) \; \Phi _r\left( \varvec{\Delta }^T\varvec{\Omega }^{-1}({\varvec{y}}-\varvec{\mu }); \mathbf{0}, \varvec{\Lambda }\right) , \end{aligned}$$

(34)

where $\varvec{\Omega }= \varvec{\Sigma }+ \varvec{\Delta }\varvec{\Delta }^T$ and $\varvec{\Lambda }= {\varvec{I}}_r - \varvec{\Delta }^T \varvec{\Omega }^{-1} \varvec{\Delta }$. In the above, $\varvec{\mu }$ is a $p\times 1$ vector of location parameters, $\varvec{\Sigma }$ is a $p\times p$ positive definite scale matrix, and $\varvec{\Delta }$ is a $p\times r$ matrix of skewness parameters. We shall adopt the notation ${\varvec{Y}}\sim \hbox {CFUSN}_{p,r}(\varvec{\mu }, \varvec{\Sigma }, \varvec{\Delta })$ if ${\varvec{Y}}$ has the density given by (34). Note that when $\varvec{\Delta }= \mathbf{0}$, we obtain the (multivariate) normal distribution. In addition, a number of skew normal distributions are nested within the CFUSN distribution, including the version proposed by Azzalini and Dalla Valle (1996) and the version proposed by Sahu et al. (2003). We shall follow the terminology of Lee and McLachlan (2013) and refer to them as the restricted and unrestricted skew normal distribution, respectively.
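As a quick sanity check of (34), it may help to write out the univariate case $p=r=1$. The symbols $\sigma ^2$ (for $\varvec{\Sigma }$), $\delta $ (for $\varvec{\Delta }$), $\omega $ and $\alpha $ are introduced here for illustration only, and $\phi $ and $\Phi $ denote the standard normal density and cdf:

$$\begin{aligned} \Omega&= \sigma ^2 + \delta ^2, \qquad \Lambda = 1 - \frac{\delta ^2}{\Omega } = \frac{\sigma ^2}{\Omega }, \\ f_{{\mathrm{CFUSN}}}(y; \mu , \sigma ^2, \delta )&= 2\, \phi _1(y; \mu , \Omega )\, \Phi _1\left( \frac{\delta (y-\mu )}{\Omega }; 0, \frac{\sigma ^2}{\Omega }\right) = \frac{2}{\omega }\, \phi \left( \frac{y-\mu }{\omega }\right) \Phi \left( \alpha \, \frac{y-\mu }{\omega }\right) , \end{aligned}$$

where $\omega = \sqrt{\Omega }$ and $\alpha = \delta /\sigma $. The second equality uses $\Phi _1(x; 0, s^2) = \Phi (x/s)$ with $s = \sigma /\sqrt{\Omega }$, so the univariate CFUSN density takes the familiar form of a skew normal density with scale $\omega $ and shape $\alpha $.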

It is of interest to note that ${\varvec{Y}}$ admits a convolution-type stochastic representation that facilitates the derivation of properties and parameter estimation via the EM algorithm. This is given by

$$\begin{aligned} {\varvec{Y}}= & {} \varvec{\mu }+ \varvec{\Delta }|{\varvec{U}}| + {\varvec{e}}, \end{aligned}$$

(35)

where ${\varvec{U}}$ follows a standard r-dimensional normal distribution, independently of ${\varvec{e}}\sim N_p(\mathbf{0}, \varvec{\Sigma })$. Hence, $|{\varvec{U}}|$ has a standard half-normal distribution.
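The representation (35) suggests a direct way to simulate CFUSN vectors. The following is a minimal sketch using only the Python standard library; the function names `sample_cfusn` and `cfusn_mean` are hypothetical, and the scale matrix is assumed to be supplied through a lower-triangular factor $L$ with $LL^T = \varvec{\Sigma }$ (an input convention chosen here for simplicity, not one taken from the paper).

```python
import math
import random


def sample_cfusn(mu, chol_sigma, delta, rng):
    """Draw one vector Y = mu + Delta |U| + e, following representation (35).

    mu         : list of length p (location vector)
    chol_sigma : p x p lower-triangular L with L L^T = Sigma (assumed input form)
    delta      : p x r skewness matrix, given as a list of p rows
    rng        : random.Random instance
    """
    p, r = len(mu), len(delta[0])
    # |U|: absolute values of r independent standard normals (half-normal)
    abs_u = [abs(rng.gauss(0.0, 1.0)) for _ in range(r)]
    # e = L z with z standard normal, so that e ~ N_p(0, Sigma)
    z = [rng.gauss(0.0, 1.0) for _ in range(p)]
    e = [sum(chol_sigma[a][b] * z[b] for b in range(a + 1)) for a in range(p)]
    return [mu[a] + sum(delta[a][c] * abs_u[c] for c in range(r)) + e[a]
            for a in range(p)]


def cfusn_mean(mu, delta):
    """E(Y) = mu + sqrt(2/pi) * Delta 1_r, because E|U_c| = sqrt(2/pi)."""
    k = math.sqrt(2.0 / math.pi)
    return [m + k * sum(row) for m, row in zip(mu, delta)]


if __name__ == "__main__":
    rng = random.Random(2020)
    mu = [1.0, -1.0]
    chol = [[1.0, 0.0], [0.5, 1.0]]
    delta = [[1.0, 0.5], [-0.5, 2.0]]
    draws = [sample_cfusn(mu, chol, delta, rng) for _ in range(5000)]
    emp = [sum(d[a] for d in draws) / len(draws) for a in range(2)]
    print("empirical mean:", emp)
    print("theoretical mean:", cfusn_mean(mu, delta))
```

Comparing the empirical mean with $\varvec{\mu }+ \sqrt{2/\pi }\, \varvec{\Delta }{\varvec{1}}_r$ gives a simple check that the skewness matrix has been applied to $|{\varvec{U}}|$ rather than to ${\varvec{U}}$.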

A.2 Scale mixture of CFUSN distributions

In the next two subsections, we shall consider two skew distributions that were recently employed by Lee and McLachlan (2016) and Murray et al. (2017a) for their mixture models, namely the CFUST and HTH distributions, respectively. They are special cases of the class of CFUSS distributions that can be obtained as a scale mixture of the CFUSN (SMCFUSN) distribution. By a scale mixture of the CFUSN distribution, we mean a distribution that can be defined by the stochastic representation

$$\begin{aligned} {\varvec{Y}}= & {} \varvec{\mu }+ W^{\frac{1}{2}} {\varvec{Y}}_0, \end{aligned}$$

(36)

where ${\varvec{Y}}_0$ follows a central CFUSN distribution and $W$ is a positive (univariate) random variable independent of ${\varvec{Y}}_0$. Thus, conditional on $W = w$, ${\varvec{Y}}$ has a CFUSN distribution with scale matrix ${w}\varvec{\Sigma }$ and skewness matrix $\sqrt{w}\varvec{\Delta }$. It follows that the marginal density of ${\varvec{Y}}$ is given by (1), or equivalently,

$$\begin{aligned}&f_{{\mathrm{SMCFUSN}}} ({\varvec{y}}; \varvec{\mu }, \varvec{\Sigma }, \varvec{\Delta }; F_{{\varvec{\zeta }}}) \nonumber \\& = 2^{r} \int _0^\infty \phi _p \left( {\varvec{y}}; \varvec{\mu }, {w}\varvec{\Omega }\right) \, \Phi _r\left( \frac{1}{\sqrt{w}}\varvec{\Delta }^T\varvec{\Omega }^{-1}({\varvec{y}}-\varvec{\mu }); \mathbf{0}, \varvec{\Lambda }\right) dF_{{\varvec{\zeta }}}(w), \end{aligned}$$

(37)

where $F_{{\varvec{\zeta }}}$ is defined in Sect. 2.2.
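The step from (36) to (37) can be spelled out as follows. This is a short sketch that writes ${\varvec{Y}}_0 = \varvec{\Delta }|{\varvec{U}}| + {\varvec{e}}$ as in (35) with zero location; the subscripted symbols $\varvec{\Omega }_w$ and $\varvec{\Lambda }_w$ are notation introduced here for the conditional quantities.

$$\begin{aligned} {\varvec{Y}}\mid (W=w)&= \varvec{\mu }+ \sqrt{w}\, \varvec{\Delta }|{\varvec{U}}| + \sqrt{w}\, {\varvec{e}}\sim \hbox {CFUSN}_{p,r}(\varvec{\mu }, w\varvec{\Sigma }, \sqrt{w}\varvec{\Delta }), \\ \varvec{\Omega }_w&= w\varvec{\Sigma }+ w\varvec{\Delta }\varvec{\Delta }^T = w\varvec{\Omega }, \\ (\sqrt{w}\varvec{\Delta })^T \varvec{\Omega }_w^{-1} ({\varvec{y}}-\varvec{\mu })&= \frac{1}{\sqrt{w}}\varvec{\Delta }^T\varvec{\Omega }^{-1}({\varvec{y}}-\varvec{\mu }), \\ \varvec{\Lambda }_w&= {\varvec{I}}_r - (\sqrt{w}\varvec{\Delta })^T (w\varvec{\Omega })^{-1} (\sqrt{w}\varvec{\Delta }) = \varvec{\Lambda }. \end{aligned}$$

Substituting these quantities into (34) gives the integrand of (37), and integrating over $w$ with respect to $F_{{\varvec{\zeta }}}$ gives the marginal density.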

The class of SMCFUSN distributions is a generalization of the scale mixture of skew normal (SMSN) distributions considered by Cabral et al. (2012). The latter adopts a restricted skew normal distribution in place of the CFUSN distribution here. The SMSN class can be obtained from the SMCFUSN distribution by taking $r=1$ (after reparameterization). Some special cases of the SMCFUSN distribution are listed in Table 10.

Table 10 Some special cases of the scale mixture of CFUSN distributions

A.3 The CFUSH distribution

If the latent variable W in (36) follows a generalized inverse Gaussian (GIG) distribution (Seshadri 1997), we obtain the canonical fundamental skew hyperbolic (CFUSH) distribution. In this case, the symmetric density $f_p$ in (33) is a symmetric GH distribution $h_p(\cdot )$ and the skewing function becomes the cdf of a symmetric GH distribution $H_r(\cdot )$. The GIG density can be expressed as

$$\begin{aligned} f_{{\mathrm{GIG}}} (w; \psi , \chi , \lambda )= & {} \frac{\left( \frac{\psi }{\chi }\right) ^{\frac{\lambda }{2}} w^{\lambda -1}}{2 K_\lambda (\sqrt{\chi \psi })} e^{-\frac{\psi w + \frac{\chi }{w}}{2}}, \end{aligned}$$

(38)

where $w > 0$, the parameters $\psi $ and $\chi $ are positive, and $\lambda $ is a real parameter. In the above, $K_\lambda (\cdot )$ denotes the modified Bessel function of the third kind of order $\lambda $. The density of a p-dimensional symmetric generalized hyperbolic distribution is given by

$$\begin{aligned} h_p({\varvec{y}}; \varvec{\mu }, \varvec{\Sigma },\psi ,\chi ,\lambda )= & {} \left( \frac{\chi +\eta }{\psi }\right) ^{\frac{\lambda }{2}-\frac{p}{4}} \frac{\left( \frac{\psi }{\chi }\right) ^{\frac{\lambda }{2}} K_{\lambda -\frac{p}{2}}(\sqrt{(\chi +\eta )\psi })}{(2\pi )^{\frac{p}{2}} |\varvec{\Sigma }|^{\frac{1}{2}} K_\lambda (\sqrt{\chi \psi })}. \end{aligned}$$

(39)

It is well known that the GH distribution has an identifiability issue in that the parameter vectors $\varvec{\theta }=(\varvec{\mu }, c\varvec{\Sigma }, c\psi , \chi /c, \lambda )$ and $\varvec{\theta }^*=(\varvec{\mu }, \varvec{\Sigma }, \psi , \chi , \lambda )$ both yield the same symmetric GH distribution (39) for any $c>0$. It is therefore not surprising that the CFUSH distribution also suffers from such an issue. To handle this, restrictions are imposed on some of the parameters of the CFUSH distribution. An example is the HTH distribution considered by Murray et al. (2017a), where the constraint $\psi =\chi =\omega $ is used, leading to the density

$$\begin{aligned}&f_{{\mathrm{HTH}}} ({\varvec{y}}; \varvec{\mu }, \varvec{\Sigma }, \varvec{\Delta }, \omega , \lambda ) \nonumber \\&\quad = 2^r h_p\left( {\varvec{y}}; \varvec{\mu }, \varvec{\Omega }, \omega , \omega , \lambda \right) H_r\left( \varvec{\Delta }^T\varvec{\Omega }^{-1}({\varvec{y}}-\varvec{\mu }) \left( \frac{\omega }{\omega +\eta }\right) ^{\frac{1}{4}}; \mathbf{0}, \varvec{\Lambda }, \lambda -\textstyle \frac{p}{2}, \gamma , \gamma \right) ,\nonumber \\ \end{aligned}$$

(40)

where $\gamma = \sqrt{\omega (\omega +\eta )}$. Note that in their terminology, they are using ‘hidden truncation’ to describe the latent skewing variable that follows a truncated distribution in the convolution-type characterization of the CFUSH distribution. An alternative is to restrict the parameters of $W$ so that, for example, $E(W)=1$. A commonly used constraint on the GH distribution is to set $|\varvec{\Sigma }|=1$. This can be applied to the CFUSH distribution to achieve identifiability; see also the unrestricted skew normal generalized hyperbolic (SUNGH) distribution considered by Maleki et al. (2019).
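The identifiability issue noted for (39) can be traced back to the scale-mixture representation. As a short sketch, write a symmetric GH vector as ${\varvec{Y}}= \varvec{\mu }+ \sqrt{W}\, {\varvec{Z}}$ with ${\varvec{Z}}\sim N_p(\mathbf{0}, \varvec{\Sigma })$ independent of $W$, where $W$ has the GIG density (38); the symbols ${\varvec{Z}}$ and $V$ are introduced here for illustration. For any $c>0$, let $V = W/c$. Then

$$\begin{aligned} f_V(v)&= c\, f_{{\mathrm{GIG}}}(cv; \psi , \chi , \lambda ) \propto v^{\lambda -1} e^{-\frac{(c\psi ) v + \frac{\chi /c}{v}}{2}}, \quad \hbox {so that the density of } V \hbox { is } f_{{\mathrm{GIG}}}(v; c\psi , \chi /c, \lambda ), \\ {\varvec{Y}}&= \varvec{\mu }+ \sqrt{W}\, {\varvec{Z}}= \varvec{\mu }+ \sqrt{V}\, (\sqrt{c}\, {\varvec{Z}}), \qquad \sqrt{c}\, {\varvec{Z}}\sim N_p(\mathbf{0}, c\varvec{\Sigma }). \end{aligned}$$

Hence $(\varvec{\mu }, c\varvec{\Sigma }, c\psi , \chi /c, \lambda )$ and $(\varvec{\mu }, \varvec{\Sigma }, \psi , \chi , \lambda )$ describe the same distribution. Under the constraint $\psi =\chi =\omega $, the rescaled parameters satisfy the same constraint only if $c\omega = \omega /c$, that is, $c=1$, which is why this restriction removes the ambiguity.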

A.4 The CFUST distribution

The CFUST distribution is the skew t-distribution member of the class of CFUSS distributions, where the symmetric distribution is taken to be a (multivariate) t-distribution. This can be obtained by letting $\frac{1}{W}$ be a random variable that has a $\hbox {gamma}(\frac{\nu }{2}, \frac{\nu }{2})$ distribution. Thus, its density is given by

$$\begin{aligned}&f_{{\mathrm{CFUST}}}({\varvec{y}}; \varvec{\mu }, \varvec{\Sigma }, \varvec{\Delta }, \nu ) \nonumber \\&\quad = 2^r t_p({\varvec{y}}; \varvec{\mu }, \varvec{\Omega }, \nu ) T_r\left( \varvec{\Delta }^T\varvec{\Omega }^{-1}({\varvec{y}}-\varvec{\mu }); \mathbf{0}, \left( \frac{\nu +\eta }{\nu +p}\right) \varvec{\Lambda }, \nu +p\right) . \end{aligned}$$

(41)

We shall adopt the notation ${\varvec{Y}}\sim \hbox {CFUST}_{p,r}(\varvec{\mu }, \varvec{\Sigma }, \varvec{\Delta }, \nu )$ if ${\varvec{Y}}$ has the density given by (41).

The CFUST distribution can be represented by a number of stochastic representations, including the convolution of a half t-random vector $|{\varvec{U}}|$ and a t-random vector ${\varvec{e}}$, given by

$$\begin{aligned} {\varvec{Y}}= & {} \varvec{\mu }+ \varvec{\Delta }|{\varvec{U}}| + {\varvec{e}}, \end{aligned}$$

(42)

where ${\varvec{U}}$ and ${\varvec{e}}$ have a joint t-distribution given by

$$\begin{aligned} \left[ \begin{array}{c} {\varvec{U}}\\ {\varvec{e}}\end{array}\right]\sim & {} t_{r+p} \left( \left[ \begin{array}{c} \mathbf{0}\\ \mathbf{0}\end{array}\right] , \left[ \begin{array}{cc} {\varvec{I}}_r &{} \quad \mathbf{0}\\ \mathbf{0}&{} \quad \varvec{\Sigma }\end{array}\right] , \nu \right) . \end{aligned}$$
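The representation (42) can be simulated by combining it with the scale-mixture form (36). The sketch below relies on the standard characterization of a multivariate t-vector as a normal vector multiplied by $\sqrt{W}$, with $1/W$ following a $\hbox {gamma}(\frac{\nu }{2}, \frac{\nu }{2})$ distribution, so that ${\varvec{U}}$ and ${\varvec{e}}$ share the same $W$. The function name `sample_cfust` and the Cholesky-factor input are assumptions made for this illustration, not part of the paper.

```python
import math
import random


def sample_cfust(mu, chol_sigma, delta, nu, rng):
    """Draw one vector Y = mu + Delta |U| + e with (U, e) jointly t, as in (42).

    mu         : list of length p (location vector)
    chol_sigma : p x p lower-triangular L with L L^T = Sigma (assumed input form)
    delta      : p x r skewness matrix, given as a list of p rows
    nu         : degrees of freedom (> 0)
    rng        : random.Random instance
    """
    p, r = len(mu), len(delta[0])
    # 1/W ~ gamma(shape nu/2, rate nu/2); gammavariate takes a SCALE, hence 2/nu
    w = 1.0 / rng.gammavariate(nu / 2.0, 2.0 / nu)
    s = math.sqrt(w)
    # The same sqrt(W) multiplies both U and e, which makes them jointly t
    abs_u = [s * abs(rng.gauss(0.0, 1.0)) for _ in range(r)]
    z = [rng.gauss(0.0, 1.0) for _ in range(p)]
    e = [s * sum(chol_sigma[a][b] * z[b] for b in range(a + 1)) for a in range(p)]
    return [mu[a] + sum(delta[a][c] * abs_u[c] for c in range(r)) + e[a]
            for a in range(p)]


if __name__ == "__main__":
    rng = random.Random(42)
    y = sample_cfust([0.0, 0.0], [[1.0, 0.0], [0.3, 1.0]],
                     [[1.0, 0.0], [0.0, 2.0]], 5.0, rng)
    print(y)
```

Drawing a separate $W$ for ${\varvec{U}}$ and for ${\varvec{e}}$ would give independent t-vectors rather than the joint t-distribution stated above, so sharing $W$ is the essential design choice here.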

From (42), we can obtain the mean and covariance matrix of ${\varvec{Y}}$, which are given by

$$\begin{aligned} E({\varvec{Y}})= & {} \varvec{\mu }+ a(\nu ) \varvec{\Delta }{\varvec{1}}_r \end{aligned}$$

and

$$\begin{aligned} \hbox {cov}({\varvec{Y}})= & {} \left( \frac{\nu }{\nu -2}\right) \left[ \varvec{\Sigma }+ \left( 1-\frac{2}{\pi }\right) \varvec{\Delta }\varvec{\Delta }^T\right] + \left[ \frac{2\nu }{\pi (\nu -2)} - a(\nu )^2\right] \varvec{\Delta }{\varvec{J}}_r \varvec{\Delta }^T, \end{aligned}$$

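where $a(\nu ) = \sqrt{\frac{\nu }{\pi }} \Gamma (\frac{\nu -1}{2}) \left[ \Gamma (\frac{\nu }{2})\right] ^{-1}$, ${\varvec{1}}_r$ denotes the $r\times 1$ vector of ones, and ${\varvec{J}}_r$ denotes the $r\times r$ matrix of ones. The mean is finite for $\nu > 1$ and the covariance matrix for $\nu > 2$.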
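The moment expressions can be evaluated with a few lines of code. The following is a minimal sketch in plain Python; the function names `a_nu` and `cfust_moments` are hypothetical. It takes $a(\nu ) = \sqrt{\nu /\pi }\, \Gamma (\frac{\nu -1}{2})/\Gamma (\frac{\nu }{2})$, which is the mean of the absolute value of a univariate $t_\nu $ variable, and a coefficient $\frac{2\nu }{\pi (\nu -2)} - a(\nu )^2$ on $\varvec{\Delta }{\varvec{J}}_r\varvec{\Delta }^T$, as obtained from $\hbox {cov}({\varvec{Y}}) = E({\varvec{Y}}{\varvec{Y}}^T) - E({\varvec{Y}})E({\varvec{Y}})^T$ under (42).

```python
import math


def a_nu(nu):
    """a(nu) = sqrt(nu/pi) * Gamma((nu-1)/2) / Gamma(nu/2), finite for nu > 1."""
    if nu <= 1.0:
        raise ValueError("a(nu) requires nu > 1")
    # Work on the log scale so that large nu does not overflow the gamma function
    log_ratio = math.lgamma((nu - 1.0) / 2.0) - math.lgamma(nu / 2.0)
    return math.sqrt(nu / math.pi) * math.exp(log_ratio)


def cfust_moments(mu, sigma, delta, nu):
    """Return (mean, cov) of a CFUST vector as nested lists; requires nu > 2.

    mu    : list of length p
    sigma : p x p scale matrix as a list of rows
    delta : p x r skewness matrix as a list of rows
    """
    if nu <= 2.0:
        raise ValueError("the covariance matrix requires nu > 2")
    p, r = len(mu), len(delta[0])
    a = a_nu(nu)
    row_sums = [sum(row) for row in delta]            # Delta 1_r
    mean = [mu[i] + a * row_sums[i] for i in range(p)]
    k = nu / (nu - 2.0)                               # E(W)
    c_j = 2.0 * nu / (math.pi * (nu - 2.0)) - a * a   # coefficient of Delta J_r Delta^T
    cov = [[k * (sigma[i][j] + (1.0 - 2.0 / math.pi)
                 * sum(delta[i][c] * delta[j][c] for c in range(r)))
            + c_j * row_sums[i] * row_sums[j]         # (Delta J_r Delta^T)_{ij}
            for j in range(p)] for i in range(p)]
    return mean, cov


if __name__ == "__main__":
    mean, cov = cfust_moments([0.0, 0.0], [[1.0, 0.2], [0.2, 1.0]],
                              [[1.0, 0.0], [0.5, -1.0]], 5.0)
    print(mean)
    print(cov)
```

In the univariate case $p=r=1$ the covariance reduces to $\frac{\nu }{\nu -2}(\sigma ^2+\delta ^2) - a(\nu )^2\delta ^2$, which offers a quick consistency check on the implementation.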

In addition to the CFUSN distribution (and its nested special/limiting cases), the CFUST distribution embeds a number of commonly used distributions as special or limiting cases. These include the unrestricted skew t-distribution of Sahu et al. (2003) (obtained by taking $\varvec{\Delta }$ to be a diagonal $p\times p$ matrix, and letting $\nu \rightarrow \infty $ for the skew normal case), the restricted skew t-distributions (obtained by setting $r=1$), and the t-distribution (obtained by setting $\varvec{\Delta }=\mathbf{0}$). Concerning the identifiability of the CFUST model, it can be observed from (42) that it bears a resemblance to the FA model (2). Indeed, it can be viewed as a FA model with latent factors following a half t-distribution and the skewness matrix acting as the factor loading matrix. However, unlike the FA model, the term $\varvec{\Delta }|{\varvec{U}}|$ in the CFUST distribution is not rotationally invariant. It is invariant to permutations of the columns of $\varvec{\Delta }$, but this does not affect the number of free parameters in the CFUST model.

Appendix B: Expressions for the E-step of the ECM algorithm for the CFUSTFA model

For the CFUSTFA model, the E-step of the ECM algorithm involves four conditional expectations that are analogous to those for mixtures of CFUST distributions. Technical details can be found in Lee and McLachlan (2016). The expressions for (19) to (22) are similar to those for (12), (13), (15), and (16), respectively, in Lee and McLachlan (2016). However, the scale matrices and skewness matrices in our case are given by $\varvec{\Sigma }_i^{*^{(k)}} = {\varvec{B}}_i^{(k)}{\varvec{B}}_i^{(k)^T}+{\varvec{D}}_i^{(k)}$ and $\varvec{\Delta }_i^{*^{(k)}} = {\varvec{B}}_i^{(k)} \varvec{\Delta }_i^{(k)}$ $(i=1, \ldots , g)$, respectively. Thus, the expressions for the conditional expectations (19) to (22) are given by

$$\begin{aligned} z_{ij}^{(k)}= & {} \frac{\pi _i^{(k)} f_{{\mathrm{CFUST}}_{p,r}} ({\varvec{y}}_j; \varvec{\mu }_i^{(k)}, \varvec{\Sigma }_i^{*^{(k)}}, \varvec{\Delta }_i^{*^{(k)}}, \nu _i^{(k)})}{\sum _{h=1}^g \pi _h^{(k)} f_{{\mathrm{CFUST}}_{p,r}} ({\varvec{y}}_j; \varvec{\mu }_h^{(k)}, \varvec{\Sigma }_h^{*^{(k)}}, \varvec{\Delta }_h^{*^{(k)}}, \nu _h^{(k)})}, \end{aligned}$$

(43)

$$\begin{aligned} w_{ij}^{(k)}= & {} \left( \frac{\nu _i^{(k)} + p}{\nu _i^{(k)} + d_{ij}^{(k)}}\right) \frac{T_r\left( {\varvec{c}}_{ij}^{(k)} \sqrt{\frac{\nu _i^{(k)}+p+2}{\nu _i^{(k)}+d_{ij}^{(k)}}}; \mathbf{0}, \varvec{\Lambda }_i^{(k)}, \nu _i^{(k)}+p+2\right) }{T_r\left( {\varvec{c}}_{ij}^{(k)} \sqrt{\frac{\nu _i^{(k)}+p}{\nu _i^{(k)}+d_{ij}^{(k)}}}; \mathbf{0}, \varvec{\Lambda }_i^{(k)}, \nu _i^{(k)}+p\right) }, \end{aligned}$$

(44)

$$\begin{aligned} {\varvec{e}}_{1ij}^{(k)}= & {} w_{ij}^{(k)} E\left[ {\varvec{a}}_{ij}^{(k)}\right] , \end{aligned}$$

(45)

$$\begin{aligned} {\varvec{e}}_{2ij}^{{(k)}}= & {} w_{ij}^{(k)} E\left[ {\varvec{a}}_{ij}^{(k)} {\varvec{a}}_{ij}^{(k)^T} \right] , \end{aligned}$$

(46)

where

$$\begin{aligned} d_{ij}^{(k)}= & {} ({\varvec{y}}_j - \varvec{\mu }_i^{(k)})^T \varvec{\Omega }_i^{(k)^{-1}} ({\varvec{y}}_j - \varvec{\mu }_i^{(k)}),\\ {\varvec{c}}_{ij}^{(k)}= & {} \varvec{\Delta }_i^{*^{(k)^T}} \varvec{\Omega }_i^{(k)^{-1}} ({\varvec{y}}_j-\varvec{\mu }_i^{(k)}),\\ \varvec{\Lambda }_i^{(k)}= & {} {\varvec{I}}_r - \varvec{\Delta }_i^{*^{(k)^T}} \varvec{\Omega }_i^{(k)^{-1}} \varvec{\Delta }_i^{*^{(k)}},\\ \varvec{\Omega }_i^{(k)}= & {} \varvec{\Sigma }_i^{*^{(k)}} + \varvec{\Delta }_i^{*^{(k)}} \varvec{\Delta }_i^{*^{(k)^T}}, \end{aligned}$$

and where ${\varvec{a}}_{ij}^{(k)}$ is an r-variate truncated t-random vector given by

$$\begin{aligned} {\varvec{a}}_{ij}^{(k)}\sim & {} Tt_r\left( {\varvec{c}}_{ij}^{(k)}, \left( \frac{\nu _i^{(k)} + d_{ij}^{(k)}}{\nu _i^{(k)}+p+2}\right) \varvec{\Lambda }_i^{(k)}, \nu _i^{(k)}+p+2; \mathbb {R}^+\right) . \end{aligned}$$

The expectations in (45) and (46) are the first and second moments of ${\varvec{a}}_{ij}^{(k)}$, respectively, and can be evaluated using formulae described in, for example, O’Hagan (1976), Ho et al. (2012), and the appendix of Lee and McLachlan (2014).
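The posterior probabilities in (43) are ratios of weighted component densities, and in high dimensions these densities can underflow when evaluated directly. One common numerical device, which is not described in the paper and is offered here only as a sketch, is to normalize on the log scale with the log-sum-exp trick. The function name `responsibilities` is hypothetical, and the component log-densities are assumed to have been computed elsewhere (evaluating the CFUST density requires a multivariate t cdf, which is not shown).

```python
import math


def responsibilities(log_pi, log_dens):
    """Evaluate (43) for one observation y_j from log-scale inputs.

    log_pi   : list of log mixing proportions, one per component i = 1..g
    log_dens : list of component log-densities at y_j (assumed precomputed)
    Returns the list of posterior probabilities z_ij, which sum to one.
    """
    terms = [lp + ld for lp, ld in zip(log_pi, log_dens)]
    m = max(terms)
    # Subtracting the maximum keeps every exponent <= 0, avoiding underflow to 0/0
    log_norm = m + math.log(sum(math.exp(t - m) for t in terms))
    return [math.exp(t - log_norm) for t in terms]


if __name__ == "__main__":
    # Two components whose densities are far too small to exponentiate directly
    z = responsibilities([math.log(0.5), math.log(0.5)], [-1000.0, -1001.0])
    print(z)
```

A direct evaluation of `exp(-1000)` returns zero in double precision, so the naive ratio would be undefined, whereas the log-scale version returns well-defined probabilities.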

About this article

Cite this article

Lee, S.X., Lin, TI. & McLachlan, G.J. Mixtures of factor analyzers with scale mixtures of fundamental skew normal distributions. Adv Data Anal Classif 15, 481–512 (2021). https://doi.org/10.1007/s11634-020-00420-9

Received: 25 October 2018
Revised: 17 March 2020
Accepted: 24 August 2020
Published: 02 September 2020
Issue Date: June 2021
DOI: https://doi.org/10.1007/s11634-020-00420-9

Mathematics Subject Classification

62H30
