Robust mixture modeling based on scale mixtures of skew-normal distributions
Introduction
Finite mixtures of distributions, that is, convex linear combinations of densities (known as the mixture components), have been widely used as a powerful tool to model heterogeneous data and to approximate complicated probability densities that exhibit multimodality, skewness and heavy tails. These models have been applied in several areas, such as genetics, image processing, medicine and economics. Comprehensive surveys are available in Böhning (2000), McLachlan and Peel (2000) and, from a Bayesian point of view, in Frühwirth-Schnatter (2006).
The literature on maximum likelihood estimation of the parameters of the normal and Student-t mixture models (hereafter the FM-NOR and FM-T models, respectively) is very extensive; see McLachlan and Peel (2000) and the references therein, Peel and McLachlan (2000), Nityasuddhi and Böhning (2003), Biernacki et al. (2003) and Dias and Wedel (2004), for example. The standard algorithm in this case is the EM (Expectation-Maximization) algorithm of Dempster et al. (1977), or an extension such as the ECM (Meng and Rubin, 1993) or ECME (Liu and Rubin, 1994) algorithm. For a good review, including applications in finite mixture models, see McLachlan and Krishnan (2008).
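To make the EM iteration concrete, here is a minimal sketch for the simplest case mentioned above, a univariate normal mixture (FM-NOR). It is not the algorithm developed in this paper; the function name `em_normal_mixture`, its arguments and the fixed iteration count are illustrative assumptions.

```python
import math

def normal_pdf(y, mu, sigma2):
    """Density of N(mu, sigma2) evaluated at y."""
    return math.exp(-0.5 * (y - mu) ** 2 / sigma2) / math.sqrt(2.0 * math.pi * sigma2)

def em_normal_mixture(data, p, mus, sigma2s, n_iter=200):
    """Run n_iter EM iterations for a univariate normal mixture.

    p, mus and sigma2s are starting values (one entry per component);
    they are copied, so the caller's lists are left untouched.
    """
    p, mus, sigma2s = list(p), list(mus), list(sigma2s)
    g, n = len(p), len(data)
    for _ in range(n_iter):
        # E-step: posterior probability that observation i came from component j.
        resp = []
        for y in data:
            w = [p[j] * normal_pdf(y, mus[j], sigma2s[j]) for j in range(g)]
            total = sum(w)
            resp.append([wj / total for wj in w])
        # M-step: weighted updates of mixing probabilities, means and variances.
        for j in range(g):
            nj = sum(r[j] for r in resp)
            p[j] = nj / n
            mus[j] = sum(r[j] * y for r, y in zip(resp, data)) / nj
            sigma2s[j] = sum(r[j] * (y - mus[j]) ** 2 for r, y in zip(resp, data)) / nj
    return p, mus, sigma2s
```

A practical implementation would also monitor the log-likelihood for convergence and guard against degenerate components whose variance collapses to zero; both are omitted here for brevity.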
It is well known that robustness to outliers can be achieved by modeling the data with the Student-t distribution. Finite mixtures of these distributions are useful when there is unobserved heterogeneity in addition to discrepant observations. Here, we suggest a class of models to deal with extra skewness, extending the work of Lin et al. (2007b) and Lin et al. (2007a), where finite mixtures of skew-normal (SN; Azzalini, 1985) and skew-Student-t (ST; Azzalini and Capitanio, 2003) distributions are investigated, respectively. The mixture component distributions are assumed to follow a flexible class of scale mixtures of skew-normal distributions (hereafter SMSN), presented by Branco and Dey (2001). This class contains the entire family of scale mixtures of normal distributions (Andrews and Mallows, 1974). In addition, the skew-normal and skewed versions of some other classical symmetric distributions are SMSN members: the skew-t, the skew-slash (SSL) and the skew contaminated normal (SCN), for example. These distributions have heavier tails than the skew-normal (and the normal) distribution, and thus they seem to be a reasonable choice for robust inference.
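One way to see how these families relate is to simulate from them. The sketch below assumes the usual stochastic representation Y = μ + U^(-1/2) Z, where Z is skew-normal with location 0 and U is a positive mixing variable independent of Z. The choices of U for each family (Gamma(ν/2, ν/2) for the skew-t, Beta(ν, 1) for the skew-slash, a two-point distribution for the skew contaminated normal) are the common parameterizations and are assumptions here, as are all function and argument names.

```python
import math
import random

def sample_skew_normal(lam, sigma2=1.0, rng=random):
    """Draw from SN(0, sigma2, lam) using sigma * (delta*|T0| + sqrt(1 - delta^2)*T1)."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    t0 = abs(rng.gauss(0.0, 1.0))
    t1 = rng.gauss(0.0, 1.0)
    return math.sqrt(sigma2) * (delta * t0 + math.sqrt(1.0 - delta * delta) * t1)

def sample_scale_factor(family, nu=None, gamma=None, rng=random):
    """Draw the mixing variable U for the chosen SMSN member (assumed parameterizations)."""
    if family == "skew-normal":
        return 1.0                                   # U is degenerate at 1
    if family == "skew-t":
        return rng.gammavariate(nu / 2.0, 2.0 / nu)  # shape nu/2, rate nu/2
    if family == "skew-slash":
        return rng.betavariate(nu, 1.0)
    if family == "skew-contaminated-normal":
        return gamma if rng.random() < nu else 1.0   # U = gamma with probability nu
    raise ValueError("unknown family: " + family)

def sample_smsn(mu, sigma2, lam, family="skew-normal", nu=None, gamma=None, rng=random):
    """Draw one observation Y = mu + U**(-1/2) * Z."""
    u = sample_scale_factor(family, nu=nu, gamma=gamma, rng=rng)
    return mu + sample_skew_normal(lam, sigma2, rng) / math.sqrt(u)
```

Setting `lam=0` recovers the corresponding symmetric scale mixture of normal distributions, which is why the class contains the Andrews and Mallows (1974) family.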
The remainder of the paper is organized as follows. In Section 2, for the sake of completeness, we present some properties of the univariate SMSN family and the related EM-type algorithm for maximum likelihood estimation. In Section 3 we propose a finite mixture of scale mixtures of skew-normal distributions (FM-SMSN) and an EM-type algorithm for maximum likelihood estimation. The associated observed information matrix is obtained analytically in Section 4. In Section 5 we present a simulation study to show that the proposed models are robust in terms of clustering heterogeneous data and that the maximum likelihood estimates based on the EM-type algorithm have good asymptotic properties. Additionally, we compare some model selection criteria via simulation. The proposed methodology is illustrated in Section 6 with the analysis of a real data set.
Section snippets
Preliminaries
First, we make some remarks about the class of scale mixtures of skew-normal distributions, as introduced by Branco and Dey (2001); see also Arellano-Valle et al. (2006).
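As defined by Azzalini (1985), a random variable $Z$ has skew-normal distribution with location parameter $\mu$, scale parameter $\sigma^2$ and skewness parameter $\lambda$, if its density is given by
$$f(z) = 2\,\phi(z; \mu, \sigma^2)\,\Phi\!\left(\frac{\lambda (z - \mu)}{\sigma}\right),$$
where $\phi(\cdot\,; \mu, \sigma^2)$ denotes the density of the univariate normal distribution with mean $\mu$ and variance $\sigma^2$ and $\Phi(\cdot)$ is the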
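The skew-normal density of Azzalini (1985) can be sketched in a few lines of code. The sketch assumes the parameterization 2 φ(z; μ, σ²) Φ(λ(z − μ)/σ), with Φ the standard normal distribution function; the function names are illustrative.

```python
import math

def normal_pdf(z, mu=0.0, sigma2=1.0):
    """Density of N(mu, sigma2) evaluated at z."""
    return math.exp(-0.5 * (z - mu) ** 2 / sigma2) / math.sqrt(2.0 * math.pi * sigma2)

def std_normal_cdf(x):
    """Distribution function of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def skew_normal_pdf(z, mu=0.0, sigma2=1.0, lam=0.0):
    """Skew-normal density with location mu, scale sigma2 and skewness lam."""
    sigma = math.sqrt(sigma2)
    return 2.0 * normal_pdf(z, mu, sigma2) * std_normal_cdf(lam * (z - mu) / sigma)
```

With `lam=0` the second factor equals 1/2 and the density reduces to the normal one; positive values of `lam` shift mass to the right of `mu`, negative values to the left.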
The model
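The finite mixture of SMSN distributions model (FM-SMSN) is defined by considering a random sample $y = (y_1, \ldots, y_n)^\top$ from a $g$-component mixture of SMSN densities given by
$$f(y_i; \Theta) = \sum_{j=1}^{g} p_j\, \psi(y_i; \theta_j), \qquad p_j \ge 0, \quad \sum_{j=1}^{g} p_j = 1,$$
where $\theta_j$ is the specific vector of parameters for the component $j$, $\psi(\cdot\,; \theta_j)$ is the SMSN density, $p_1, \ldots, p_g$ are the mixing probabilities and $\Theta$ is the vector with all parameters. Concerning the parameter of the mixing distribution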
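A mixture density of this form, a sum of component densities weighted by mixing probabilities, can be evaluated as sketched below. For brevity the sketch assumes skew-normal components (the simplest SMSN member) with parameters (μ, σ², λ); other SMSN densities could be passed in through the hypothetical `component_pdf` argument.

```python
import math

def skew_normal_pdf(y, mu, sigma2, lam):
    """Skew-normal density 2*phi(y; mu, sigma2)*Phi(lam*(y - mu)/sigma)."""
    sigma = math.sqrt(sigma2)
    phi = math.exp(-0.5 * (y - mu) ** 2 / sigma2) / math.sqrt(2.0 * math.pi * sigma2)
    Phi = 0.5 * (1.0 + math.erf(lam * (y - mu) / (sigma * math.sqrt(2.0))))
    return 2.0 * phi * Phi

def mixture_pdf(y, probs, thetas, component_pdf=skew_normal_pdf):
    """Evaluate sum_j probs[j] * component_pdf(y, *thetas[j])."""
    if min(probs) < 0.0 or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("mixing probabilities must be non-negative and sum to 1")
    return sum(p * component_pdf(y, *theta) for p, theta in zip(probs, thetas))

def log_likelihood(data, probs, thetas, component_pdf=skew_normal_pdf):
    """Log-likelihood of an independent sample under the mixture."""
    return sum(math.log(mixture_pdf(y, probs, thetas, component_pdf)) for y in data)
```

For example, `mixture_pdf(0.0, [0.4, 0.6], [(-2.0, 1.0, 3.0), (3.0, 4.0, -2.0)])` evaluates a two-component mixture at zero.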
The observed information matrix
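In this section we obtain the observed information matrix of the FM-SMSN model, defined as
$$J_o(\Theta \mid y) = -\frac{\partial^2 \ell(\Theta \mid y)}{\partial \Theta\, \partial \Theta^\top}.$$
It is well known that, under some regularity conditions, the covariance matrix of the maximum likelihood estimates can be approximated by the inverse of $J_o(\Theta \mid y)$. Following Basford et al. (1997) and Lin et al. (2007b), we evaluate
$$J_o(\hat{\Theta} \mid y) = \sum_{i=1}^{n} \hat{s}_i\, \hat{s}_i^\top,$$
where
$$\hat{s}_i = \frac{\partial}{\partial \Theta} \log f(y_i; \Theta) \Big|_{\Theta = \hat{\Theta}}.$$
We consider now the vector $\hat{s}_i$ which is partitioned into components corresponding to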
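The approximation used in Basford et al. (1997) replaces the Hessian by a sum of outer products of per-observation score vectors evaluated at the estimate. The paper derives these scores analytically; the sketch below swaps in central finite differences instead, and illustrates the idea on a plain normal model rather than on the FM-SMSN model. All names are illustrative.

```python
import math

def score(logpdf, theta, y, eps=1e-5):
    """Central-difference gradient of log f(y; theta) with respect to theta."""
    grad = []
    for k in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[k] += eps
        down[k] -= eps
        grad.append((logpdf(y, up) - logpdf(y, down)) / (2.0 * eps))
    return grad

def empirical_information(logpdf, theta_hat, data):
    """Sum over observations of the outer product of the score vector with itself."""
    d = len(theta_hat)
    info = [[0.0] * d for _ in range(d)]
    for y in data:
        s = score(logpdf, theta_hat, y)
        for a in range(d):
            for b in range(d):
                info[a][b] += s[a] * s[b]
    return info

def normal_logpdf(y, theta):
    """Log-density of N(mu, sigma2) with theta = (mu, sigma2); used as a toy model."""
    mu, sigma2 = theta
    return -0.5 * math.log(2.0 * math.pi * sigma2) - 0.5 * (y - mu) ** 2 / sigma2
```

Inverting the returned matrix gives approximate variances and covariances of the estimates; the square roots of the diagonal of the inverse serve as standard errors.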
Simulation study
To examine the performance of the proposed method, we present three simulation studies. The first shows that the underlying FM-SMSN models are robust in their ability to cluster heterogeneous data. The second shows that the estimates obtained from our proposed ECME algorithm have good asymptotic properties. In the third study we compare some model selection criteria.
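This snippet does not say which model selection criteria are compared. As an illustration only, the sketch below computes two widely used ones, AIC (Akaike, 1974) and BIC, from a maximized log-likelihood; the choice of criteria and the argument names are assumptions, not the paper's setup.

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion: -2*loglik + 2*n_params (smaller is better)."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: -2*loglik + n_params*log(n_obs)."""
    return -2.0 * loglik + n_params * math.log(n_obs)
```

Both criteria penalize the maximized log-likelihood by the number of free parameters, so a richer mixture is preferred only when its fit improves enough to offset the penalty.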
Application — The BMI data
As an application of the methodology proposed in this work, we consider the body mass index for men aged between 18 and 80 years. The data set comes from the National Health and Nutrition Examination Survey, conducted by the National Center for Health Statistics (NCHS) of the Centers for Disease Control and Prevention (CDC) in the USA. The problem of obesity has attracted attention in the last few years due to its strong relationship with many chronic diseases. Body mass index (BMI, kg/m²) has become the standard
Final conclusion
In this work we have proposed a robust approach to finite mixture modeling based on scale mixtures of skew-normal distributions. Our proposed model generalizes the recent work of Lin et al. (2007a, 2007b). This generalized robust model simultaneously accommodates multimodality, asymmetry and heavy tails, thus allowing practitioners from different areas to analyze data in an extremely flexible way. An ECME algorithm is developed by exploring the statistical properties of the class
Acknowledgments
The authors would like to thank the Associate Editor and two anonymous referees for their useful comments which substantially improved the quality of this paper. The second author acknowledges the partial financial support from Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) and CNPq-Brazil. The third author acknowledges the partial financial support from CNPq and CAPES-Brazil.
References (36)
- et al. Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Computational Statistics & Data Analysis (2003)
- et al. A general class of multivariate skew-elliptical distributions. Journal of Multivariate Analysis (2001)
- Maximum likelihood estimation for multivariate skew normal mixture models. Journal of Multivariate Analysis (2009)
- et al. Asymptotic properties of the EM algorithm estimate for normal mixture models with component specific variances. Computational Statistics & Data Analysis (2003)
- et al. The multivariate skew-slash distribution. Journal of Statistical Planning and Inference (2006)
- A new look at the statistical model identification. IEEE Transactions on Automatic Control (1974)
- et al. Scale mixtures of normal distributions. Journal of the Royal Statistical Society, Series B (1974)
- et al. A unified view on skewed distributions arising from selections. Canadian Journal of Statistics (2006)
- et al. The nontruncated marginal of a truncated bivariate normal distribution. Psychometrika (1993)
- A class of distributions which includes the normal ones. Scandinavian Journal of Statistics (1985)
- The skew-normal distribution and related multivariate families. Scandinavian Journal of Statistics
- Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. Journal of the Royal Statistical Society, Series B
- On rates of convergence of efficient detection criteria in signal processing with white noise. IEEE Transactions on Information Theory
- Standard errors of fitted component means of normal mixtures. Computational Statistics
- Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Transactions on Pattern Analysis and Machine Intelligence
- Computer-Assisted Analysis of Mixtures and Applications
- Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B
- An empirical comparison of EM, SEM and MCMC performance for problematic Gaussian mixture likelihoods. Statistics and Computing