Robust estimation in multivariate heteroscedastic regression models with autoregressive covariance structures using EM algorithm

https://doi.org/10.1016/j.jmva.2022.105026

Abstract

In the analysis of repeated or clustered measurements, it is crucial to determine the dynamics that affect the mean, variance, and correlations of the data, which is possible with appropriate models. One such model is the joint mean–covariance model, a multivariate heteroscedastic regression model with autoregressive covariance structures. In these models, parameter estimation is usually carried out under the normality assumption, but the resulting estimators are very sensitive to outliers or non-normality of the data. In this study, we propose a robust alternative method and an EM algorithm for estimating the parameters of joint mean–covariance models. Robustification is achieved using a multivariate heavy-tailed distribution with the same number of parameters as the multivariate normal distribution. To simplify the estimation procedure, a modified Cholesky decomposition is adopted to factorize the dependence structure in terms of unconstrained autoregressive and scale innovation parameters. A technique for the prediction of future responses is also given. An intensive simulation study and a real data example are provided to demonstrate the performance of the proposed method.

Introduction

Repeated measures are prevalent in various fields, from biology to economics. The correlation between these measures is a function of the time difference or the spatial distance. Most experimental research includes study designs in which the same variable is repeatedly measured or observed over time for each subject in a balanced or unbalanced structure. The key feature of such data (longitudinal data), or more generally of repeated measures, is that responses within a subject are very likely to be correlated, even though responses between subjects may be independent. Since misidentification of the covariance structure may lead to a significant loss of efficiency of the mean parameter estimators [45], taking into account the within-subject correlation is essential for modeling such data sets. Moreover, if these data sets contain missing values or outliers, or are not normally distributed, the mean parameter estimators may be biased because of misspecification of the covariance structure [7]. When the within-subject correlation depends on time or location, the classical time-series models (e.g., AR, MA, or exchangeable structures) may fail to represent the dynamic dependence structure flexibly [47]. Furthermore, since modeling covariance structures is of direct interest in areas such as the social sciences, finance, economics, and geology [8], joint mean–covariance models are commonly used for dynamic covariance modeling (see [6], [32], [37], [38], [39], [40], [49], [50], [51]). In multivariate data analysis, joint mean–covariance models can be considered multivariate regression models in which the covariance matrix has a special structure. In particular, the mean vector is parameterized as a linear model, and the covariance matrix is formed to represent the heteroscedastic and autoregressive structure of the measurements within the same subject.
It should be noted that the aforementioned model type can also be used for time and location-dependent data sets.

Positive-definiteness is a challenging constraint to enforce when modeling the covariance matrix. To handle this difficulty, Pourahmadi [37], [38] proposed a data-driven method based on the modified Cholesky decomposition that guarantees positive-definiteness of the estimated covariance at no additional computational cost. The advantages of such a method include a detailed statistical interpretation of the parameters and the assurance of positive-definiteness of the covariance structure. Pourahmadi [37], [38] considered GLMs for the components of the modified Cholesky decomposition of the covariance matrix. Several authors have considered joint mean and covariance models based on the modified Cholesky decomposition or its extensions [18], [32], [33]. These approaches rest on the normality assumption, which is often unrealistic, so they are sensitive to outliers, contamination, and heavy-tailed distributions. Since an outlier at the subject level may cause a set of outlying observations in the sample, extra effort should be devoted to combating outliers. Therefore, robust approaches should be considered for analyzing data sets that may contain outliers or are otherwise unsuitable for modeling under the normality assumption.
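The modified Cholesky idea can be sketched numerically. In the decomposition $T\Sigma T^{T}=D$, $T$ is unit lower-triangular with the negatives of the autoregressive parameters $\phi_{jk}$ below the diagonal, $D$ collects the innovation variances, and the rows of $T$ come from regressing each measurement on its predecessors. A minimal Python sketch (function and variable names are ours, not the paper's):

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) with T @ sigma @ T.T = D, T unit lower-triangular."""
    n = sigma.shape[0]
    T = np.eye(n)
    d = np.empty(n)
    d[0] = sigma[0, 0]
    for j in range(1, n):
        # phi solves Sigma_{1:j,1:j} phi = Sigma_{1:j,j}: the regression of
        # measurement j on its predecessors 1..j-1
        phi = np.linalg.solve(sigma[:j, :j], sigma[:j, j])
        T[j, :j] = -phi
        d[j] = sigma[j, j] - sigma[:j, j] @ phi  # innovation variance
    return T, np.diag(d)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
sigma = A @ A.T + 4 * np.eye(4)          # an arbitrary positive-definite covariance
T, D = modified_cholesky(sigma)
assert np.allclose(T @ sigma @ T.T, D)   # off-diagonals vanish
# Reconstruction Sigma = T^{-1} D T^{-T} is automatically positive-definite
# for any unconstrained phi's and positive d's, which is the point.
assert np.allclose(np.linalg.solve(T, D) @ np.linalg.inv(T).T, sigma)
```

Because any real $\phi_{jk}$ and positive innovation variances reconstruct a valid covariance, the components can be modeled by unconstrained regressions.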

Concerning robustness in this context, there are several approaches. For example, Fan et al. [11] introduced a generalized estimating equation method based on the bounded scoring of the Huber function. Croux et al. [5] developed a statistical procedure for robustifying the mean and covariance, setting up estimating equations for both the mean and the dispersion parameter. Zheng et al. [52], [53] proposed robust generalized estimating equations for the joint mean–covariance model, employing Huber's function and leverage-based weights for the mean and covariance to achieve robustness against outliers. Wang et al. [46] then proposed a robust variable selection approach by adopting the exponential squared loss function with a tuning parameter. Their approach not only achieves good robustness against outliers in the data set but is also asymptotically as efficient as least-squares estimation under normal errors without outliers, given a suitably selected tuning parameter.

Maadooliat et al. [29] proposed a doubly robust procedure for modeling the correlation matrix of a longitudinal data set. Qin and Zhu [41] further developed a robust method for the simultaneous estimation of mean and covariance under the assumptions that the random errors follow an elliptical distribution and that all subjects share the same within-subject covariance matrix, which does not depend on covariates. Lv et al. [28] extended the bounded exponential score function to longitudinal data analysis and introduced efficient and robust generalized estimating equations. Lv et al. [27] proposed a new adaptive robust procedure for bivariate longitudinal data utilizing Huber's score function. Recently, Xu et al. [47] proposed robust estimation of mean–covariance models using the maximum Lq-likelihood estimation method: they retained the mean–covariance model under the normal distribution but replaced ordinary maximum likelihood with maximum Lq-likelihood estimation to obtain robust estimators of the parameters of interest. Most studies in this context are concerned with suggesting robust parameter estimation methods, with the exception of the paper of Lin and Wang [25], who proposed a multivariate t-regression model, useful when the data set has heavier tails than the normal distribution, with its mean and scale covariance modeled jointly for longitudinal data analysis. However, this procedure is computationally intensive, since in the t-distribution case the degrees-of-freedom parameter must be estimated along with the other parameters.

Similar to the t-distribution, the multivariate Laplace distribution is a heavy-tailed alternative to the multivariate normal distribution. However, the Laplace distribution has an advantage over the multivariate t-distribution concerning the number of parameters to be estimated: despite being heavy-tailed, it has the same number of parameters as the multivariate normal distribution. This feature makes the Laplace distribution very convenient computationally. Moreover, because the multivariate Laplace distribution is heavy-tailed, unlike the multivariate normal distribution, it yields estimators that are robust against outliers and the heavy-tailedness of data sets. In recent years, this property has received considerable attention, and statistical modeling based on the Laplace distribution and its extensions has developed rapidly in theory and applications [2], [9], [14], [19], [21], [22], [23], [26], [42], [48]. See Kotz et al. [21] for details of the multivariate Laplace distribution's properties, generalizations, and applications.

In this paper, inspired by the robustness property of the Laplace distribution, we propose a robust approach to the analysis of repeated measurements by extending Pourahmadi's joint mean and covariance parameterization to general linear models with a Laplace-distributed error term, which we call the Laplace joint modeling model (LJMM). Our method's advantage over other proposed robust alternatives (such as [25]) is that the Laplace distribution is a heavy-tailed alternative to the multivariate normal distribution with fewer parameters to estimate than the t-distribution: besides being robust, it requires estimating the same number of parameters as the normal case.

The remainder of the present paper is structured as follows. In Section 2, we define the LJMM. We present three distinct representations of the log-likelihood function of LJMM and the maximum likelihood (ML) estimates of the parameters in Section 3. A Fisher scoring algorithm and Expectation–Maximization (EM) algorithm for the implementation of ML estimation are described in Section 4. Prediction of future values is discussed in Section 5. Some simulation studies are reported in Section 6. Section 7 includes a data example. Finally, some concluding remarks are briefly summarized in Section 8.

Section snippets

Multivariate Laplace distribution

There are several extensions of the univariate Laplace distribution to higher dimensions, all called the multivariate Laplace distribution. Ulrich and Chen [44] introduced the bivariate case of the Laplace distribution, and various higher-dimensional forms have been discussed in the literature. Anderson [1] introduced a multivariate Laplace distribution as a special case of the multivariate Linnik distribution. Ernst [10] and Fernandez et al. [13] presented the multivariate Laplace distribution as a particular multivariate
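For concreteness, the multivariate Laplace density used in this line of work has the form $f(y)=c_n\,|\Sigma|^{-1/2}\exp\{-\tfrac12[(y-\mu)^{T}\Sigma^{-1}(y-\mu)]^{1/2}\}$ with $c_n=\Gamma(n/2)/\{2^{n+1}\pi^{n/2}\Gamma(n)\}$ (see the likelihood in the next section). The following illustrative Python sketch (our names, not code from the paper) evaluates its log-density and sanity-checks the normalizing constant in one dimension, where the density reduces to $\tfrac14 e^{-|y|/2}$:

```python
import numpy as np
from math import gamma, log, pi

def laplace_logpdf(y, mu, sigma):
    """Log-density of the n-variate Laplace with exponent -(1/2)*sqrt(quad form)."""
    n = len(y)
    log_cn = log(gamma(n / 2)) - (n + 1) * log(2) - (n / 2) * log(pi) - log(gamma(n))
    diff = y - mu
    quad = diff @ np.linalg.solve(sigma, diff)   # (y-mu)' Sigma^{-1} (y-mu)
    _, logdet = np.linalg.slogdet(sigma)
    return log_cn - 0.5 * logdet - 0.5 * np.sqrt(quad)

# 1-D sanity check: c_1 = 1/4, so f(y) = (1/4) exp(-|y|/2) integrates to 1.
ys = np.linspace(-60.0, 60.0, 24001)
vals = np.array([np.exp(laplace_logpdf(np.array([t]), np.zeros(1), np.eye(1)))
                 for t in ys])
h = ys[1] - ys[0]
integral = h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))   # trapezoidal rule
assert abs(integral - 1.0) < 1e-4
```

The $e^{-r/2}$ tail (versus the normal's $e^{-r^2/2}$) is what produces the heavier tails and, ultimately, the downweighting of outlying subjects.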

ML estimation for the parameters

Assume that we have independent observations on $m$ subjects, $Y_1,\dots,Y_m$, from model (7). Then the likelihood and log-likelihood functions are
$$L(\beta,\gamma,\lambda)=\prod_{i=1}^{m} c_i\,|\Sigma_i|^{-1/2}\exp\left\{-\frac{1}{2}\left[(Y_i-\mu_i)^{T}\Sigma_i^{-1}(Y_i-\mu_i)\right]^{1/2}\right\},$$
$$\ell(\beta,\gamma,\lambda)=\sum_{i=1}^{m}\ln c_i-\frac{1}{2}\sum_{i=1}^{m}\ln|\Sigma_i|-\frac{1}{2}\sum_{i=1}^{m}\left[(Y_i-\mu_i)^{T}\Sigma_i^{-1}(Y_i-\mu_i)\right]^{1/2},$$
where $c_i=\Gamma(n_i/2)\big/\left\{2^{n_i+1}\pi^{n_i/2}\Gamma(n_i)\right\}$. For the sake of simplicity, let $r_i=Y_i-X_i\beta=(r_{ij})_{j=1}^{n_i}$ be the vector of residuals, let $\hat{r}_i=(\hat{r}_{ij})_{j=1}^{n_i}$ with $\hat{r}_{ij}=\sum_{k=1}^{j-1}r_{ik}z_{jk}^{T}\gamma$ be the predictor of $r_i$, and let $n=\sum_{i=1}^{m}n_i$ be the total number of observations. Using the result $L_i r_i=r_i-\hat{r}_i$,
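The computational payoff of the modified Cholesky factorization is that the determinant and quadratic form above reduce to sums over innovations: $\ln|\Sigma_i|=\sum_j\ln\sigma_j^2$ and $(Y_i-\mu_i)^{T}\Sigma_i^{-1}(Y_i-\mu_i)=\sum_j(r_{ij}-\hat{r}_{ij})^2/\sigma_j^2$. A hedged sketch for one subject, with illustrative design structures `X`, `W`, `Z` (our notation, not necessarily the paper's), cross-checked against the direct matrix formula:

```python
import numpy as np
from math import lgamma, log, pi

def subject_loglik(y, X, W, Z, beta, lam, gam):
    """Laplace log-likelihood contribution of one subject via the factorization."""
    n = len(y)
    log_c = lgamma(n / 2) - (n + 1) * log(2) - (n / 2) * log(pi) - lgamma(n)
    r = y - X @ beta                 # residual vector r_i
    sig2 = np.exp(W @ lam)           # innovation variances: ln sigma_j^2 = w_j' lambda
    quad, logdet = 0.0, np.sum(np.log(sig2))
    for j in range(n):
        # autoregressive predictor rhat_ij = sum_{k<j} phi_jk r_ik, phi_jk = z_jk' gamma
        rhat = sum((Z[j][k] @ gam) * r[k] for k in range(j))
        quad += (r[j] - rhat) ** 2 / sig2[j]
    return log_c - 0.5 * logdet - 0.5 * np.sqrt(quad)

# Cross-check against the direct formula with Sigma_i = T^{-1} D T^{-T}.
rng = np.random.default_rng(1)
n, p, q, d = 4, 2, 2, 1
X = rng.standard_normal((n, p))
W = rng.standard_normal((n, q))
Z = [[rng.standard_normal(d) for _ in range(j)] for j in range(n)]
beta, lam, gam = np.array([1.0, -0.5]), np.array([0.2, 0.1]), np.array([0.3])
y = X @ beta + rng.standard_normal(n)
ll = subject_loglik(y, X, W, Z, beta, lam, gam)

T = np.eye(n)
for j in range(n):
    for k in range(j):
        T[j, k] = -(Z[j][k] @ gam)
D = np.diag(np.exp(W @ lam))
Sigma = np.linalg.inv(T) @ D @ np.linalg.inv(T).T
r = y - X @ beta
quad = r @ np.linalg.solve(Sigma, r)
log_c = lgamma(n / 2) - (n + 1) * log(2) - (n / 2) * log(pi) - lgamma(n)
ll_direct = log_c - 0.5 * np.log(np.linalg.det(Sigma)) - 0.5 * np.sqrt(quad)
assert np.isclose(ll, ll_direct)
```

No matrix inversion is needed in `subject_loglik`; everything is a sum over scalar innovations.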

Computational aspects of parameter estimation

As mentioned in Section 3, the likelihood equations cannot be solved explicitly. Therefore, we have to use numerical algorithms to compute the ML estimates. One option is the Newton–Raphson (or Fisher scoring) algorithm used by [25]; we can likewise use the Newton–Raphson algorithm for the Laplace case, as defined in this section. However, it is well known that the Newton–Raphson algorithm may be computationally intensive because of the computation of the
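To illustrate where the robustness comes from, consider a schematic device (not the paper's exact EM update): majorizing the Laplace log-likelihood term $-\tfrac12\sqrt{\delta_i}$, with $\delta_i=(Y_i-X_i\beta)^{T}\Sigma_i^{-1}(Y_i-X_i\beta)$, yields a weighted generalized least-squares fixed point for $\beta$ with weights $w_i=1/(2\sqrt{\delta_i})$, so outlying subjects are automatically downweighted:

```python
import numpy as np

def irls_beta(ys, Xs, sigma, n_iter=50):
    """Schematic IRLS for beta under the Laplace loss, known common scale matrix."""
    p = Xs[0].shape[1]
    beta = np.zeros(p)
    siginv = np.linalg.inv(sigma)
    ws = []
    for _ in range(n_iter):
        A, b, ws = np.zeros((p, p)), np.zeros(p), []
        for y, X in zip(ys, Xs):
            r = y - X @ beta
            delta = max(r @ siginv @ r, 1e-10)   # guard against division by zero
            w = 1.0 / (2.0 * np.sqrt(delta))     # Laplace weight: small for outliers
            ws.append(w)
            A += w * X.T @ siginv @ X
            b += w * X.T @ siginv @ y
        beta = np.linalg.solve(A, b)             # weighted GLS update
    return beta, np.array(ws)

rng = np.random.default_rng(2)
sigma = np.eye(3)
Xs = [rng.standard_normal((3, 2)) for _ in range(20)]
true_beta = np.array([1.0, 2.0])
ys = [X @ true_beta + 0.1 * rng.standard_normal(3) for X in Xs]
ys[0] = ys[0] + 50.0                             # one grossly outlying subject
beta, ws = irls_beta(ys, Xs, sigma)
assert np.argmin(ws) == 0                        # the outlier gets the smallest weight
assert np.allclose(beta, true_beta, atol=0.2)    # estimate is barely perturbed
```

Under normality the analogous weights are constant, which is why the Gaussian fit, unlike this one, is dragged toward the outlying subject.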

Prediction

In this section, following [25], the technique for the prediction of future responses under this model is considered. Let $\tilde{y}_i$ be a $g\times 1$ vector of future observations of $Y_i$, let $\tilde{x}_i$ be a $g\times p$ matrix of prediction regressors corresponding to $\tilde{y}_i$ with $E(\tilde{y}_i)=\tilde{x}_i\beta$, and let $\tilde{w}_j$ and $\tilde{z}_{jk}$ denote a $q\times 1$ and a $d\times 1$ covariate vector associated with $\tilde{y}_i$, with $\ln\sigma_j^{2}=\tilde{w}_j^{T}\lambda$ and $\phi_{jk}=\tilde{z}_{jk}^{T}\gamma$ for $j\in\{n_i+1,\dots,n_i+g\}$ and $k\in\{1,\dots,j-1\}$, respectively. Moreover, we assume that the joint distribution of $(Y_i^{T},\tilde{y}_i^{T})^{T}$ follows an $(n_i+g)$-variate Laplace
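A hedged sketch of the resulting point predictor (illustrative names, not the paper's exact derivation): for elliptically contoured families such as this Laplace, the predictor of the future block takes the familiar conditional-mean form $\tilde{x}_i\beta + S_{21}S_{11}^{-1}(Y_i-X_i\beta)$, where $S_{11}$ and $S_{21}$ are blocks of the joint $(n_i+g)$-variate scale matrix:

```python
import numpy as np

def predict_future(y, X, x_new, beta, sigma_joint, n_obs):
    """Conditional-mean predictor of the future block given the observed block."""
    s11 = sigma_joint[:n_obs, :n_obs]    # scale matrix of the observed block
    s21 = sigma_joint[n_obs:, :n_obs]    # cross block: future vs. observed
    return x_new @ beta + s21 @ np.linalg.solve(s11, y - X @ beta)

# Tiny check: with zero cross-covariance the predictor reduces to the
# marginal mean x_tilde @ beta.
beta = np.array([2.0, -1.0])
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = X @ beta + np.array([0.5, -0.3, 0.1])
x_new = np.array([[1.0, 3.0]])
sigma_joint = np.eye(4)                  # block-diagonal: no borrowing of strength
pred = predict_future(y, X, x_new, beta, sigma_joint, n_obs=3)
assert np.allclose(pred, x_new @ beta)
```

With a nontrivial autoregressive structure, $S_{21}S_{11}^{-1}$ shrinks the prediction toward the subject's own residual history.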

Simulation study

In the simulation study, our objective is to compare the proposed model (LJMM) with the NJMM and TJMMs in terms of modeling capacity, and to assess the predictive accuracy of the LJMM relative to the NJMM. Consistent with the literature, we develop the theoretical part of the paper assuming a different ni for each subject, but estimation and computation simplify considerably when the ni are equal. Therefore, we consider only balanced models in the simulation and real data analyses. We use the modified EM algorithm

Data example

This section uses the orthodontic growth curve data originally analyzed by [36]. It is an open data set available in the nlme package in R [34]. The data record the distances (mm) from the pituitary to the pterygomaxillary fissure, measured repeatedly at ages 8, 10, 12, and 14 years for 11 females and 16 males. We want to compare the performances of the NJMM and our proposed LJMM on this data set. See the lattice graph showing the growth curves of each subject in Fig. 1. The difference in

Conclusion

We have proposed a robust approach to joint modeling of mean and scale covariance using multivariate Laplace distribution, a very flexible distributional assumption, and adopted the modified Cholesky decomposition to factorize the dependence structure in terms of unconstrained autoregressive and scale innovation parameters. Our approach can be regarded as a robust extension of the normal joint mean and covariance model in which the modified Cholesky decomposition proposed by Pourahmadi [37] is

CRediT authorship contribution statement

Yesim Guney: Conceptualization, Methodology, Software, Writing – original draft. Olcay Arslan: Supervision, Validation, Writing – original draft, Editing. Fulya Gokalp Yavuz: Data curation, Software, Investigation, Writing – original draft, Editing.

Acknowledgments

We thank the Editor-in-Chief, and the two anonymous referees for their thorough reading of the former version of the paper. Their helpful comments and numerous suggestions led to a considerable improvement of the manuscript. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References (53)

  • Rossi, E., et al., Model and distribution uncertainty in multivariate GARCH estimation: a Monte Carlo analysis, Comput. Statist. Data Anal. (2010)
  • Xu, L., et al., Robust maximum Lq-likelihood estimation of joint mean–covariance models for longitudinal data, J. Multivariate Anal. (2019)
  • Aroudi, A., et al., Speech signal modeling using multivariate distributions, EURASIP J. Audio Speech Music Process. (2015)
  • Arslan, O., An alternative multivariate skew Laplace distribution: properties and estimation, Stat. Pap. (2010)
  • Croux, C., et al., Robust estimation of mean and dispersion functions in extended generalized additive models, Biometrics (2012)
  • Daniels, M., et al., Modelling the random effects covariance matrix in longitudinal data, Stat. Med. (2003)
  • Diggle, P.T., et al., Nonparametric estimation of covariance structure in longitudinal data, Biometrics (1998)
  • Eltoft, T., et al., On the multivariate Laplace distribution, IEEE Sign. Process. Lett. (2006)
  • Ernst, M.D., A multivariate generalized Laplace distribution, Comput. Statist. (2006)
  • Fang, K.T., et al., Symmetric Multivariate and Related Distributions (1990)
  • Fernandez, C., et al., Modeling and inference with v-distributions, J. Amer. Statist. Assoc. (1995)
  • Gómez, E., et al., A multivariate generalization of the power exponential family of distributions, Commun. Stat. - Theory Methods (1998)
  • Gómez-Sánchez-Manzano, E., et al., Sequences of elliptical distributions and mixtures of normal distributions, J. Multivariate Anal. (1998)
  • Gómez-Sánchez-Manzano, E., et al., Multivariate exponential power distributions as mixtures of normal distributions with Bayesian applications, Commun. Stat. - Theory Methods (2008)
  • Huang, J.Z., et al., Covariance matrix selection and estimation via penalised normal likelihood, Biometrika (2006)
  • Khodabina, M., et al., Some properties of generalized gamma distribution, Math. Sci. (2010)