A Bayesian model for longitudinal circular data based on the projected normal distribution

https://doi.org/10.1016/j.csda.2012.07.025

Abstract

The analysis of short longitudinal series of circular data can be problematic, and methods for it have not been fully developed. A Bayesian analysis of a new model for such data is presented. The model is based on a radial projection onto the circle of a particular bivariate normal distribution. Inference about the parameters of the model is based on samples from the corresponding joint posterior density, which are obtained using a Metropolis-within-Gibbs scheme after the introduction of suitable latent variables. The procedure is illustrated using both simulated data sets and a real data set previously analyzed in the literature.

Introduction

Several approaches have been proposed to analyze longitudinal data. See, for example, Diggle et al. (2002), Fitzmaurice et al. (2004), Gelman and Hill (2007), and Hedeker and Gibbons (2006). All these books discuss models for longitudinal ‘scalar’ (i.e. linear) responses. In contrast, methodological proposals to describe relationships within repeated measurements of directional data are rather limited. This may be due to the difficulties in working with probability distributions commonly associated with directional data and to the intrinsic dependency inherent to longitudinal structures.

Circular data are a particular case of directional data. Specifically, circular data represent directions in two dimensions. For a survey of this and related topics, we refer the reader to Fisher (1993), Fisher et al. (1987), Jammalamadaka and SenGupta (2001) and Mardia and Jupp (2000). More recent contributions include Abe and Pewsey (2011), Oliveira et al. (2012), Pewsey (2008) and Qin et al. (2011). See also Arnold and SenGupta (2006) for an overview of applications of circular data analysis in ecological and environmental sciences. In many of these applications, the observations on the variable of interest are longitudinal in nature. For example, in studies concerning the orientation mechanism of birds, it is relevant to analyze the angular differences between a bird’s position at consecutive times after release (Artes and Jørgensen, 2000; Artes et al., 2000). In motion studies of small animals, the aim is often to describe the effect of covariates on the directional behavior of those animals (D’Elia, 2001; D’Elia et al., 2001). In the analysis of cell-cycle gene expression data, a problem of interest is to estimate the phase angles using information regarding the order among them (Rueda et al., 2009). All of these situations illustrate the need for models that allow us to analyze longitudinal structures where the response variable is circular.

As pointed out above, from a methodological point of view there does not seem to be a general framework for the analysis of longitudinal directional data. Longitudinal data where the response variable is circular have been analyzed using quasi-likelihood methods, such as the generalized estimating equations (GEE) originally proposed by Liang and Zeger (1986) to analyze linear data. Specifically, Artes et al. (2000) derive estimating equations for the parameters of a family of circular distributions and obtain asymptotic inference for the parameters of a mixed-effects model. In turn, Artes and Jørgensen (2000) extend GEE methods to deal with Jørgensen’s dispersion models (Jørgensen, 1997a, 1997b) and employ this approach to model longitudinal circular data. They also present a simulation study for a simple model which only involves the mean direction and a single covariate. They note that in some situations their proposal may have trouble converging, and point out that their method requires either high correlations between the longitudinal observations or large samples in order to achieve satisfactory performance. D’Elia et al. (2001), for their part, propose a generalized linear model to study the directional behavior of sandhoppers under natural conditions in repeated trials. In the same vein, D’Elia (2001) assumes a variance components model to describe the orientation mechanism and uses a simulated maximum likelihood approach. She points out that the use of this approach may raise several problems. Recently, Song (2007) has used a generalized linear model approach where the random component belongs to the family of dispersion models. He suggests using penalized pseudo-likelihood and restricted maximum likelihood estimation to bypass the analytical difficulties arising from the nonlinearity of the corresponding score functions. Nevertheless, in some cases it is not possible to make inferences about all the parameters involved in the proposed models.

We feel that most of the procedures currently available for analyzing longitudinal data with circular responses have limitations that make some of the required inferences infeasible in general settings. These limitations include difficulties with fitting, model comparison and prediction, as well as convergence problems in some of the iterative methods employed.

Song (2007) argues that different approaches may be required to analyze series of repeated measurements, depending on the length of the series. In the case of (several) short time series, modeling typically focuses on the relationship between the response variable and the corresponding covariates. In such situations the correlations are treated as nuisance parameters, as opposed to the case of a long time series where the correlations are usually modeled explicitly via a stochastic process.

In this paper, we introduce a new model to describe short series of longitudinal data where the response variable is circular. The model considers linear covariates and is based on a version of the projected bivariate normal distribution. In our proposal, each of the components of the model is specified by a mixed-effects linear model. In addition, we present a Bayesian analysis that allows us to make inference about any of the parameters of interest.

The paper is organized as follows. In the next section, we introduce the projected circular longitudinal model (henceforth called the PCL model) and describe some of its properties, including the longitudinal structures that can be obtained from it. In Section 3, we discuss a Bayesian analysis of the model and derive all the full conditionals needed for a Gibbs sampler. In Section 4, we present some illustrative examples. Finally, Section 5 contains some concluding remarks.

Description of the model

We start this section by briefly reviewing the projected normal distribution. For further details, we refer the reader to Mardia and Jupp (2000), Nuñez-Antonio and Gutiérrez-Peña (2005), Presnell et al. (1998), Presnell and Rumcheva (2008), and the references therein.

There are several ways of generating probability distributions for circular data. One relatively straightforward way is to radially project on the unit circle probability distributions originally defined on the plane. In the
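The radial projection described in this section can be illustrated with a minimal Python sketch. It assumes a bivariate normal with identity covariance matrix (an assumption made here for simplicity), and the function names `sample_projected_normal` and `mean_direction` are hypothetical helpers, not part of the paper. Each planar draw is reduced to its direction; its length is discarded.

```python
import math
import random


def sample_projected_normal(mu, n, seed=0):
    """Draw n angles by radially projecting N2(mu, I) draws onto the unit circle."""
    rng = random.Random(seed)
    angles = []
    for _ in range(n):
        y1 = rng.gauss(mu[0], 1.0)
        y2 = rng.gauss(mu[1], 1.0)
        # Radial projection: keep the direction of (y1, y2), drop its length.
        angles.append(math.atan2(y2, y1) % (2.0 * math.pi))
    return angles


def mean_direction(angles):
    """Sample mean direction, i.e. the angle of the resultant vector."""
    c = sum(math.cos(t) for t in angles)
    s = sum(math.sin(t) for t in angles)
    return math.atan2(s, c) % (2.0 * math.pi)


angles = sample_projected_normal(mu=(0.0, 3.0), n=2000)
# With mu on the positive vertical axis, the sample mean direction
# should fall near pi / 2.
print(mean_direction(angles))
```

The mean direction is computed from the resultant vector rather than from an arithmetic average of the angles, since the latter is not meaningful on the circle.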

Full conditional densities

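Let $D=\{(r_{11},\theta_{11}),\ldots,(r_{Nn_N},\theta_{Nn_N})\}$ be a set of observations from the PCL model. From the discussion in Section 2.1, and omitting the superscript $k$ for notational convenience, we can see that the full conditional densities for the parameters and latent variables of each of the components $k\in\{I,II\}$ are given by

$$f(\beta\mid\{b_i\},\Omega,D)=N_p\Big(\beta\,\Big|\,C^{-1}\sum_{i=1}^{N}X_i^{t}e_i,\;C\Big),$$

$$f(b_i\mid\beta,\Omega,D)=N_q\big(b_i\,\big|\,D_i^{-1}Z_i^{t}\tilde{e}_i,\;D_i\big),\quad i=1,\ldots,N,$$

and

$$f(\Omega\mid\{b_i\},D)=Wi\Big(v+N,\;B+\sum_{i=1}^{N}b_ib_i^{t}\Big),$$

where $C=\sum_{i=1}^{N}X_i^{t}X_i+A$, $e_i=Y_i-Z_ib_i$, $D_i=Z_i^{t}Z_i+\Omega$ and $\tilde{e}_i=Y_i-X_i\beta$.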
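To make the structure of these updates concrete, the following Python sketch runs one Gibbs sweep for a deliberately simplified scalar case: a single covariate ($p=1$) and a random intercept ($q=1$, so $Z_i$ is a column of ones). Several points are assumptions made here and are not confirmed by the text: the second argument of each normal density is read as a precision, the prior mean of $\beta$ is taken to be zero, and the Wishart is read in its inverse-scale form, which in one dimension is a gamma distribution with shape $(v+N)/2$ and rate $(B+\sum_i b_i^2)/2$. The latent vectors $Y_i$ are treated as given, so the Metropolis step for the latent lengths is not shown. The function name `gibbs_sweep` and all numerical settings are hypothetical.

```python
import math
import random


def gibbs_sweep(y, x, b, omega, rng, a_prior=0.01, v=1.0, b_prior=1.0):
    """One sweep over beta, {b_i} and Omega for a scalar random-intercept component.

    y[i][j] and x[i][j] hold the latent response and the covariate for
    subject i at occasion j. Normals are parameterized as N(mean, precision).
    """
    n_subj = len(y)

    # beta | {b_i}, Omega, D ~ N(C^{-1} sum_i X_i^t e_i, C), e_i = Y_i - Z_i b_i
    c = a_prior + sum(xij * xij for xi in x for xij in xi)
    s = sum(xij * (yij - b[i])
            for i in range(n_subj)
            for yij, xij in zip(y[i], x[i]))
    beta = rng.gauss(s / c, 1.0 / math.sqrt(c))

    # b_i | beta, Omega, D ~ N(D_i^{-1} Z_i^t e~_i, D_i), e~_i = Y_i - X_i beta
    new_b = []
    for yi, xi in zip(y, x):
        d_i = len(yi) + omega  # Z_i^t Z_i = n_i when Z_i is a column of ones
        t = sum(yij - xij * beta for yij, xij in zip(yi, xi))
        new_b.append(rng.gauss(t / d_i, 1.0 / math.sqrt(d_i)))

    # Omega | {b_i}, D: one-dimensional Wishart, i.e. a gamma distribution
    # (assumed inverse-scale parameterization; gammavariate takes a scale).
    rate = 0.5 * (b_prior + sum(bi * bi for bi in new_b))
    omega = rng.gammavariate(0.5 * (v + n_subj), 1.0 / rate)
    return beta, new_b, omega


# Hypothetical usage on simulated data: 65 subjects, 5 occasions each.
rng = random.Random(42)
true_beta, true_sd_b = 2.0, 0.5
x = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(65)]
y = []
for xi in x:
    b_i = rng.gauss(0.0, true_sd_b)
    y.append([true_beta * xij + b_i + rng.gauss(0.0, 1.0) for xij in xi])

b, omega, draws = [0.0] * 65, 1.0, []
for sweep in range(1500):
    beta, b, omega = gibbs_sweep(y, x, b, omega, rng)
    if sweep >= 500:  # discard the first 500 sweeps as burn-in
        draws.append(beta)
beta_mean = sum(draws) / len(draws)
print(round(beta_mean, 2))
```

In the general case the scalar quantities `c` and `d_i` become the matrices $C$ and $D_i$, and the gamma draw becomes a Wishart draw.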

We note that the R

Examples

We used the R language and environment (R Development Core Team, 2012) to simulate the data sets for Examples 1 and 2 and to carry out all of the analyses in this section.

Example 1

For this illustration we simulated a longitudinal sample of size $N=65$. This sample represents five repeated measurements on each of $N=65$ individuals. The data were obtained using the following specification of the PCL model: $$Y_i^{I}\mid\beta^{I}\sim N_5(X_i^{I}\beta^{I},I),\quad Y_i^{II}\mid\beta^{II},\{b_i^{II}\}\sim N_5(X_i^{II}\beta^{II}+Z_i^{II}b_i^{II},I),\quad i=1,\ldots,65,$$ where $\beta^{I}=(1,4,10,0.5)^{t}$, $\beta^{II}=(2,$
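The general shape of such a simulation can be sketched in Python as follows. This is only an illustration of the design (65 individuals, five repeated measurements, fixed effects in component I, fixed effects plus a random effect in component II). The coefficients, the single time covariate, the random-intercept standard deviation, and the function name `simulate_pcl` are all hypothetical and are not the values used in the paper. The sketch also assumes that the angle is obtained as the direction of the pair $(Y^{I}, Y^{II})$.

```python
import math
import random


def simulate_pcl(n_subj=65, n_rep=5, beta1=(0.5, 1.0), beta2=(1.0, -0.5),
                 sd_b=0.7, seed=1):
    """Simulate (subject, occasion, theta, r) records from a two-component model.

    Component I has fixed effects only; component II adds a random intercept.
    All parameter values are hypothetical placeholders.
    """
    rng = random.Random(seed)
    data = []
    for i in range(n_subj):
        b_i = rng.gauss(0.0, sd_b)  # random intercept, component II only
        for j in range(n_rep):
            t = j / (n_rep - 1)  # hypothetical covariate: scaled occasion index
            y1 = rng.gauss(beta1[0] + beta1[1] * t, 1.0)
            y2 = rng.gauss(beta2[0] + beta2[1] * t + b_i, 1.0)
            theta = math.atan2(y2, y1) % (2.0 * math.pi)  # observed direction
            r = math.hypot(y1, y2)  # latent length, unobserved in practice
            data.append((i, j, theta, r))
    return data


data = simulate_pcl()
print(len(data))  # 65 subjects x 5 occasions = 325 records
```

In an analysis only the angles would be kept; the lengths are returned here because they play the role of the latent variables introduced for the sampler.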

Concluding remarks

In this paper, we have introduced the PCL model, based on a projected normal distribution, to analyze short longitudinal series of circular data. Although the PCL model assumes a conditional independence structure on each of its components, it is quite flexible and can describe several distinct longitudinal patterns. It may also provide the basis for the analysis of (longer) time series of circular data. Furthermore, unlike currently available analyses of models for longitudinal circular data,

Acknowledgments

The work of the first author was financed by Grant I0010-150526 of the Programa de Estancias Postdoctoral y Sabáticas al Extranjero para la Consolidación de Grupos de Investigación from CONACYT, Mexico. Support from the Department of Statistics of the University Carlos III of Madrid is also gratefully acknowledged. The work of the second author was partially supported by Sistema Nacional de Investigadores, Mexico. The authors are grateful to an associate editor and two anonymous referees whose

References (37)

  • S. Chib et al. On MCMC sampling in hierarchical longitudinal models. Statistics and Computing (1999)
  • A. D’Elia. A statistical model for orientation mechanism. Statistical Methods and Applications (2001)
  • P.J. Diggle et al. Analysis of Longitudinal Data (2002)
  • N.I. Fisher. Statistical Analysis of Circular Data (1993)
  • N.I. Fisher et al. Statistical Analysis of Spherical Data (1987)
  • G.M. Fitzmaurice et al. Applied Longitudinal Analysis (2004)
  • A.E. Gelfand et al. Identifiability, improper priors and Gibbs sampling for generalized linear models. Journal of the American Statistical Association (1999)
  • A.E. Gelfand et al. Efficient parameterizations for normal linear mixed models. Biometrika (1995)