A Bayesian model for longitudinal circular data based on the projected normal distribution
Introduction
Several approaches have been proposed to analyze longitudinal data. See, for example, Diggle et al. (2002), Fitzmaurice et al. (2004), Gelman and Hill (2007), and Hedeker and Gibbons (2006). All these books discuss models for longitudinal ‘scalar’ (i.e. linear) responses. In contrast, methodological proposals to describe relationships within repeated measurements of directional data are rather limited. This may be due to the difficulties in working with probability distributions commonly associated with directional data and to the intrinsic dependency inherent to longitudinal structures.
Circular data are a particular case of directional data. Specifically, circular data represent directions in two dimensions. For a survey of this and related topics, we refer the reader to Fisher (1993), Fisher et al. (1987), Jammalamadaka and SenGupta (2001) and Mardia and Jupp (2000). More recent contributions include Abe and Pewsey (2011), Oliveira et al. (2012), Pewsey (2008) and Qin et al. (2011). See also Arnold and SenGupta (2006) for an overview of applications of circular data analysis in ecological and environmental sciences. In many of these applications, the observations on the variable of interest are longitudinal in nature. For example, in studies concerning the orientation mechanism of birds, it is relevant to analyze the angular differences between a bird’s position at consecutive times after release (Artes and Jørgensen, 2000; Artes et al., 2000). In motion studies of small animals, the aim is often to describe the effect of covariates on the directional behavior of those animals (D’Elia, 2001; D’Elia et al., 2001). On the other hand, in the analysis of cell-cycle gene expression data, a problem of interest is to estimate the phase angles using information regarding the order among them (Rueda et al., 2009). All of these situations illustrate the need for models that allow us to analyze longitudinal structures where the response variable is circular.
As pointed out above, from a methodological point of view there does not seem to be a general framework for the analysis of longitudinal directional data. Longitudinal data where the response variable is circular have been analyzed using quasi-likelihood methods, such as the generalized estimating equations (GEE) originally proposed by Liang and Zeger (1986) to analyze linear data. Specifically, Artes et al. (2000) derive estimating equations for the parameters of a family of circular distributions and obtain asymptotic inference for the parameters of a mixed-effects model. In turn, Artes and Jørgensen (2000) extend GEE methods to deal with Jørgensen’s dispersion models (Jørgensen, 1997a, 1997b) and employ this approach to model longitudinal circular data. They also present a simulation study for a simple model which only involves the mean direction and a single covariate. They note that in some situations their proposal may have trouble converging, and point out that their method requires either high correlations between the longitudinal observations or large samples in order to achieve satisfactory performance. On the other hand, D’Elia et al. (2001) propose a generalized linear model to study the directional behavior of sandhoppers under natural conditions in repeated trials. In the same vein, D’Elia (2001) assumes a variance components model to describe the orientation mechanism and uses a simulated maximum likelihood approach. She points out that the use of this approach may raise several problems. Recently, Song (2007) has used a generalized linear model approach where the random component belongs to the family of dispersion models. He suggests using penalized pseudo-likelihood and restricted maximum likelihood estimation to bypass the analytical difficulties arising from the nonlinearity of the corresponding score functions.
Nevertheless, in some cases it is not possible to make inferences about all the parameters involved in the proposed models.
We feel that most of the procedures currently available for analyzing longitudinal data with circular responses have flaws that render some of the required inferences infeasible in general settings. These limitations include difficulties with model fitting, model comparison and prediction, as well as convergence problems in some of the iterative methods employed.
Song (2007) argues that different approaches may be required to analyze series of repeated measurements, depending on the length of the series. In the case of (several) short time series, modeling typically focuses on the relationship between the response variable and the corresponding covariates. In such situations the correlations are treated as nuisance parameters, as opposed to the case of a long time series where the correlations are usually modeled explicitly via a stochastic process.
In this paper, we introduce a new model to describe short series of longitudinal data where the response variable is circular. The model considers linear covariates and is based on a version of the projected bivariate normal distribution. In our proposal, each of the components of the model is specified by a mixed-effects linear model. In addition, we present a Bayesian analysis that allows us to make inference about any of the parameters of interest.
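To make the structure just described concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes each of the two latent components follows a linear mixed model with a single covariate (the measurement occasion), a subject-specific random intercept, and unit-variance errors; the observed angle is the direction of the resulting bivariate vector. The function name `simulate_pcl_like` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_pcl_like(n_subjects, n_times, beta, tau, rng=None):
    """Simulate angles from a projected-normal model in which each latent
    component has fixed effects plus a subject-level random intercept.

    All names and the parameterisation are illustrative assumptions.
    beta: (2, 2) array, one row per component: (intercept, slope on time).
    tau:  (2,) array, random-intercept standard deviation per component.
    """
    rng = np.random.default_rng(rng)
    beta = np.asarray(beta, dtype=float)
    tau = np.asarray(tau, dtype=float)

    # Design matrix shared by all subjects: intercept and measurement occasion.
    t = np.arange(n_times, dtype=float)
    X = np.column_stack([np.ones(n_times), t])        # (n_times, 2)

    fixed = X @ beta.T                                # (n_times, 2)
    b = rng.normal(0.0, tau, size=(n_subjects, 2))    # random intercepts
    eps = rng.normal(size=(n_subjects, n_times, 2))   # unit-variance errors

    # Latent bivariate responses, then radial projection to an angle.
    y = fixed[None, :, :] + b[:, None, :] + eps
    return np.arctan2(y[..., 1], y[..., 0]) % (2 * np.pi)

# Hypothetical settings: 30 subjects, five repeated measurements each.
theta = simulate_pcl_like(
    n_subjects=30, n_times=5,
    beta=[[1.0, 0.5], [1.0, -0.2]], tau=[0.5, 0.5], rng=1,
)
```

Sharing the random intercept across a subject's measurements is what induces dependence among that subject's angles in this sketch; richer random-effects structures would replace the single intercept per component.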
The paper is organized as follows. In the next section, we introduce the projected circular longitudinal model (henceforth called the PCL model) and describe some of its properties, including the longitudinal structures that can be obtained from it. In Section 3, we discuss a Bayesian analysis of the model and derive all the full conditionals needed for a Gibbs sampler. In Section 4, we present some illustrative examples. Finally, Section 5 contains some concluding remarks.
Description of the model
We start this section by briefly reviewing the projected normal distribution. For further details, we refer the reader to Mardia and Jupp (2000), Nuñez-Antonio and Gutiérrez-Peña (2005), Presnell et al. (1998), Presnell and Rumcheva (2008), and the references therein.
There are several ways of generating probability distributions for circular data. One relatively straightforward way is to radially project on the unit circle probability distributions originally defined on the plane. In the
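The radial projection idea can be illustrated with a short sketch. It assumes a bivariate normal distribution with identity covariance, which is one common version of the projected normal; the helper names `sample_projected_normal` and `circular_mean` and the mean vector used are hypothetical.

```python
import numpy as np

def sample_projected_normal(mu, n, rng=None):
    """Draw n angles by radially projecting N_2(mu, I) draws onto the unit circle.

    Identity covariance is an assumption made for this illustration.
    """
    rng = np.random.default_rng(rng)
    mu = np.asarray(mu, dtype=float)
    y = rng.normal(loc=mu, scale=1.0, size=(n, 2))    # points on the plane
    # The projection y / ||y|| is fully described by its angle in [0, 2*pi).
    return np.arctan2(y[:, 1], y[:, 0]) % (2 * np.pi)

def circular_mean(theta):
    """Mean direction: the angle of the average unit vector."""
    return np.arctan2(np.sin(theta).mean(), np.cos(theta).mean()) % (2 * np.pi)

# Hypothetical mean vector pointing at 45 degrees.
theta = sample_projected_normal(mu=[2.0, 2.0], n=5000, rng=0)
mean_direction = circular_mean(theta)
```

With identity covariance the distribution is symmetric about the direction of `mu`, so the sample mean direction should fall close to that direction, and a longer `mu` gives more concentrated angles.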
Full conditional densities
Let be a set of observations from the PCL model. From the discussion in Section 2.1, and omitting the superscript for notational convenience, we can see that the full conditional densities for the parameters and latent variables of each of the components are given by and where and
We note that the
Examples
We used the R language and environment (R Development Core Team, 2012) to simulate the data sets for Examples 1 and 2 and to carry out all of the analyses in this section.
Example 1 For this illustration we simulated a longitudinal sample of size . This sample represents five repeated measurements on each of individuals. The data were obtained using the following specification of the PCL model: where
Concluding remarks
In this paper, we have introduced the PCL model, based on a projected normal distribution, to analyze short longitudinal series of circular data. Although the PCL model assumes a conditional independence structure on each of its components, it is quite flexible and can describe several distinct longitudinal patterns. It may also provide the basis for the analysis of (longer) time series of circular data. Furthermore, unlike currently available analyses of models for longitudinal circular data,
Acknowledgments
The work of the first author was financed by Grant I0010-150526 of the Programa de Estancias Postdoctoral y Sabáticas al Extranjero para la Consolidación de Grupos de Investigación from CONACYT, Mexico. Support from the Department of Statistics of the University Carlos III of Madrid is also gratefully acknowledged. The work of the second author was partially supported by Sistema Nacional de Investigadores, Mexico. The authors are grateful to an associate editor and two anonymous referees whose
References (37)
- Symmetric circular models through duplication and cosine perturbation. Computational Statistics and Data Analysis (2011).
- Orientation in Talitrus saltator (Montagu): trends in intrapopulation variability related to environmental and intrinsic factors. Journal of Experimental Marine Biology and Ecology (1999).
- Orientation of sandhoppers under natural conditions in repeated trials: an analysis using longitudinal directional data. Estuarine, Coastal and Shelf Science (2001).
- A plug-in rule for bandwidth selection in circular density estimation. Computational Statistics and Data Analysis (2012).
- The wrapped stable family of distributions as a flexible model for circular data. Computational Statistics and Data Analysis (2008).
- The mean resultant length of the spherically projected normal distribution. Statistics and Probability Letters (2008).
- A nonparametric circular-linear multivariate regression model with a rule-of-thumb bandwidth selector. Computers and Mathematics with Applications (2011).
- Recent advances in the analyses of directional data in ecological and environmental sciences. Environmental and Ecological Statistics (2006).
- Longitudinal data estimating equations for dispersion models. Scandinavian Journal of Statistics (2000).
- Analysis of circular longitudinal data based on generalized estimating equations. Australian and New Zealand Journal of Statistics (2000).
- On MCMC sampling in hierarchical longitudinal models. Statistics and Computing.
- A statistical model for orientation mechanism. Statistical Methods and Applications.
- Analysis of Longitudinal Data.
- Statistical Analysis of Circular Data.
- Statistical Analysis of Spherical Data.
- Applied Longitudinal Analysis.
- Identifiability, improper priors and Gibbs sampling for generalized linear models. Journal of the American Statistical Association.
- Efficient parameterizations for normal linear mixed models. Biometrika.