Abstract
Gaussian graphical models are widely used to study the conditional dependence among random variables. By encoding conditional dependence as an undirected graph, Gaussian graphical models provide interpretable representations and insightful visualizations of the relationships among variables. However, time series data present additional challenges: the graphs can evolve over time—with changes occurring at unknown time points—and the data often exhibit heavy-tailed characteristics. To address these challenges, we propose dynamic and robust Bayesian graphical models that employ state-of-the-art hidden Markov models (HMMs) to introduce dynamics in the graph and heavy-tailed multivariate t-distributions for model robustness. The HMM latent states are linked both temporally and hierarchically for greater information sharing across time and between states. The proposed methods are computationally efficient and demonstrate excellent graph estimation on simulated data, with substantial improvements over non-robust graphical models. We demonstrate our proposed approach on human hand gesture tracking data and discover edges and dynamics with clear practical interpretations.






References
Cheok, M.J., Omar, Z., Jaward, M.H.: A review of hand gesture and sign language recognition techniques. Int. J. Mach. Learn. Cybern. 10(1), 131–153 (2019)
Cremaschi, A., Argiento, R., Shoemaker, K., Peterson, C., Vannucci, M.: Hierarchical normalized completely random measures for robust graphical modeling. Bayesian Anal. 14(4), 1271 (2019)
Danaher, P., Wang, P., Witten, D.M.: The joint graphical lasso for inverse covariance estimation across multiple classes. J. R. Stat. Soc. Ser. B Stat. Methodol. 76(2), 373 (2014)
Fan, J., Feng, Y., Wu, Y.: Network exploration via the adaptive Lasso and SCAD penalties. Ann. Appl. Stat. 3(2), 521 (2009)
Finegold, M., Drton, M.: Robust graphical modeling of gene networks using classical and alternative \(t\)-distributions. Ann. Appl. Stat. 5(2A), 1057–1080 (2011)
Finegold, M., Drton, M.: Robust Bayesian graphical modeling using Dirichlet \(t\)-distributions. Bayesian Anal. 9(3), 521–550 (2014)
Friedman, J., Hastie, T., Tibshirani, R.: Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3), 432–441 (2008)
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B.: Bayesian Data Analysis. Chapman and Hall/CRC, London (1995)
Geweke, J.: Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments, vol. 196. Federal Reserve Bank of Minneapolis, Research Department, Minneapolis (1991)
Gibberd, A.J., Nelson, J.D.: Regularized estimation of piecewise constant Gaussian graphical models: the group-fused graphical lasso. J. Comput. Graph. Stat. 26(3), 623–634 (2017)
Gottard, A., Pacillo, S.: Robust concentration graph model selection. Comput. Stat. Data Anal. 54(12), 3070–3079 (2010)
Han, T.X., Ning, H., Huang, T.S.: Efficient nonparametric belief propagation with application to articulated body tracking. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), vol. 1, pp. 214–221. IEEE (2006)
Ishwaran, H., James, L.F.: Gibbs sampling methods for stick-breaking priors. J. Am. Stat. Assoc. 96(453), 161–173 (2001)
Jasra, A., Holmes, C.C., Stephens, D.A.: Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling. Stat. Sci. 20(1), 50–67 (2005)
Kendon, A.: Gesture: Visible Action as Utterance. Cambridge University Press, Cambridge (2004)
Kim, D., Song, J., Kim, D.: Simultaneous gesture segmentation and recognition based on forward spotting accumulative HMMs. Pattern Recognit. 40(11), 3012–3026 (2007)
Kolar, M., Song, L., Ahmed, A., Xing, E.P.: Estimating time-varying networks. Ann. Appl. Stat. 4(1), 94–123 (2010)
Kotz, S., Nadarajah, S.: Multivariate \(t\)-Distributions and Their Applications. Cambridge University Press, Cambridge (2004)
Lenkoski, A., Dobra, A.: Computational aspects related to inference in Gaussian graphical models with the G-Wishart prior. J. Comput. Graph. Stat. 20(1), 140–157 (2011)
Liu, X., Daniels, M.J.: A new algorithm for simulating a correlation matrix based on parameter expansion and reparameterization. J. Comput. Graph. Stat. 15(4), 897–914 (2006)
Liu, Y., Wichura, M.J., Drton, M.: Rejection sampling for an extended Gamma distribution. Submitted (2012)
Lun, R., Zhao, W.: A survey of applications and human motion recognition with microsoft kinect. Int. J. Pattern Recognit. Artif. Intell. 29(05), 1555008 (2015)
Madeo, R.C., Lima, C.A., Peres, S.M.: Gesture unit segmentation using support vector machines: segmenting gestures from rest positions. In: Proceedings of the 28th Annual ACM Symposium on Applied Computing, pp. 46–52 (2013)
Meinshausen, N., Bühlmann, P.: High-dimensional graphs and variable selection with the lasso. Ann. Stat. 34(3), 1436–1462 (2006)
Mitra, S., Acharya, T.: Gesture recognition: a survey. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 37(3), 311–324 (2007)
Miyamura, M., Kano, Y.: Robust Gaussian graphical modeling. J. Multivar. Anal. 97(7), 1525–1550 (2006)
Neal, R.M.: Markov chain sampling methods for Dirichlet process mixture models. J. Comput. Graph. Stat. 9(2), 249–265 (2000)
Ni, Y., Baladandayuthapani, V., Vannucci, M., Stingo, F.C.: Bayesian graphical models for modern biological applications. In: Statistical Methods and Applications, pp. 1–29 (2021)
Osborne, N., Peterson, C.B., Vannucci, M.: Latent network estimation and variable selection for compositional data via variational EM. J. Comput. Graph. Stat. 31(1), 163–175 (2022)
Peng, J., Wang, P., Zhou, N., Zhu, J.: Partial correlation estimation by joint sparse regression models. J. Am. Stat. Assoc. 104(486), 735–746 (2009)
Peterson, C.B., Osborne, N., Stingo, F.C., Bourgeat, P., Doecke, J.D., Vannucci, M.: Bayesian modeling of multiple structural connectivity networks during the progression of Alzheimer’s disease. Biometrics 76(4), 1120–1132 (2020)
Pinheiro, J.C., Liu, C., Wu, Y.N.: Efficient algorithms for robust estimation in linear mixed-effects models using the multivariate \(t\) distribution. J. Comput. Graph. Stat. 10(2), 249–276 (2001)
Rodríguez, C.E., Walker, S.G.: Label switching in Bayesian mixture models: deterministic relabeling strategies. J. Comput. Graph. Stat. 23(1), 25–45 (2014)
Roverato, A.: Hyper inverse Wishart distribution for non-decomposable graphs and its application to Bayesian inference for Gaussian graphical models. Scand. J. Stat. 29(3), 391–411 (2002)
Scott, S.L.: Bayesian methods for hidden Markov models: recursive computing in the 21st century. J. Am. Stat. Assoc. 97(457), 337–351 (2002)
Shaddox, E., Stingo, F.C., Peterson, C.B., Jacobson, S., Cruickshank-Quinn, C., Kechris, K., Bowler, R., Vannucci, M.: A Bayesian approach for learning gene networks underlying disease severity in COPD. Stat. Biosci. 10(1), 59–85 (2018)
Sigal, L., Isard, M., Haussecker, H., Black, M.J.: Loose-limbed people: estimating 3d human pose and motion using non-parametric belief propagation. Int. J. Comput. Vis. 98(1), 15–48 (2012)
Song, Y., Goncalves, L., Perona, P.: Unsupervised learning of human motion. IEEE Trans. Pattern Anal. Mach. Intell. 25(7), 814–827 (2003)
Sperrin, M., Jaki, T., Wit, E.: Probabilistic relabelling strategies for the label switching problem in Bayesian mixture models. Stat. Comput. 20(3), 357–366 (2010)
Sun, H., Li, H.: Robust Gaussian graphical modeling via \(\ell _1\) penalization. Biometrics 68(4), 1197–1206 (2012)
Tan, L.S., Jasra, A., De Iorio, M., Ebbels, T.M.: Bayesian inference for multiple Gaussian graphical models with application to metabolic association networks. Ann. Appl. Stat. 11(4), 2222–2251 (2017)
Vinciotti, V., Hashem, H.: Robust methods for inferring sparse network structures. Comput. Stat. Data Anal. 67, 84–94 (2013)
Wang, H.: Bayesian graphical lasso models and efficient posterior computation. Bayesian Anal. 7(4), 867–886 (2012)
Wang, H.: Scaling it up: stochastic search structure learning in graphical models. Bayesian Anal. 10(2), 351–377 (2015)
Warnick, R., Guindani, M., Erhardt, E., Allen, E., Calhoun, V., Vannucci, M.: A Bayesian approach for estimating dynamic functional network connectivity in fMRI data. J. Am. Stat. Assoc. 113(521), 134–151 (2018)
Xu, R., Wu, J., Yue, X., Li, Y.: Online structural change-point detection of high-dimensional streaming data via dynamic sparse subspace learning. In: Technometrics, pp. 1–14 (2022)
Yang, E., Lozano, A.C.: Robust Gaussian graphical modeling with the trimmed graphical lasso. In: Advances in Neural Information Processing Systems, pp. 2602–2610 (2015)
Yang, J., Peng, J.: Estimating time-varying graphical models. J. Comput. Graph. Stat. 29(1), 191–202 (2020)
Zhang, C., Yan, H., Lee, S., Shi, J.: Dynamic multivariate functional data modeling via sparse subspace learning. Technometrics 63(3), 370–383 (2021)
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose, no competing interests relevant to the content of this article, and no affiliations with or involvement in any organization or entity with a financial or non-financial interest in the subject matter or materials discussed in this manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
1.1 MCMC for the dynamic classical-t graphical model
- Update \({\varvec{\tau }}\): For \(t=1,2,\ldots ,T\), sample \(\tau _t\) from
  $$\begin{aligned} \text {Gamma}\left( \frac{\nu +p}{2}, \frac{\nu + {\varvec{y}}_t' {\varvec{\Omega }}_{s_t} {\varvec{y}}_t }{2}\right) , \end{aligned}$$
  where the second parameter is the rate.
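As a concrete sketch of this draw (the function and variable names are ours, not the paper's), note that NumPy parameterizes the Gamma by scale, so the rate above must be inverted:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_tau(y, Omega, nu):
    """Draw tau_t ~ Gamma(shape=(nu + p)/2, rate=(nu + y' Omega y)/2).

    NumPy's gamma sampler takes a *scale* argument, so scale = 1/rate.
    """
    p = y.shape[0]
    shape = (nu + p) / 2.0
    rate = (nu + y @ Omega @ y) / 2.0
    return rng.gamma(shape, 1.0 / rate)

# toy illustration: p = 3, Omega = identity, nu = 3
y = np.array([0.5, -1.0, 0.2])
tau = sample_tau(y, np.eye(3), nu=3.0)
```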
- Update \({{\varvec{\Omega }}}\) and \({{\varvec{G}}}\): For each \(s \in \{1,\ldots ,S\}\), let \(n_s\) denote the number of time points in state s. Compute \({\varvec{\Sigma }}_s = \sum _{t: s_t=s} \tau _t {\varvec{y}}_t {\varvec{y}}_t'\) and perform the following updates:
  1. Following Peterson et al. (2020), for \(s \in \{1,\ldots ,S\}\), sample \({{\varvec{\Omega }}}_{s}\) one column and row at a time. Here we demonstrate how to sample the last row and last column in the last state. First, partition \({\varvec{\Omega }}_S\) into
  $$\begin{aligned} {\varvec{\Omega }}_S = \begin{pmatrix} {\varvec{\Omega }}_{S,11} &{} {\varvec{\omega }}_{S,12} \\ {\varvec{\omega }}_{S,21} &{} \omega _{S,pp} \end{pmatrix} \end{aligned}$$
  where \(\omega _{S,pp}\) is the (p, p)th element of the matrix, \({\varvec{\omega }}_{S,12}\) is the last column except \(\omega _{S,pp}\), and \({\varvec{\omega }}_{S,21}\) is the last row except \(\omega _{S,pp}\). Similarly, partition \({\varvec{\Theta }}_{ij}\), \(i,j=1,\ldots ,p\), and \({\varvec{\Sigma }}_S\) into
  $$\begin{aligned} {\varvec{\Theta }}_{ij} = \begin{pmatrix} {\varvec{\Theta }}_{ij,11} &{} {\varvec{\theta }}_{ij,12} \\ {\varvec{\theta }}_{ij,21} &{} \theta _{ij,SS} \end{pmatrix} , \quad {\varvec{\Sigma }}_S = \begin{pmatrix} {\varvec{\Sigma }}_{S,11} &{} {\varvec{\sigma }}_{S,12} \\ {\varvec{\sigma }}_{S,21} &{} \sigma _{S,pp} \end{pmatrix} . \end{aligned}$$
  Next, perform a one-to-one transformation of the last column of \({\varvec{\Omega }}_S\) as \(({\varvec{a}} , b) = ({\varvec{\omega }}_{S,12}, \omega _{S,pp} - {\varvec{\omega }}_{S,21} {\varvec{\Omega }}_{S,11}^{-1} {\varvec{\omega }}_{S,12} )\). It can be shown that the conditional posterior of \(({\varvec{a}}, b)\) is
  $$\begin{aligned} p({\varvec{a}}, b | {\varvec{\Omega }}_{S,11}, {\varvec{\Omega }}_1,\ldots ,{\varvec{\Omega }}_{S-1}, {\varvec{G}}_1,\ldots ,{\varvec{G}}_S, {\varvec{\Phi }}, {\varvec{s}}, {\varvec{Y}}) = \text {Normal}({\varvec{Q}}^{-1} {\varvec{l}}, {\varvec{Q}}^{-1})\, \text {Gamma}\left( \dfrac{n_S}{2}+1, \dfrac{\sigma _{S,pp}+\uplambda }{2}\right) \end{aligned}$$(14)
  where \({\varvec{l}} = {\varvec{D}}^{-1}{\varvec{m}} - {\varvec{\sigma }}_{S,12}\), \({\varvec{Q}} = {\varvec{D}}^{-1} + (\sigma _{S,pp}+\uplambda ) {\varvec{\Omega }}_{S,11}^{-1}\), and the second parameter in the gamma distribution is the rate. Here \({\varvec{D}}\) is the covariance matrix of \({\varvec{\omega }}_{S,12}\), whose off-diagonal elements are 0 and whose jth diagonal element is \(\theta _{jp,SS} - {\varvec{\theta }}_{jp,21} {\varvec{\Theta }}_{jp,11}^{-1} {\varvec{\theta }}_{jp,12}\), \(j=1,\ldots ,p-1\); \({\varvec{m}}\) is the mean of \({\varvec{\omega }}_{S,12}\), whose jth element is \({\varvec{\theta }}_{jp,21} {\varvec{\Theta }}_{jp,11}^{-1} (\omega _{1,jp}, \omega _{2,jp},\ldots ,\omega _{S-1,jp})'\), \(j=1,\ldots ,p-1\). The positive definiteness of \({\varvec{\Omega }}_S\) is automatically guaranteed by Sylvester's criterion: assume the sampled \({\varvec{\Omega }}_S\) at the nth iteration, denoted \({\varvec{\Omega }}_S^{(n)}\), is positive definite. After updating its last column and row at the \((n+1)\)th iteration, the leading \(p-1\) principal minors of \({\varvec{\Omega }}_S^{(n+1)}\) are still positive, and the pth leading principal minor of \({\varvec{\Omega }}_S^{(n+1)}\) is \(b \cdot |{\varvec{\Omega }}_{S, 11}^{(n)}|\). Since b is positive with a Gamma distribution, the pth leading principal minor is positive and \({\varvec{\Omega }}_S^{(n+1)}\) is positive definite.
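A minimal numerical sketch of this column update, with illustrative placeholder values for \({\varvec{D}}\), \({\varvec{m}}\), \({\varvec{\sigma }}_{S,12}\), \(\sigma _{S,pp}\) and \(\uplambda \) (all function and variable names are ours), shows the \(({\varvec{a}}, b)\) draw and the positive-definiteness guarantee:

```python
import numpy as np

rng = np.random.default_rng(1)

def update_last_column(Omega, D_diag, m, sigma12, sigma_pp, n_S, lam):
    """One draw of the last row/column of Omega_S via the (a, b) transformation.

    D_diag and m stand in for the diagonal of D and the mean vector m above;
    here they are illustrative placeholders, not values from the paper.
    """
    p = Omega.shape[0]
    Omega11_inv = np.linalg.inv(Omega[:p - 1, :p - 1])
    D_inv = np.diag(1.0 / D_diag)
    l = D_inv @ m - sigma12
    Q = D_inv + (sigma_pp + lam) * Omega11_inv
    Q_inv = np.linalg.inv(Q)
    # a | ... ~ Normal(Q^{-1} l, Q^{-1});  b | ... ~ Gamma(n_S/2 + 1, rate=(sigma_pp+lam)/2)
    a = rng.multivariate_normal(Q_inv @ l, Q_inv)
    b = rng.gamma(n_S / 2.0 + 1.0, 2.0 / (sigma_pp + lam))
    new = Omega.copy()
    new[:p - 1, p - 1] = a
    new[p - 1, :p - 1] = a
    new[p - 1, p - 1] = b + a @ Omega11_inv @ a  # invert (a, b) back to omega_pp
    return new

Omega_new = update_last_column(np.eye(4), D_diag=np.ones(3), m=np.zeros(3),
                               sigma12=np.zeros(3), sigma_pp=2.0, n_S=10, lam=1.0)
```

Since \(b > 0\), the updated matrix stays positive definite, which can be verified through its eigenvalues.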
  2. Sample \({\varvec{G}}_s\), \(s=1,\ldots ,S\), from the conditional Bernoulli distributions
  $$\begin{aligned} p(g_{s,ij}=1|{\varvec{\Omega }}_1,\ldots ,{\varvec{\Omega }}_S, {\varvec{\Phi }}, {\varvec{s}}, {\varvec{Y}}) = \dfrac{N(\omega _{s,ij}|0,v_1^2)\pi }{N(\omega _{s,ij}|0,v_1^2)\pi +N(\omega _{s,ij}|0,v_0^2)(1-\pi )}. \end{aligned}$$
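The inclusion probability is a simple ratio of two normal densities; a small sketch (the helper name is ours):

```python
import numpy as np

def edge_prob(omega_ij, v0, v1, pi):
    """P(g_{s,ij} = 1 | ...) under the spike (sd v0) and slab (sd v1) mixture."""
    def npdf(x, sd):  # N(x | 0, sd^2) density
        return np.exp(-0.5 * (x / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))
    slab = npdf(omega_ij, v1) * pi
    spike = npdf(omega_ij, v0) * (1.0 - pi)
    return slab / (slab + spike)

# a large |omega_ij| strongly favors inclusion; a tiny one favors exclusion
p_in = edge_prob(0.5, v0=0.02, v1=1.0, pi=0.5)
p_out = edge_prob(0.001, v0=0.02, v1=1.0, pi=0.5)
```

Drawing `g_ij` is then a Bernoulli flip with this probability.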
- Update \({\varvec{\Phi }}\): This is an MH step based on the PX-RPMH algorithm of Liu and Daniels (2006). Compute the diagonal matrix V where \(V_{ss} =(\sum _{i=1}^{p-1} \sum _{j=i+1}^p \omega _{s,ij})^{-\frac{1}{2}}\), \(s = 1,\ldots ,S\), and transform \({\varvec{\Phi }}\) to \({\varvec{\Psi }}= V {\varvec{\Phi }}V\). Define \({\varvec{\epsilon }}_{ij} = V {\varvec{\omega }}_{ij}\), \(i,j=1,\ldots ,p\). Given the candidate prior \(p({\varvec{\Phi }}) \propto |{\varvec{\Phi }}|^{-\frac{S+1}{2}} {\varvec{1}}_{{\varvec{\Phi }}\in \mathcal {R^{S}}}\), the posterior of \({\varvec{\Psi }}\) is an Inverse-Wishart distribution with \(p(p-1)/2\) degrees of freedom and scale matrix \(\sum _{i=1}^{p-1} \sum _{j=i+1}^p \text {diag}({\varvec{v}}_{ij})^{-1} {\varvec{\epsilon }}_{ij} {\varvec{\epsilon }}_{ij}' \text {diag}({\varvec{v}}_{ij})^{-1}\). Sample \({\varvec{\Psi }}\) and set \({\varvec{\Phi }}^* = \text {diag}({\varvec{\Psi }})^{-\frac{1}{2}} {\varvec{\Psi }}\text {diag}({\varvec{\Psi }})^{-\frac{1}{2}} \). Accept the proposed \({\varvec{\Phi }}^*\) with probability
  $$\begin{aligned} \text {min}\left( 1, \text {exp}\left[ \dfrac{S+1}{2} \left( \text {log}|{\varvec{\Phi }}^*| - \text {log}|{\varvec{\Phi }}| \right) \right] \right) . \end{aligned}$$
- Update \(\textbf{P}\) and \(\textbf{s}\): This is a Metropolis-Hastings step following Scott (2002). First sample the state transition probabilities for each state s from a \(\text {Dirichlet}({\varvec{\alpha }}_s)\) distribution, where \(\alpha _{rs}\), \(r = 1,\ldots ,S\), is one plus the number of state transitions from r to s. Let \(\pi _{new}\) denote the stationary distribution implied by the sampled transition probabilities. Accept the resulting transition matrix and stationary distribution with probability
  $$\begin{aligned} \text {min} \left\{ 1, \frac{\pi _{new}(s_1)}{\pi _{old}(s_1)} \right\} , \end{aligned}$$
  where \(s_1\) is the state at the first time point. Next, perform a Gibbs step using forward-propagate backward-sample recursions. Obtain the forward transition probabilities for each state at each time point as \({\varvec{Q}}_{t}=[q_{trs}]\) by
  $$\begin{aligned} q_{trs}\propto \pi _{t-1}(r|\ldots ) p_{rs} N( \sqrt{\tau _t} {\varvec{y}}_t |{\varvec{0}},{\varvec{\Omega }}_s^{-1}), \end{aligned}$$
  where \(q_{trs}\) is the transition probability from state r at time \(t-1\) to state s at time t; compute \(\pi _{t}(r|\ldots ) = \sum _{s = 1}^S q_{tsr}\). Sample \(s_{T}\) from \(\pi _{T}\) and then recursively sample \(s_{t}\) proportional to the \(s_{t+1}\)th column of \({\varvec{Q}}_{t+1}\) until \(s_{1}\) is sampled.
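The forward-propagate backward-sample recursion can be sketched as follows (a minimal implementation of our own; `loglik[t, s]` stands for \(\log N(\sqrt{\tau _t} {\varvec{y}}_t |{\varvec{0}},{\varvec{\Omega }}_s^{-1})\), and the names are assumptions, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(2)

def ffbs(loglik, P, pi0):
    """Forward-propagate backward-sample for the hidden states.

    loglik: (T, S) log-likelihoods per time point and state.
    P: (S, S) transition matrix; pi0: initial state distribution.
    """
    T, S = loglik.shape
    lik = np.exp(loglik - loglik.max(axis=1, keepdims=True))  # stabilized
    pi = np.empty((T, S))
    Q = np.empty((T, S, S))          # Q[t, r, s]: joint of (s_{t-1}=r, s_t=s)
    pi[0] = pi0 * lik[0]
    pi[0] /= pi[0].sum()
    for t in range(1, T):
        Q[t] = pi[t - 1][:, None] * P * lik[t][None, :]
        Q[t] /= Q[t].sum()
        pi[t] = Q[t].sum(axis=0)     # marginal of s_t
    s = np.empty(T, dtype=int)
    s[T - 1] = rng.choice(S, p=pi[T - 1])
    for t in range(T - 2, -1, -1):   # draw s_t from the s_{t+1}th column of Q_{t+1}
        w = Q[t + 1][:, s[t + 1]]
        s[t] = rng.choice(S, p=w / w.sum())
    return s
```

With well-separated likelihoods, the sampled path recovers the true state sequence almost surely.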
1.2 MCMC for the dynamic Dirichlet-t graphical model
- Update \({{\varvec{\Omega }}}\) and \({{\varvec{G}}}\): Same as in the classical-t model, with \({\varvec{\Sigma }}_s = \sum _{t: s_t=s} \text {diag}(\sqrt{{\varvec{\tau }}_t}) {\varvec{y}}_t {\varvec{y}}_t'\text {diag}(\sqrt{{\varvec{\tau }}_t})\) for \(s \in \{1,\ldots ,S\}\).
- Update \({{\varvec{\Phi }}}\): Same as in the classical-t model.
- Update \(\textbf{P}\) and \(\textbf{s}\): Same as in the classical-t model, with \(q_{trs} \propto \pi _{t-1}(r|\ldots ) p_{rs} N( \text {diag}(\sqrt{{\varvec{\tau }}_t}){\varvec{y}}_t |{\varvec{0}},{\varvec{\Omega }}_s^{-1} )\).
- Update \({\varvec{\tau }}\): We use a truncated stick-breaking process to approximate the Dirichlet process (Ishwaran and James 2001). In simulation studies, we found that fixing the number of possible clusters reduces the MCMC processing time by up to 90%, as we avoid frequent sampling of the probability that \(\tau _{tj}\) belongs to a new cluster, which requires evaluating the pdf of noncentral t-distributions. Given the pre-specified truncation level K, let \({\varvec{Z}}\) be a \(T \times p\) matrix of cluster indicators and let \(n_{tk}\) be the size of cluster k at time t, \(k = 1, \ldots , K\), \(t = 1,\ldots ,T\). Let \({\varvec{\eta }}_t = (\eta _{t1},\ldots ,\eta _{t K})\) denote the unique cluster values of \({\varvec{\tau }}_t\). Define the cluster weights as \(\{w_{tk}\}_{t \in \{1,\ldots ,T\}, k \in \{1,\ldots ,K\} }\) and perform the following steps:
  1. For each \(t=1,\ldots ,T\) and \(j=1,\ldots ,p\), draw \(z_{tj}\) and update \(x_{tj} = y_{tj} \sqrt{\eta _{t, z_{tj}}}\). The probability that \(z_{tj}\) belongs to cluster k, \(k=1,\ldots ,K\), is
  $$\begin{aligned} p(z_{tj}= k | {\varvec{z}}_{t,-j }, {\varvec{\eta }}_t, {\varvec{y}}_t, {\varvec{\Omega }}_{s_t}, {\varvec{w}}_t) \propto w_{tk} \text {N}\left( y_{tj} | \frac{\mu _c}{\sqrt{\eta _{tk}}}, \frac{\sigma _c^2}{\eta _{tk}}\right) , \end{aligned}$$
  where N\((x|\mu ,\sigma ^2)\) is the Normal(\(\mu ,\sigma ^2\)) pdf evaluated at x. Here \(\mu _c\) and \(\sigma _c^2\) are the conditional mean and variance of \(x_{tj}|{\varvec{x}}_{t,-j}\), where \({\varvec{x}}_{t,-j} = \{x_{ti}:i\ne j, i \in \{1,\ldots , p\} \}\) and \({\varvec{x}}_t \sim MVN({\varvec{0}}, {\varvec{\Omega }}_{s_t}^{-1})\).
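The conditional moments \(\mu _c\) and \(\sigma _c^2\) can be read directly off the precision matrix via the standard Gaussian conditioning identity; a sketch (names are ours):

```python
import numpy as np

def conditional_moments(x, Omega, j):
    """Mean and variance of x_j | x_{-j} when x ~ MVN(0, Omega^{-1}).

    Standard identity: sigma_c^2 = 1 / Omega_jj and
    mu_c = -(1 / Omega_jj) * sum_{i != j} Omega_ji x_i.
    """
    var = 1.0 / Omega[j, j]
    idx = np.arange(len(x)) != j
    mean = -var * (Omega[j, idx] @ x[idx])
    return mean, var
```

This avoids inverting \({\varvec{\Omega }}_{s_t}\) inside the cluster-assignment loop.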
  2. For each \(t=1,\ldots ,T\) and \(k=1,\ldots ,K-1\), sample the stick-breaking weight \(v_{tk}\) from Beta\((1+n_{tk}, \alpha + \sum _{k'= k+1}^{K} n_{t k'})\). Compute \(w_{tk} = v_{tk} \prod _{k'<k} (1-v_{tk'})\).
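Under the truncation, the last weight absorbs the remaining stick (our convention here, consistent with the truncated construction of Ishwaran and James 2001); a sketch:

```python
import numpy as np

def stick_break(v):
    """Map stick-breaking fractions v_1, ..., v_{K-1} to weights w_1, ..., w_K.

    w_k = v_k * prod_{k' < k} (1 - v_{k'}); the final weight takes the
    leftover stick so that the K weights sum to one.
    """
    v = np.append(v, 1.0)  # v_K = 1 closes the stick
    leftover = np.cumprod(np.concatenate(([1.0], 1.0 - v[:-1])))
    return v * leftover

w = stick_break(np.array([0.5, 0.5, 0.5]))  # -> [0.5, 0.25, 0.125, 0.125]
```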
  3. For each \(t=1,\ldots ,T\) and each \(k=1,\ldots ,K\), draw \(\eta _{tk}\) from the distribution
  $$\begin{aligned} f(\eta _{tk}| {\varvec{\eta }}_{t,-k}, {\varvec{z}}_t, {\varvec{y}}_t, {\varvec{\Omega }}_{s_t}) \propto \eta _{tk}^{a-1} \text {exp}\{ -b \eta _{tk} -c \sqrt{\eta _{tk}} \} \end{aligned}$$(15)
  with \(a = \frac{\nu +n_k}{2}\), \(b=\frac{1}{2}[\nu + \text {tr}({\varvec{\Omega }}_{(k)(k)} {\varvec{y}}_{(k)} {\varvec{y}}_{(k)}' )]\), \(c = \text {tr}( {\varvec{\Omega }}_{(k)(-k)} {\varvec{x}}_{(-k)} {\varvec{y}}_{(k)}' )\), \((k) = \{j: z_j=k, j \in \{1,\ldots ,p\}\}\), and \((-k) = \{j: z_j\ne k, j \in \{ 1,\ldots ,p\}\}\) (the indices t and \(s_t\) are dropped for simplicity). To sample \(\eta _{tk}\), apply the change of variables \(\tau ^* = b \eta _{tk}\) to get
  $$\begin{aligned} f(\tau ^*| \ldots ) \propto \tau ^{*(a^*-1)} \text {exp}\{ -\tau ^* -2 c^*\sqrt{\tau ^*} \} \end{aligned}$$
  with \(a^*=a\) and \(c^*=\frac{c}{2\sqrt{b}} \). Next, use the rejection algorithms from Liu et al. (2012) to sample \(\tau ^*\):
  (a) When \(c^*/\sqrt{a^*}\le -0.7\), use Algorithm 2 from Liu et al. (2012). Compute \(m=\frac{2 a^*-1}{c^* + \sqrt{(c^*)^2 + 4 a^* -2}}\). Repeatedly generate \(X \sim \text {N} (m, \frac{1}{2} )\) and \(U\sim \text {Uniform}(0, 1)\) until \(X>0\) and \(m^{2a^*-1}U< X^{2a^*-1}\text {exp}\{ -2(m+c^*)(X-m)\}\). Accept \(X^2\).
  (b) When \(-0.7<c^*/\sqrt{a^*}< 0\), use the first part of Algorithm 1 from Liu et al. (2012). Compute \(m=\frac{4a^*}{( \sqrt{(c^*)^2+4a^*}-c^*)^2}\). Repeatedly generate \(X\sim \text {Gamma}(a^*, m)\) (m is the rate parameter) and \(U\sim \text {Uniform}(0, 1)\) until \(U< \text {exp} \{a^*+X(m-1)-2\sqrt{X}c^* - a^*/m\}\). Accept X.
  (c) When \(0\le c^*/\sqrt{a^*}< 0.7\), use the second part of Algorithm 1 from Liu et al. (2012). Repeatedly generate \(X\sim \text {Gamma}(a^*,1)\) and \(U\sim \text {Uniform}(0, 1)\) until \(U< \text {exp}(-2 c^* \sqrt{X})\). Accept X.
  (d) When \(0.7\le c^*/\sqrt{a^*}\), use Algorithm 3 from Liu et al. (2012). Compute \(m= c^*+\sqrt{(c^*)^2+4a^*}\). Repeatedly generate \(X\sim \text {Gamma}(2a^*,m)\) and \(U\sim \text {Uniform}(0, 1)\) until \(U<\text {exp}\{ - (X-\frac{m}{2}+c^*)^2\}\). Accept \(X^2\).
  (e) Set \(\eta _{tk} = \tau ^* / b\).
  (f) Update \(x_{tj} = y_{tj}\sqrt{\eta _{tk}}\) for the coordinates j in cluster k.
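The four rejection cases above can be transcribed into a single sampler. This is a direct sketch of the listed steps, assuming the Gamma draws use the rate parameterization (NumPy takes scale = 1/rate) and that N\((m, \frac{1}{2})\) means variance \(\frac{1}{2}\); the function name is ours:

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_extended_gamma(a, c):
    """Draw tau* with density proportional to tau^(a-1) exp(-tau - 2 c sqrt(tau)),
    following the four cases (a)-(d) of Liu et al. (2012) as listed above."""
    r = c / np.sqrt(a)
    while True:
        if r <= -0.7:                                     # case (a): Algorithm 2
            m = (2 * a - 1) / (c + np.sqrt(c ** 2 + 4 * a - 2))
            X = rng.normal(m, np.sqrt(0.5))               # N(m, variance 1/2)
            if X > 0 and m ** (2 * a - 1) * rng.random() < \
                    X ** (2 * a - 1) * np.exp(-2 * (m + c) * (X - m)):
                return X ** 2
        elif r < 0:                                       # case (b): Algorithm 1, part 1
            m = 4 * a / (np.sqrt(c ** 2 + 4 * a) - c) ** 2
            X = rng.gamma(a, 1.0 / m)                     # Gamma(a, rate m)
            if rng.random() < np.exp(a + X * (m - 1) - 2 * np.sqrt(X) * c - a / m):
                return X
        elif r < 0.7:                                     # case (c): Algorithm 1, part 2
            X = rng.gamma(a, 1.0)
            if rng.random() < np.exp(-2 * c * np.sqrt(X)):
                return X
        else:                                             # case (d): Algorithm 3
            m = c + np.sqrt(c ** 2 + 4 * a)
            X = rng.gamma(2 * a, 1.0 / m)                 # Gamma(2a, rate m)
            if rng.random() < np.exp(-(X - m / 2 + c) ** 2):
                return X ** 2
```

Note that when \(c^* = 0\) the density reduces to a plain Gamma\((a^*, 1)\) and case (c) always accepts.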
- Update \(\alpha \): Sample \(\alpha \) from
  $$\begin{aligned} \alpha | \ldots \sim \text {Gamma}\left( a_{\alpha } + T \cdot (K - 1), b_{\alpha } - \sum _{t=1}^T \sum _{k=1}^{K - 1} \text {log} ( 1 - v_{tk}) \right) . \end{aligned}$$
1.3 Parameter selection for the fused graphical lasso
In the simulation study, we evaluate the performance of the fused graphical lasso (fLasso; Danaher et al. 2014) by treating the true hidden states as known. The fLasso contains two tuning parameters: \(\uplambda _1\) controls the graphical lasso penalty (i.e., graph sparsity) and \(\uplambda _2\) controls the fused lasso penalty (i.e., graph similarity).
To estimate a single graph, we select both \(\uplambda _1\) and \(\uplambda _2\) from a grid of values based on AIC, as done in Danaher et al. (2014). Specifically, for each value of \(\uplambda _2\) on a grid, we define the grid of \(\uplambda _1\) such that the graph ranges from empty to full. After fitting the fLasso for each \((\uplambda _1, \uplambda _2)\) pair, we return the estimated graph for which the AIC is smallest.
To compute an ROC curve for fLasso, we again vary \(\uplambda _2\) on a grid, but now compute the AUC (across values of \(\uplambda _1\) ranging from the empty graph to the full graph) and select the value of \(\uplambda _2\) for which the AUC is largest.
1.4 Simulation study results when \(p = 10\)
| | Classical-t | | | Slightly contaminated | | | Highly contaminated | | |
|---|---|---|---|---|---|---|---|---|---|
| | TPR | FPR | MCC | TPR | FPR | MCC | TPR | FPR | MCC |
| dGM | 0.58 | 0.21 | 0.36 | 0.56 | 0.22 | 0.35 | 0.38 | 0.29 | 0.09 |
| dCT | 0.98 | 0.01 | 0.97 | 0.81 | 0.10 | 0.71 | 0.68 | 0.16 | 0.52 |
| dDT | 0.98 | 0.01 | 0.97 | 0.97 | 0.01 | 0.96 | 0.97 | 0.02 | 0.95 |
Performance comparisons for hidden state estimation over the three hidden states, for the three simulation scenarios, \(p = 10\)
| | Classical-t | | | | Slightly contaminated | | | | Highly contaminated | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | TPR | FPR | MCC | FL | TPR | FPR | MCC | FL | TPR | FPR | MCC | FL |
| fLasso | 0.99 | 0.87 | 0.20 | 0.35 | 0.97 | 0.72 | 0.31 | 0.45 | 0.90 | 0.63 | 0.26 | 0.65 |
| dGM | 0.45 | 0.02 | 0.54 | 0.38 | 0.42 | 0.02 | 0.51 | 0.38 | 0.31 | 0.03 | 0.39 | 0.51 |
| dCT | 0.92 | 0.01 | 0.93 | 0.06 | 0.67 | 0.03 | 0.70 | 0.42 | 0.58 | 0.05 | 0.60 | 0.37 |
| dDT | 0.92 | 0.01 | 0.93 | 0.09 | 0.91 | 0.02 | 0.91 | 0.25 | 0.85 | 0.01 | 0.86 | 0.18 |
Performance comparisons for graph estimation, \(p = 10\)
1.5 List of variables in the hand gesture data set
Code | Variable name | Code | Variable name |
---|---|---|---|
1 | Vectorial velocity of left hand (x coordinate) | 17 | Vectorial acceleration of right hand (y coordinate) |
2 | Vectorial velocity of left hand (y coordinate) | 18 | Vectorial acceleration of right hand (z coordinate) |
3 | Vectorial velocity of left hand (z coordinate) | 19 | Vectorial acceleration of left wrist (x coordinate) |
4 | Vectorial velocity of right hand (x coordinate) | 20 | Vectorial acceleration of left wrist (y coordinate) |
5 | Vectorial velocity of right hand (y coordinate) | 21 | Vectorial acceleration of left wrist (z coordinate) |
6 | Vectorial velocity of right hand (z coordinate) | 22 | Vectorial acceleration of right wrist (x coordinate) |
7 | Vectorial velocity of left wrist (x coordinate) | 23 | Vectorial acceleration of right wrist (y coordinate) |
8 | Vectorial velocity of left wrist (y coordinate) | 24 | Vectorial acceleration of right wrist (z coordinate) |
9 | Vectorial velocity of left wrist (z coordinate) | 25 | Scalar velocity of left hand |
10 | Vectorial velocity of right wrist (x coordinate) | 26 | Scalar velocity of right hand |
11 | Vectorial velocity of right wrist (y coordinate) | 27 | Scalar velocity of left wrist |
12 | Vectorial velocity of right wrist (z coordinate) | 28 | Scalar velocity of right wrist |
13 | Vectorial acceleration of left hand (x coordinate) | 29 | Scalar acceleration of left hand |
14 | Vectorial acceleration of left hand (y coordinate) | 30 | Scalar acceleration of right hand |
15 | Vectorial acceleration of left hand (z coordinate) | 31 | Scalar acceleration of left wrist |
16 | Vectorial acceleration of right hand (x coordinate) | 32 | Scalar acceleration of right wrist |
Variables and their abbreviations
1.6 Results from the fused graphical lasso model on the hand gesture data set
In the application study, we fit the fused graphical lasso model given the two states generated by the proposed dynamic Dirichlet-t graphical model. The optimal graphs chosen by AIC (Danaher et al. 2014) are full graphs. To reach a desired level of sparsity, we adjust the two fLasso tuning parameters until the number of edges is similar to that selected by the Dirichlet-t model. The estimated graphs are plotted below. In both states, the estimated graphs include the edges describing the positive correlations of velocity and of acceleration between the left hand and the left wrist, and the corresponding positive correlations on the right-hand side. However, many expected edges are not detected. Thus, the fLasso results—even with the states supplied—are unsatisfactory.

Posterior partial correlations in each hidden state from the fused graphical lasso model. Blue lines indicate positive correlations and red lines indicate negative correlations; line thickness represents the correlation strength; a dotted line indicates that the edge is unique to the state
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, C., Kowal, D.R. & Vannucci, M. Dynamic and robust Bayesian graphical models. Stat Comput 32, 105 (2022). https://doi.org/10.1007/s11222-022-10177-0