Elsevier

NeuroImage

Volume 100, 15 October 2014, Pages 301-315
Non-parametric Bayesian graph models reveal community structure in resting state fMRI

https://doi.org/10.1016/j.neuroimage.2014.05.083

Highlights

  • Three nonparametric Bayesian models for node clustering are used to model rs-fMRI.

  • Models' predictability and reproducibility are extensively evaluated using resampling.

  • The community structure model shows better predictability and reproducibility.

  • This finding suggests that rs-fMRI graphs exhibit community structure.

  • Modeling between-cluster link probabilities adds important information.

Abstract

Modeling of resting state functional magnetic resonance imaging (rs-fMRI) data using network models is of increasing interest. It is often desirable to group nodes into clusters to interpret the communication patterns between nodes. In this study we consider three different nonparametric Bayesian models for node clustering in complex networks. In particular, we test their ability to predict unseen data and their ability to reproduce clustering across datasets. The three generative models considered are the Infinite Relational Model (IRM), Bayesian Community Detection (BCD), and the Infinite Diagonal Model (IDM). The models define probabilities of generating links within and between clusters and the difference between the models lies in the restrictions they impose upon the between-cluster link probabilities. IRM is the most flexible model with no restrictions on the probabilities of links between clusters. BCD restricts the between-cluster link probabilities to be strictly lower than within-cluster link probabilities to conform to the community structure typically seen in social networks. IDM only models a single between-cluster link probability, which can be interpreted as a background noise probability. These probabilistic models are compared against three other approaches for node clustering, namely Infomap, Louvain modularity, and hierarchical clustering. Using 3 different datasets comprising healthy volunteers' rs-fMRI we found that the BCD model was in general the most predictive and reproducible model. This suggests that rs-fMRI data exhibits community structure and furthermore points to the significance of modeling heterogeneous between-cluster link probabilities.

Introduction

Analysis of resting state functional magnetic resonance imaging (rs-fMRI) has emerged as a powerful research tool to study whole-brain functional connectivity. Since rs-fMRI provides information about intrinsic fluctuations in functional connectivity within and among brain networks, many conventional analysis schemes applied in task-related fMRI studies are irrelevant. Hence, a number of new techniques have been developed based on identification of stable spatio-temporal multivariate structure in the brain wide set of blood oxygen level dependent (BOLD) time series.

Using correlation methods or flexible multivariate techniques like independent component analysis (ICA) it has been shown that the BOLD signals of distant brain regions are coordinated suggesting interaction as they form so-called resting-state networks. The number and precise definition of these networks are controversial but several networks are broadly accepted, including the default mode network, motor network, visual network, fronto-parietal, and dorsal attention network (Damoiseaux et al., 2006). In addition to signals reflecting neuronal activity, the BOLD signal may be contaminated by physiological noise stemming from respiratory and cardiac cycles and head motion (Birn et al., 2006, Power et al., 2014).

Complex network analysis is a very active research field (Barabási, 2003) that has already found application in neuroimaging and in modeling resting state connectivity (Bullmore and Bassett, 2011, Sporns, 2011). The basic object is the ‘network graph’. When applied to neuroimage analysis the network graph is formed by brain regions represented as nodes. Nodes are connected by a link if brain regions are co-activated above a certain threshold. In rs-fMRI co-activation is often measured simply by calculating correlation between time series.
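The graph construction described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `threshold_graph`, the random stand-in time series, and the choice to keep a fixed fraction of the strongest correlations (8%, mirroring the threshold reported in the results snippet) are assumptions made for the example.

```python
import numpy as np

def threshold_graph(ts, density=0.08):
    """Build a binary undirected graph from node time series.

    ts      : array of shape (n_timepoints, n_nodes), one column per brain region
    density : fraction of node pairs to keep as links (strongest correlations first)
    """
    corr = np.corrcoef(ts, rowvar=False)          # (n_nodes, n_nodes) Pearson correlations
    n = corr.shape[0]
    iu = np.triu_indices(n, k=1)                  # each unordered pair once, no self-links
    vals = corr[iu]
    n_keep = int(round(density * vals.size))      # number of links to retain
    adj = np.zeros((n, n), dtype=int)
    if n_keep > 0:
        order = np.argsort(vals)[::-1][:n_keep]   # indices of the largest correlations
        adj[iu[0][order], iu[1][order]] = 1
    return adj + adj.T                            # symmetrize

# Hypothetical data: 200 time points for 50 regions (random noise, illustration only)
rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 50))
adj = threshold_graph(ts, density=0.08)
n_links = adj.sum() // 2
```

Keeping a fixed proportion of links rather than a fixed correlation cut-off gives every graph the same link density, which makes graphs from different subjects easier to compare.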

Network structure can be studied at many levels, from local motifs to global features like scale free link distributions signifying long-range coordination (van den Heuvel et al., 2008). Likewise, dense connections between high degree nodes are referred to as ‘rich club organization’ (van den Heuvel and Sporns, 2011). At the intermediate level we may identify clusters of highly linked nodes, i.e., high within-cluster link density and low link density to nodes in other clusters. By analogy to social networks such groups are referred to as communities. The presence of community structure in a network can be quantified by the global modularity index (Newman, 2006). Modularity can also be used to identify communities, i.e., by clustering nodes such that the modularity index is maximized (Lehmann and Hansen, 2007, Newman, 2006). Bassett et al. (2011) showed that ‘flexibility’, a measure for the number of cluster-assignment changes for nodes in a modularity optimized node-partition across time, is predictive for the amount of learning in a motor task in a subsequent session. Stevens et al. (2012) showed that modularity predicts visual working memory capacity, and Meunier et al. (2009) found that modularity is reduced during normal aging. Likewise, evidence is emerging that global modularity can be used as a bio-marker. For instance patients with childhood-onset schizophrenia have reduced modularity of their resting state networks (Alexander-Bloch et al., 2010). However, focusing on modularity as the single summary of a complex network may be overly simplistic as the modularity measure does not account for variability in the inter-linking relations between functional clusters. Hence, modularity driven clustering might not reveal all salient aspects of community structure in a network. Indeed, modularity has been criticized for its lack of flexibility as a measure of community structure (Fortunato and Barthélemy, 2007).
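For concreteness, the modularity index of Newman (2006) for an undirected binary graph can be written as Q = (1/2m) Σ_ij [A_ij − k_i k_j / 2m] δ(c_i, c_j), where A is the adjacency matrix, k_i the degree of node i, m the number of links and c_i the cluster label of node i. The sketch below evaluates this quantity for a given partition; the function name and the toy graph are illustrative assumptions.

```python
import numpy as np

def modularity(adj, labels):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    k = adj.sum(axis=1)                           # node degrees
    two_m = k.sum()                               # twice the number of links
    same = labels[:, None] == labels[None, :]     # delta(c_i, c_j)
    b = adj - np.outer(k, k) / two_m              # observed minus expected links
    return (b * same).sum() / two_m

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the single link 2-3
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6), dtype=int)
for i, j in edges:
    adj[i, j] = adj[j, i] = 1

q_split = modularity(adj, [0, 0, 0, 1, 1, 1])    # the two triangles as communities
q_single = modularity(adj, [0, 0, 0, 0, 0, 0])   # everything in one cluster
```

Note that Q summarizes a partition in a single number and carries no information about how strongly particular pairs of clusters are linked, which is the limitation the paragraph above points to.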

A better understanding of this important mid-level structure in brain networks requires methods that can capture more informative representations of community structure. For this we turn to a family of expressive generative network models. Relational models are statistical generalizations of graph clustering that not only consider the within-cluster density but also take the specific relations between clusters into consideration. The Infinite Relational Model (IRM) (Kemp et al., 1999, Xu et al., 2006) is a non-parametric generalization of the stochastic block model (Nowicki and Snijders, 2001) for inference of such generalized group structure in complex networks. As the IRM representation considers both linking within and between groups, a highly inter-linked group of nodes could in fact be clustered in different groups if they link in different ways to other clusters, i.e., the IRM can infer more general group structures beyond the conventional community structure. An additional feature of the IRM type of model is that it conveniently allows for analysis of multi-graph networks, which for neuroimaging data could be graphs from multiple sessions or subjects. For multi-subject analysis one could look for a common node clustering structure over subjects but allow individual subject cluster linking densities (Mørup et al., 2010) or test the hypothesis that both clustering and link structure are shared between all subjects (Andersen et al., 2012b).

A constrained variant of the IRM representing the community structure of graphs in the sense of grouping highly connected node sets was proposed recently by Mørup and Schmidt (2012). The Bayesian Community Detection (BCD) scheme restricts the between-cluster link densities to be strictly lower than within-cluster link densities, thus constraining the more general IRM to conform with the notion of a community in a social network. Another constraint is introduced by the so-called Infinite Diagonal Model (IDM) (Mørup and Schmidt, 2012, Schmidt and Mørup, 2013). The IDM allows for differential within-cluster link densities but models only a single between-cluster density and as such the variability in the link densities between clusters is neglected when inferring the clustering structure. Since the between-cluster link density is shared across clusters, it can be thought of as a background-noise density.
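The difference between the three models can be illustrated through the matrix of link probabilities between clusters. The sketch below is only a schematic of the generative idea, not the inference procedure of the cited papers: all function names are hypothetical, the number of clusters is fixed rather than inferred, and the BCD check assumes that each between-cluster probability must lie below the smaller of the two within-cluster probabilities it connects, which is one reading of the restriction described above.

```python
import numpy as np

def sample_graph(z, eta, rng):
    """Sample an undirected binary graph; nodes i, j link with probability eta[z[i], z[j]]."""
    z = np.asarray(z)
    n = z.size
    p = eta[z[:, None], z[None, :]]                    # pairwise link probabilities
    upper = np.triu(rng.random((n, n)) < p, k=1)       # draw each pair once, no self-links
    return (upper | upper.T).astype(int)

def idm_eta(within, background):
    """IDM-like structure: free within-cluster probabilities, one shared between-cluster value."""
    within = np.asarray(within, dtype=float)
    eta = np.full((within.size, within.size), float(background))
    np.fill_diagonal(eta, within)
    return eta

def satisfies_bcd(eta):
    """Assumed BCD-like check: eta[l, m] < min(eta[l, l], eta[m, m]) for all l != m."""
    eta = np.asarray(eta, dtype=float)
    d = np.diag(eta)
    bound = np.minimum(d[:, None], d[None, :])
    off = ~np.eye(eta.shape[0], dtype=bool)
    return bool(np.all(eta[off] < bound[off]))

# IRM-like: no restriction, so two sparsely connected clusters may link densely to each other
eta_irm = np.array([[0.1, 0.8],
                    [0.8, 0.2]])
# IDM-like: a single background probability between all clusters
eta_idm = idm_eta(within=[0.7, 0.5], background=0.05)

rng = np.random.default_rng(1)
z = np.repeat([0, 1], 20)                              # 40 nodes in two clusters
g = sample_graph(z, eta_idm, rng)
```

In this picture an IRM-like matrix may or may not describe communities, a BCD-like matrix always does while still allowing different cluster pairs to link at different rates, and an IDM-like matrix collapses all between-cluster rates into one value.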

It should be noted that certain metrical properties can be expected when basing the graph on simple time series correlation, thereby assuming stationarity. If node A is highly correlated with node B, and B is highly correlated with C, then there is a lower limit on the correlation between nodes A and C, which can be inferred from the triangle inequality (Zalesky et al., 2012). This bound will support the formation of community structure, as in social relations: ‘Friends of friends are friends’. However, we also note that, because the correlations are thresholded, the impact of these geometrical constraints on the community structure is non-trivial.
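To make the lower limit concrete: one standard way to state it is that a valid 3 × 3 correlation matrix must be positive semidefinite, which gives r_AC ≥ r_AB r_BC − sqrt((1 − r_AB²)(1 − r_BC²)). This is offered as an illustration and is not necessarily the exact form used by Zalesky et al. (2012); the function names are hypothetical.

```python
import numpy as np

def corr_lower_bound(r_ab, r_bc):
    """Smallest r_AC compatible with given r_AB and r_BC in a valid correlation matrix."""
    return r_ab * r_bc - np.sqrt((1.0 - r_ab**2) * (1.0 - r_bc**2))

def min_eigenvalue(r_ab, r_bc, r_ac):
    """Smallest eigenvalue of the 3 x 3 correlation matrix; negative means the matrix is invalid."""
    c = np.array([[1.0, r_ab, r_ac],
                  [r_ab, 1.0, r_bc],
                  [r_ac, r_bc, 1.0]])
    return np.linalg.eigvalsh(c).min()

# With r_AB = r_BC = 0.9 the bound is 0.81 - 0.19 = 0.62
lb = corr_lower_bound(0.9, 0.9)
below = min_eigenvalue(0.9, 0.9, lb - 0.05)   # violates the bound
above = min_eigenvalue(0.9, 0.9, lb + 0.05)   # respects the bound
```

The bound weakens quickly as the two correlations decrease (for r_AB = r_BC = 0.5 it is already negative), which is one reason the effect on a thresholded graph is hard to predict.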

Spatial grouping of brain regions by similarity of BOLD time series as pursued in the present work can be seen as complementary to classical approaches to spatial grouping such as time series clustering (Goutte et al., 1999) and independent component analysis (ICA) (McKeown et al., 1998, McKeown et al., 2003). Compared with conventional clustering, the relational modeling approach has the advantage that clusters are formed by considering the connectivity patterns both within and between clusters, and furthermore relational models avoid the formation of a group prototype, hence allow for more flexible group structures to be found (Kemp et al., 1999). The use of ICA is based on assumptions of independence either in spatial or temporal dimensions, which can be questioned in the resting state as it has been observed that components are negatively correlated in time and have extensive overlaps in space (Fox et al., 2005).

In this study, we apply the above-mentioned community detection schemes to rs-fMRI data acquired in three cohorts of healthy volunteers and investigate to which degree functional brain networks as measured by rs-fMRI exhibit community structure. The three Bayesian relational methods, i.e. IRM, BCD, and IDM, for inference of group structure in complex networks differ only in the way the link probabilities between clusters are modeled. The rich link structures of the relational models can be seen as a way of inferring functional integration at the inter-community level as discussed in Hagmann et al. (2008) and Sporns (2013). We evaluate the performance of these models with respect to their ability to predict out-of-sample data (predictability) and the robustness of their clustering under re-sampling of data (reproducibility) using the NPAIRS split-half framework (Strother et al., 2002). The evaluation is carried out on three datasets from different sites and the models are evaluated both within and between sites for several thresholds of the correlation matrices. In addition, we compare the three models with three other methods for grouping nodes into clusters, namely Infomap, Louvain modularity, and hierarchical clustering. The work in this paper builds on work presented in Andersen et al. (2012b).
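Reproducibility of a clustering across data splits is commonly quantified with the normalized mutual information (NMI) between two label vectors, the measure also used for comparing restarts in the results snippet. The sketch below assumes the normalization 2 I(a; b) / (H(a) + H(b)); the normalization used in the paper is not stated in this excerpt, so that choice and the function name are assumptions.

```python
import numpy as np

def nmi(a, b):
    """Normalized mutual information between two clusterings of the same nodes."""
    a = np.asarray(a)
    b = np.asarray(b)
    _, ai = np.unique(a, return_inverse=True)     # relabel clusters as 0..K-1
    _, bi = np.unique(b, return_inverse=True)
    cont = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(cont, (ai, bi), 1)                  # contingency table of co-assignments
    pij = cont / cont.sum()
    pi = pij.sum(axis=1)
    pj = pij.sum(axis=0)
    nz = pij > 0                                  # skip empty cells to avoid log(0)
    mi = np.sum(pij[nz] * np.log(pij[nz] / np.outer(pi, pj)[nz]))
    h_a = -np.sum(pi * np.log(pi))
    h_b = -np.sum(pj * np.log(pj))
    if h_a + h_b == 0:                            # both clusterings have a single cluster
        return 1.0
    return 2.0 * mi / (h_a + h_b)

# Labels from two hypothetical split-halves of the data
same_up_to_relabeling = nmi([0, 0, 1, 1], [1, 1, 0, 0])
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

NMI is invariant to how clusters are numbered, so two runs that find the same grouping under different labels are scored as identical.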

Section snippets

Methods

For generality we investigate three rs-fMRI datasets: one acquired locally at the Danish Research Centre for Magnetic Resonance (Copenhagen) and two publicly available in the FCON1000 database (Biswal et al., 2010), viz. the ‘Beijing’ and the ‘Leipzig’ datasets.

Estimated clusters

We thresholded the graphs to maintain the top 8% correlations. The threshold corresponds to a mean (std) p-value across subjects of 4.75 × 10⁻⁵ (1.80 × 10⁻⁴). The reproducibility between solutions found with different restarts was measured as the NMI between samples with the highest value of the posterior distribution for each run. This was done within all three methods and between the methods and results are shown in Table 1 along with the number of clusters estimated by each of the methods. For

Discussion and conclusion

Our aim was to explore statistical models for finding structure in networks at the intermediate level. Accumulated evidence points to the importance of community structure in brain networks; hence, we tested three statistical link models, which differed in the restrictions imposed on how nodes are clustered. The IRM is a very flexible representation for graph clustering, in which nodes can be grouped together without having a high link density among them. The BCD is

Acknowledgment

This work is funded by a project grant from the Lundbeck Foundation to Hartwig Siebner (grant-nr R48 A4846). The Magnetom Trio MR scanner was donated by the Simon Spies Foundation.

References (52)

  • J.D. Power et al.

    Methods to detect, characterize, and remove motion artifact in resting state fMRI

    NeuroImage

    (2014)
  • M. Rubinov et al.

    Complex network measures of brain connectivity: uses and interpretations

    NeuroImage

    (2010)
  • A.M. Smith et al.

    Investigation of low frequency drift in fMRI signal

    NeuroImage

    (1999)
  • O. Sporns

    Network attributes for segregation and integration in the human brain

    Curr. Opin. Neurobiol.

    (2013)
  • S.C. Strother et al.

    The quantitative evaluation of functional neuroimaging experiments: the NPAIRS data analysis framework

    NeuroImage

    (2002)
  • N. Tzourio-Mazoyer et al.

    Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain

    NeuroImage

    (2002)
  • M.P. van den Heuvel et al.

    Small-world and scale-free organization of voxel-based resting-state functional connectivity in the human brain

    NeuroImage

    (2008)
  • G. Varoquaux et al.

    Learning and comparing functional connectomes across subjects

    NeuroImage

    (2013)
  • A. Zalesky et al.

    On the use of correlation as a measure of network connectivity

    NeuroImage

    (2012)
  • D.J. Aldous

    Exchangeability and related topics

    (1985)
  • A.F. Alexander-Bloch et al.

    Disrupted modularity and local connectivity of brain functional networks in childhood-onset schizophrenia

    Front. Syst. Neurosci.

    (2010)
  • K.W. Andersen et al.

    Identification of functional clusters in the striatum using infinite relational modeling

  • K.W. Andersen et al.

    Identifying modular relations in complex brain networks

  • A.L. Barabási

    Linked: How Everything is Connected to Everything Else and What It Means for Business, Science, and Everyday Life

    (2003)
  • D.S. Bassett et al.

    Dynamic reconfiguration of human brain networks during learning

    Proc. Natl. Acad. Sci. U. S. A.

    (2011)
  • B.B. Biswal et al.

    Toward discovery science of human brain function

    Proc. Natl. Acad. Sci. U. S. A.

    (2010)