Keywords

1 Introduction

Brain evolution has been an intriguing research topic for centuries. A comparative structural connection study among primate brains may help in our understanding of the structural substrates underlying the development of higher cognitive functions [1]. Recent research indicates that the organization of white matter (WM) bundles has been preserved between macaques and humans while structural difference has also been identified [2, 3]. However, these studies mainly focus on one specific fasciculus, e.g. arcuate fasciculus, or one specific brain region, e.g. dorsal prefrontal lobe [1, 3]. The similarity and difference between the two species in terms of global connective patterns are still largely unknown. The lack of such knowledge partly issues from the methodology used to analyze the connective anatomy.

Diffusion MRI (dMRI) and tractography approaches have given us the opportunity to study the whole brain large-scale connectome in primate brains in vivo [1]. Recent comparative dMRI studies suggest that it is a powerful approach in revealing interesting structural connectivity patterns of brain evolution at a global scale. Taking the advantages of this imaging technique, we proposed a novel framework in this paper to identify the globally preserved structural connectome between human and macaque and the one exclusive to one species in a data-driven manner.

Generally, the comparative study is performing a common brain parcellation scheme, such as Brodmann areas in this paper used as a test bed. Currently, only common Brodmann brain sites between the species are used to create the global connectome. DMRI derived connective strength is used as a matching criterion. To study the similarity and difference between the species, an improved sparse CCA algorithm [5] was applied to those connective matrices to produce optimized weight vectors for the connective features so that components strongly correlated between the two species are yield. Sparse CCA was adopted to avoid the limit of conventional CCA, as the feature dimension is far larger than subject numbers and not all of them are of interest and importance. By analyzing the weights, we identified the preserved connections and the corresponding white matter fibers. Those exclusive to a specific species was also analyzed. The effectiveness of the framework has been evaluated by cross-validation. The identified connections and fibers are consistent with the reports in the literatures, demonstrating the effectiveness and promise of this framework.

2 Materials and Methods

Generally, as illustrated in Fig. 1, we used T1-weighted MRI and dMRI data to construct structural connectivity matrices for each species. Then, each matrix was stretched to a feature vector. Those feature vectors for a species compose a feature matrix. Next, an improved SCCA algorithm [5] was adopted. Currently, only the canonical components with the strongest correlation between the two feature matrices were considered by this SCCA algorithm [5]. Consequently, two weight vectors u and v were yield for the element of the connectivity matrices, and they can be restored to the matrix format, U and V. By jointly analyzing U and V, we determined the strongly correlated connectivities conserved between the two species, and the corresponding dMRI derived fibers were extracted and were suggested to be the preserved fibers. Connectivities and fibers exclusive to a specific species were also analyzed.

Fig. 1.
figure 1

The flowchart of the framework.

2.1 Data and Preprocessing

Human Brain Imaging.

Ten randomly selected human brains from the Q1 release of WU-Minn Human Connectome Project (HCP) consortium [6] were used in this study. The T1-weighted structural MRI had voxels with 0.7 mm isotropic, three-dimensional acquisition, T1 = 1000 ms, TR = 2400 ms, TE = 2.14 ms, flip angle = 8°, image matrix = 260 × 311 × 260. DMRI was acquired in following parameters: spin-echo EPI sequence; TR = 5520 ms; TE = 89.5 ms; flip angle = 78°; refocusing flip angle = 160°; FOV = 210 × 180; matrix = 168 × 144; spatial resolution = 1.25 mm × 1.25 mm × 1.25 mm; echo spacing = 0.78 ms. Particularly, a full dMRI session includes 6 runs, representing 3 different gradient tables, with each table acquired once with right-to-left and left-to-right phase encoding polarities, respectively. Each gradient table includes approximately 90 diffusion weighting directions plus 6 b = 0 acquisitions interspersed throughout each run. Diffusion weighted data consisted of 3 shells of b = 1000, 2000, and 3000 s/mm2 interspersed with an approximately equal number of acquisitions on each shell within each run. Ten randomly selected subjects were used in this study.

Macaque Brain Imaging.

UNC-Wisconsin neurodevelopment rhesus MRI database (http://www.nitrc.org/projects/uncuw_macdevmri/) was used in this work, consisting of T1 weighted MRI and dMRI data. This is a longitudinal database and we only used the scans of 10 different subjects when they are more than 18 months old. The released T1 weighted MRI data has been registered to the UNC Primate Brain Atlas space [4]. The resolution of this space is 0.27 × 0.27 × 0.27 mm3 and a matrix of 300 × 350 × 250. The basic parameters for diffusion data acquisition were: resolution of 0.65 × 0.65 × 1.3 mm3, a matrix of 256 × 256 × 58, diffusion-weighting gradients applied in 120 directions and b value of 1000 s/mm2. Ten images without diffusion weighting (b = 0 s/mm2) were also acquired.

Preprocessing.

Preprocessing steps on T1 weighted MRI included brain skull removal and tissue segmentation via FSL [7]. T1 weighted MRI data was nonlinearly warped to b0 map of dMRI data via FSL-fnirt [8], before cortical surface reconstruction was performed to reconstruct inner cortical surface of white matter (WM) [9]. For the dMRI data, skull-strip and eddy currents were applied firstly, then BedpostX in FSL 5 (http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FDT/UserGuide#BEDPOSTX) was adopted to estimate the axonal orientations for each voxel. In this paper, we used two axonal orientations, because it was suggested that b-values upwards of 4000 would be required to resolve a 3 fiber orthogonal system robustly [10]. For the sake of convenience in visualization, DSIstudio [11] was used to reconstructed deterministic fibers from BedpostX derived axon orientations. 5 × 104 Fiber tracts were reconstructed for each subject. FA and angular threshold are 0.1 and 60°. Small FA value for primate brain was suggested in [15].

Structural Connectivity Matrix Construction.

The structural connectivity matrices were constructed from dMRI derived fibers and white matter surfaces with a parcellation scheme. Currently, we used Brodmann areas parcellation scheme as a test bed to develop and evaluate our framework. All macaque white matter surfaces were warped to the ‘F99’ macaque atlas space [13] via spherical registration method [12]. The Brodmann parcellation in the atlas space was mapped back to the surface of each individual. The Brodmann areas in the ‘Conte69’ human atlas [14] were mapped back to each human subject’s surface using the same approach. Currently, ipsilateral structural connectivities were considered, and we used Ms and Hs to denote the connective matrices of macaque and human, respectively. Currently, 28 Brodmann areas shared by the two atlases and robustly warped to individuals were used as nodes for the matrices. So the connective matrices Ms and Hs are in the same size. For each individual, the element of the matrix, such as \( m_{i,j} \) in M, was defined as the connective strength between Brodmann area i and j. That is the number of fiber tracts connecting area i and j, divided by the total fiber numbers.

2.2 Sparse Canonical Correlation Analysis (SCCA)

The aim is to identify the common and different connections in a group-wise and global manner. We used M or H as a connective feature for each subject. Because M and H are symmetrical matrices, the upper triangle parts were extracted and converted to feature vectors x and y, x is associate with M and y is associate with H. The problem was selection of features such that the similarity between the two groups, \( X = \{ x^{1} ; \cdots x^{n} \} \) and \( Y = \{ y^{1} ; \cdots y^{n} \} \) reached the maximum. To this end, we adopted a sparse CCA algorithm [5], because the conventional CCA without sparsity penalty yields too many non-zero values, making the connective matrices noisy. Also, small sample size results in all correlation values nearly equal to 1. In the adopted SCCA [5], a more robust penalty (absolute value based GraphNet, AGN) than the existing methods [16] is added. The goal is to find a pair of weights, denotes as u and v, where the correlation coefficient between uX and Yv is the highest. u and v are the canonical loadings of X and Y, respectively. The AGN-SCCA model is formally defined as

$$ \mathop {\hbox{min} }\limits_{u, v} - u^{T} X^{T} Yv $$
(1)
$$ s.t.\;\left\| {Xu} \right\|_{2}^{2} \le 1,\;\left\| {Yv } \right\|_{2}^{2} \le 1,\;\left\| u \right\|_{AGN} \le c_{1} ,\;\left\| v \right\| _{AGN} \le c_{2} $$
(2)

where \( \left\| u \right\|_{AGN} \le c_{1} \) and \( \left\| v \right\|_{AGN} \le c_{2} \) are defined as follows,

$$ \begin{array}{*{20}c} {\left\| u \right\|_{AGN} = \lambda_{1} \left| u \right|^{T} L_{1} \left| u \right| + \beta_{1} \left\| u \right\|_{1} } \\ {\left\| v \right\|_{AGN} = \lambda_{2} \left| v \right|^{T} L_{2} \left| v \right| + \beta_{2} \left\| v \right\|_{1} } \\ \end{array} $$
(3)

The L 1 and L 2 are the corresponding Laplacian matrices of the correlation matrices of X and Y separately. The AGN-SCCA method cannot only discovers a high relationship between X and Y, but also recovers the structure information from X and Y respectively. That is, it can find out the correlated features in X. The lasso terms in both penalties assure the sparsity.

Generally, we solve AGN-SCCA problem by two alternative iterations procedure which is briefly described as follows (Please see [5] for details).

Algorithm 1.

Compute the first pair of canonical loadings of X and Y.

  1. 1.

    Initialize u and v;

  2. 2.

    Iterate until convergence:

    1. (a)

      \( u \leftarrow {\text{argmin}}_{u} u^{T} X^{T} Yv \), \( s.t. \;\left\| u \right\|_{AGN} \le c_{1} \) and \( \left\| {Xu} \right\|_{2}^{2} \le 1 \);

    2. (b)

      \( v \leftarrow {\text{argmin}}_{v} u^{T} X^{T} Yv \), \( s.t. \;\left\| v \right\|_{AGN} \le c_{2} \) and \( \left\| {Yv} \right\|_{2}^{2} \le 1 \).

  3. 3.

    Report u and v.

Indeed, the weight vectors u and v have the same size of the stretched X and Y. They can be transformed backwards to the matrix form, denoted by U and V (only the upper triangle part can be retrieved. The lower part is just a symmetrical copy of the upper one). The values in the two weight matrices suggest the contribution of the corresponding structural connectivities to maximizing the correlation (similarity) of the global connective patterns between two species. Positive \( u_{ij} \) and \( v_{ij} \) indicate that those connectivities in human and macaque are positively correlated, which further suggesting that those connectivities might be those conserved between species.

3 Results

3.1 Cross-Validation

We used 10 human and macaque subjects in this study. Because only ipsilateral connection is considered in this work and we assume that contralateral Brodmann areas with same label have the same function, we have n = 20 samples for each species. A five-fold cross validation scheme was adopted to evaluate if the framework yield consistent U and V and effectiveness of AGN-SCCA algorithm to produce highly correlation coefficient. Specifically, 16 pairs of human and macaque samples were used as ‘training’ samples to tune the parameters in AGN-SCCA algorithm till the obtained weight matrices U and V applied to the remaining 4 ‘testing’ sample pairs yield the highest correlation coefficient. In Fig. 2, we show the five optimized U and V pairs. The consistency demonstrates the weight matrix yield by the framework is robust to variance introduced by subjects. The averaged correlation coefficient values on the five-fold test are 0.95 ± 0.01 (illustrated by the scatter chart in Fig. 3(a) on the five-fold cross-validation) compared to the average intra-species values for human and macaque, 0.995 and 0.989, demonstrating the effectiveness of the algorithm and suggesting common global connective patterns between human and macaque do exist.

Fig. 2.
figure 2

The five weight matrices Us and Vs yield by five-fold cross-validation. The standard errors of the five Us and Vs are shown on the most right side.

Fig. 3.
figure 3

(a) The correlation of 1st canonical component between human and macaque. The scatter chart in the top-right corner is the positive correlation between human subjects and macaque subjects in the transformed space. The results of the five-fold cross-validation are shown in different colors. The weight vectors have been transformed back to matrices the averaged weight matrices of the five-fold cross-validation are shown beside the axes. Only the most positive weights \( u_{ij} \)s and \( v_{ij} \)s (above the standard deviation) are shown; (b) and (c) show the fibers of a human subject and a macaque subject corresponding to the positive weights in (a).

3.2 DTI Tracts Comparison Between Human and Macaque

The averaged \( \bar{U} \) and \( \bar{V} \) obtained from the five-fold cross-validation results in Fig. 2 were used to analysis the global connective patterns between human and macaque. In Fig. 3(a), we show the \( u_{ij} \)s and \( v_{ij} \)s whose positive value is greater than the standard deviation (0.04 for \( \bar{U} \) and 0.1 for \( \bar{V} \)). Globally, the structural connectivities corresponding to the positive \( u_{ij} \)s and \( v_{ij} \)s are the strongest contributors to produce the highly correlated canonical component between two species. Therefore, those connectivities are suggested to be the species preserved ones. Figure 3(b) and (c) show those preserved fibers on a human subject and a macaque subject. It is also noticed in Fig. 3(a) that negative \( u_{ij} \)s for human connectivities can be found, suggesting that those human connectivities together have an opposite trend with the combination of most macaque connectivities. This result may suggest that those connectivities are exclusive to human. Because the connectivity matrices of the two species share the same parcellation scheme, we can therefore further analysis the specific connectivities by overlapping the two positive weight matrices (no threshold applied). It is observed in Fig. 4(a) that the overlapped connectivities matrix exhibits three clusters including two clusters on the diagonal line and one off line cluster. Cluster #1 is located on somatosensory and motor cortices (BA1–7). The other diagonal one (Cluster #2) resides on visual cortices (BA17–19) and temporal lobe (BA20–22). The off line cluster (Cluster #3) consists of the Fronto-occipital stream (BA9, 10 to BA17–19) and Fronto-temporal stream (BA9, 10 to BA20–22). The fibers of the three clusters on a human subject and a macaque subject are shown in Fig. 4(b–d). Those structural connectivities have been reported to be preserved in human and macaque in many available works [1, 3]. On the other hand, connectivity differences can be derived by overlapping the negative and positive matrix in Fig. 3(a). Because only one negative matrix was produced for human subject, it can be directly used to identify the connectivity difference, e.g., the white arrow highlighted connectivities (Fig. 4(e)) linking Inferior front lobe to temporal lobe (see Fig. 4(f) for the corresponding fibers). This absent frontal projection to the middle and inferior temporal gyrus in macaque has been validated by literature reports [3].

Fig. 4.
figure 4

(a) The overlapping of the two species’ positive weight matrices; (b)–(d) the fibers derived from the 3 clusters in (a); (e) the connectivity difference matrix between the two species; (f) the fibers on a human subject derived from the arrow-highlighted connectivities in (e).

4 Conclusion

In this work, we used dMRI to estimate the whole brain large-scale white matter pathways and Brodmann areas as a test bed to construct a global connectome for human and macaque, on which AGN-SCCA algorithm was adopted to yield the weights associated with the connectivity to produce the component strongly correlated between the two species. By analyzing the weights we identified the preserved white matter pathways and those exclusive to a specific species. The results are consistent with the reports in the literatures, demonstrating the effectiveness and promise of this framework.