1 Introduction

Patients with pharmacoresistant epilepsy can be treated surgically, but first it is necessary to identify with high precision the anatomical region of the cerebral cortex where the anomaly that triggers the epileptic seizures is located. Among the most common abnormalities are Focal Cortical Dysplasias (FCDs). FCDs disrupt the electrical functioning of the brain and can be caused by genetic factors, lack of oxygen during brain development, and parasites, among others [1]. Currently, these abnormalities are located with invasive electrophysiological tests, which cause discomfort in patients. It is therefore necessary to develop non-invasive FCD detection tools.

Recently, machine learning methods that recognize morphological/intensity patterns associated with dysplasias in magnetic resonance images (MRIs) [2,3,4] have been proposed to support the non-invasive diagnosis of patients with pharmacoresistant epilepsy. However, the automatic detection of FCDs by machine learning algorithms is hindered by the class imbalance problem in the input data: there are substantially fewer lesional vertices (points on the brain surface) than healthy ones. The fundamental issue is that imbalanced data can significantly compromise the performance of most standard learning algorithms, which assume or expect balanced class distributions or equal misclassification costs. Therefore, when such algorithms are presented with complex imbalanced data sets, the resulting classifiers can be biased towards the majority class [5].

To address this issue, a random bagging approach is used in [3]: a set of “base-level” classifiers is constructed, each trained using logistic regression fitted by an iteratively reweighted least squares (IRLS) algorithm, identifying 14 out of 24 FCD lesions (58%). Nevertheless, this methodology is computationally expensive because the classifier is trained many times [5]. Hong et al. [4] propose to randomly select non-lesional vertices so that both classes contain the same number of samples, and then apply a linear discriminant analysis, which provides a detection rate of 74% in adult cohorts. In contrast, Adler et al. [2] do not consider the class imbalance problem; they implement a neural network using novel surface-based features and obtain a sensitivity of 73%.

In this paper we propose a novel methodology for the automatic detection of FCDs. In our experiments we use a public dataset that contains cerebral surface morphological data of 22 subjects with histologically confirmed FCD. First, we use a cluster-based under-sampling method to select the most representative samples. These samples are then used to train several neural networks, which are combined into a majority vote classifier that achieves both high sensitivity (95.11%) and high specificity (93.2%).

The remainder of the paper is organized as follows: a brief description of several morphological and intensity features of the brain cortex (Sect. 2.1), a description of the cluster-based under-sampling approach (Sect. 2.2) and of the bagging approach (Sect. 2.3), the experimental setup (Sect. 3), results and discussion (Sect. 4) and, finally, conclusions (Sect. 5).

2 Methods

2.1 MRI Feature Estimation for Automatic Detection of Dysplasias

Let \(\mathbf {V} \in \mathbb {R}^{W\times H \times L}\) be an MRI volume from which models of the GM-WM (gray matter - white matter) and GM-CSF (gray matter - cerebrospinal fluid) surfaces are generated using the algorithm proposed in [6]. Once these surfaces are reconstructed, several cortical features can be measured at each vertex of the 3D cortical reconstruction, such as cortical thickness, gray-white matter intensity contrast, curvature, sulcal depth, and intensity. Cortical thickness is calculated as the mean minimum distance between each vertex on the pial and white matter surfaces. Gray-white matter intensity contrast is calculated as the ratio of the gray matter signal intensity to the white matter signal intensity. The gray matter signal intensity is sampled at a distance of 30% of the cortical thickness above the gray-white matter boundary. The white matter signal intensity is sampled 1 mm below the gray-white matter boundary. FLAIR intensity is sampled at the gray-white matter boundary, at 25%, 50% and 75% depths of the cortical thickness, and at 0.5 mm and 1 mm below the gray-white matter boundary. Mean curvature is measured at the gray-white matter boundary as \(\frac{1}{r}\), where r is the radius of an inscribed circle; this value is equal to the mean of the principal curvatures \(k_1\) and \(k_2\) [7]. Other novel features, which are insensitive to motion artifacts, have recently been introduced in [2] to identify local changes in thickness/intensity on MRI surface reconstructions.

Finally, we obtain an array \(\mathbf {X} \in \mathbb {R}^{n\times p}\), where n is the number of samples (vertices) in the dataset and p is the number of features. Each vertex is represented as \(\mathbf {x}_i \in \mathbb {R}^p\) and is labeled as a healthy vertex (0) or a lesional vertex (1), which yields a target vector \(\mathbf {t}= \left\{ t_{i} \in \{0,1\} :\, i=1,\ldots ,n\right\} \).

2.2 Relevant Sampling for Imbalance Problem in Dysplasias Recognition

Many datasets in real applications have an imbalanced class distribution. This is the case in the detection of focal cortical dysplasias, since there are substantially fewer vertices labeled as lesional than vertices labeled as non-lesional.

In supervised classification, we are given an observed training set \(\mathbf {X} \in \mathbb {R}^{n\times p}\) over which a predictive model \(\mathcal {C}\) is to be induced. In imbalanced data distributions, the majority class \(\mathbf {X}_{maj}\) outnumbers the minority class \(\mathbf {X}_{min}\). Let \(|\mathbf {X}_{min}|\) denote the number of samples in the minority class and \(|\mathbf {X}_{maj}|\) the number of samples in the majority class, where \(|\mathbf {X}_{min}|<|\mathbf {X}_{maj}|\). Under-sampling methods for imbalanced learning modify the imbalanced dataset by some mechanism in order to provide a balanced distribution [5], which means that \(|\mathbf {X}_{min}| = |\mathbf {X^*}_{maj}|\), where \(\mathbf {X^*}_{maj} \) is the under-sampled majority class set. Hence, it is important to select suitable training data for classification in the imbalanced class distribution problem, because the training data significantly influence the classification accuracy [8].

We address this issue by using a Cluster-Based Under-Sampling technique (CBUS) to select the most representative data to train the classifier [8]. To find the most relevant samples, the first step is to partition the dataset into several clusters by using the K-means algorithm. The main objective is to find an assignment of data points to clusters, as well as a set of means \(\left\{ \pmb {\mu }_k\in \mathbb {R}^{p}:\, k=1,\ldots ,K\right\} \), such that the sum of the squared Euclidean distances of each data point to its closest vector \(\pmb {\mu }_{k}\) is a minimum. This can be achieved by minimizing an objective function given by

$$\begin{aligned} J=\sum \limits _{i=1}^n \sum \limits _{k=1}^K r_{ik}||\mathbf {x}_i - \pmb {\mu }_k||^2, \end{aligned}$$
(1)

where

$$\begin{aligned} r_{ik}=\left\{ \begin{array}{ll} 1 & \text {if } k=\underset{j}{\mathrm {argmin}}\,||\mathbf {x}_i - \pmb {\mu }_j||^2\\ 0 & \text {otherwise} \end{array} \right. \end{aligned}$$
(2)
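As an illustration of Eqs. 1 and 2, the following minimal NumPy sketch computes the hard assignments and the objective J, and runs a plain Lloyd-style iteration. The function names, the random initialization, and the fixed number of iterations are assumptions made for this example; they are not taken from the original implementation.

```python
import numpy as np

def assign_clusters(X, mu):
    """Index of the closest mean for every row of X (the assignment of Eq. 2)."""
    # Squared Euclidean distances between all points and all means, shape (n, K).
    d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2

def kmeans_objective(X, mu):
    """Sum of squared distances of each point to its closest mean (Eq. 1)."""
    labels, d2 = assign_clusters(X, mu)
    return d2[np.arange(len(X)), labels].sum()

def kmeans(X, K, n_iter=50, seed=0):
    """Plain Lloyd iteration; initialization and stopping rule are assumptions."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)].astype(float)
    for _ in range(n_iter):
        labels, _ = assign_clusters(X, mu)
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:  # keep the previous mean if a cluster is empty
                mu[k] = members.mean(axis=0)
    labels, _ = assign_clusters(X, mu)
    return labels, mu

# Toy data: two well-separated groups of two points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
mu = np.array([[0.0, 0.5], [10.0, 10.5]])
labels, _ = assign_clusters(X, mu)
J = kmeans_objective(X, mu)  # every point lies 0.5 from its mean: J = 4 * 0.25
```

In this toy example each of the four points lies at distance 0.5 from its cluster mean, so the objective evaluates to 1.0.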

Once the data have been clustered, a suitable number of majority class samples is randomly selected from the \(k\text {th}\) cluster, by considering the ratio of the number of majority class samples to the number of minority class samples in each cluster as follows:

$$\begin{aligned} |\mathbf {X^*}_{maj}^{(k)}|=(r\times |\mathbf {X}_{min}|)\times \frac{|\mathbf {X}_{maj}^{(k)}|/|\mathbf {X}_{min}^{(k)}|}{\sum \limits _{j=1}^{K}|\mathbf {X}_{maj}^{(j)}|/|\mathbf {X}_{min}^{(j)}|}, \end{aligned}$$
(3)

where r is the expected ratio of \(|\mathbf {X^*}_{maj}|\) to \(|\mathbf {X}_{min}|\) (generally set to 1), building an under-sampled matrix \(\mathbf {X^*}= \bigcup \limits _{k=1}^{K}\mathbf {X^*}_{maj}^{(k)}\cup \mathbf {X}_{min}\).
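The allocation rule of Eq. 3 can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, and the handling of clusters without minority samples, the rounding to integers, and the cap at the number of available majority samples are assumptions that the text does not specify.

```python
import numpy as np

def cbus_allocation(n_maj, n_min, r=1.0):
    """Number of majority samples to draw from each cluster (Eq. 3).

    n_maj[k] and n_min[k] are the majority / minority counts in cluster k.
    """
    n_maj = np.asarray(n_maj, dtype=float)
    n_min = np.asarray(n_min, dtype=float)
    # Assumption: a cluster without minority samples is treated as if it
    # had one, to avoid a division by zero.
    ratio = n_maj / np.maximum(n_min, 1.0)
    share = ratio / ratio.sum()
    target = r * n_min.sum() * share
    # Assumption: round to integers and never request more than available.
    return np.minimum(np.rint(target), n_maj).astype(int)

def cbus_indices(cluster, t, r=1.0, seed=0):
    """Indices of all minority samples plus the randomly selected majority ones."""
    rng = np.random.default_rng(seed)
    K = int(cluster.max()) + 1
    n_maj = [np.sum((cluster == k) & (t == 0)) for k in range(K)]
    n_min = [np.sum((cluster == k) & (t == 1)) for k in range(K)]
    sizes = cbus_allocation(n_maj, n_min, r)
    keep = [np.flatnonzero(t == 1)]
    for k in range(K):
        pool = np.flatnonzero((cluster == k) & (t == 0))
        keep.append(rng.choice(pool, size=sizes[k], replace=False))
    return np.concatenate(keep)

# Three clusters with 600, 300 and 100 majority samples and 10 minority each.
sizes_r1 = cbus_allocation([600, 300, 100], [10, 10, 10], r=1)
sizes_r5 = cbus_allocation([600, 300, 100], [10, 10, 10], r=5)
```

In this example the majority-to-minority ratios are 60, 30 and 10, so with r = 1 the 30 requested majority samples are split as 18, 9 and 3, and with r = 5 as 90, 45 and 15.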

2.3 Bagging Approach

To enhance the performance of classifiers in the imbalanced data distribution problem, a technique known as bagging is often used. Bagging methods construct a set of N classifiers (with N odd) \(\mathcal {C}_1, \mathcal {C}_2,\ldots , \mathcal {C}_N\), each trained on all the minority class instances and on one of N equal-sized random samples of majority class instances. When a new instance \(\mathbf {x}_{new}\) is to be classified, each trained classifier makes a prediction, and the final prediction \(\hat{t}\) is taken as the majority vote [9].
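The majority vote can be sketched in a few lines. The function name and the array layout (one row of 0/1 predictions per classifier) are assumptions made for this example.

```python
import numpy as np

def majority_vote(predictions):
    """Combine 0/1 predictions of N classifiers, given as an (N, n) array."""
    predictions = np.asarray(predictions)
    N = predictions.shape[0]
    if N % 2 == 0:
        raise ValueError("use an odd number of classifiers to avoid ties")
    # A sample is labeled 1 when more than half of the classifiers vote 1.
    return (predictions.sum(axis=0) > N // 2).astype(int)

# Three classifiers voting on three samples.
votes = np.array([[1, 0, 1],
                  [1, 0, 0],
                  [0, 0, 1]])
t_hat = majority_vote(votes)
```

Here the first and third samples receive two votes out of three for the lesional class and are labeled 1, while the second sample is labeled 0.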

Here we propose a novel strategy to implement a bagging approach that trains N models using the most relevant majority class samples. First, the database is cluster-based under-sampled according to Eq. 3, with the expected imbalance ratio r set to the number N of models to train. The selected majority class instances thus form r equal-sized sets, each containing the most representative majority class instances, and N classifiers are then trained. This new approach is summarized in Fig. 1.
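One possible way to build the N training sets from the under-sampled data is sketched below. The text does not state how the selected majority samples are distributed among the N models, so the random disjoint split used here, as well as the function and argument names, are assumptions.

```python
import numpy as np

def bagging_subsets(maj_idx, min_idx, N=5, seed=0):
    """Split the selected majority indices into N parts and pair each part
    with all minority indices (assumed random, disjoint split)."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(maj_idx)
    parts = np.array_split(shuffled, N)
    return [np.concatenate([min_idx, part]) for part in parts]

# 10 minority samples and 5 x 10 majority samples selected with r = N = 5.
min_idx = np.arange(10)
maj_idx = np.arange(100, 150)
subsets = bagging_subsets(maj_idx, min_idx, N=5)
```

Each of the five resulting index sets contains the 10 minority samples and 10 distinct majority samples, so every model would be trained on a balanced subset.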

Fig. 1. Novel strategy to address the class imbalance, oriented to automatic detection of FCDs.

3 Experimental Setup

Our purpose is to assess the predictive performance of the CBUS technique combined with an ensemble method known as bagging, as an alternative to address the class imbalance for the automatic recognition of FCDs.

The proposed method is tested on a public FCD dataset used in [2]. Due to bioethical issues the original MRIs are not shared publicly; however, a matrix containing cerebral surface morphological data of 22 patients with FCD is published. It is represented as \(\mathbf {\Psi }=\left\{ \mathbf {X}^{(m)}\in \mathbb {R}^{n^{(m)}\times p}: m=1,\ldots ,22\right\} \), where the number of features is \(p=28\). The number of vertices in each brain surface reconstruction varies from patient to patient, and the total number of vertices (samples) is 3307529.

FreeSurfer software v5.3 [6] was used to generate cortical reconstructions and to coregister FLAIR scans to T1-weighted images. It was then employed to compute several morphological/intensity features, as reported in Sect. 2.1. Manual lesion masks were created for the 22 participants on axial slices of the volumetric scan and were then registered onto the cortical surface reconstructions (see Fig. 2) in order to assign a label to each sample.

Fig. 2. Lesions are identified by combining information from T1 images, previous radiological reports and reports from multi-disciplinary team meetings; the lesion masks are then created on (a) the MRI and (b) the cortical surface reconstruction. (Images taken from [2]).

The features included in the dataset are cortical thickness, gray-white matter intensity contrast, sulcal depth, mean curvature, FLAIR intensity samples at six cortical depths, local cortical deformation, “dough-nut” thickness, “dough-nut” intensity contrast, and “dough-nut” FLAIR intensity at different cortical depths. In addition, the dataset includes a set of normalized inter-hemispheric asymmetry measures for cortical thickness, gray-white matter intensity contrast, the FLAIR intensity samples and local cortical deformation. This dataset exhibits a significant imbalance on the order of 42:1.

With regard to class imbalance, we propose two techniques: CBUS, and a combination of CBUS and bagging as described in Sect. 2.3. To implement CBUS, the data are clustered into \(K=4\) groups. When CBUS is implemented alone, the expected imbalance ratio is set to 1 (\(r=1\)), which balances the classes at a ratio of 1:1. CBUS is applied to each patient, and the under-sampled data are concatenated, resulting in a sub-sampled input matrix \(\mathbf {X^*}\) with \(n=151340\) samples and \(p=28\) features, and a target vector \( \mathbf {t} = \left\{ t_{i} \in \{0,1\} : i =1,\ldots ,151340\right\} \), both containing the same number of majority and minority class samples. To combine CBUS and bagging, five classifiers (\(r=N=5\)) are trained on all the lesional vertices and on five equal-sized samples of the most relevant non-lesional vertices.

As a benchmark, we compare our results with the strategies for handling the class imbalance in the automatic detection of FCDs proposed in [2,3,4]: without under-sampling (WUS), random under-sampling (RUS), and a bagging approach on five random samples of the majority class, each of the same size as the minority class. For all schemes, Neural Network (NN) classifiers are trained using surface-based measures from the samples (vertices) selected by each method. A single hidden layer neural network is chosen as the classifier because it can be rapidly trained on large datasets [2]. The number of hidden neurons in the network is determined through a principal component analysis applied to the input features, using the number of components that explain over 99% of the variance. In our case, 11 principal components are required. We employ a 10-fold cross-validation analysis to test the performance of our methods. The G-mean, sensitivity and specificity of the classifier outputs are computed because these measures give a class-by-class performance estimate. ROC curves are also employed to assess the performance of each approach. Finally, the overall results of each method are compared.
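The evaluation measures and the rule for choosing the number of hidden neurons can be sketched as follows. This is a minimal sketch under stated assumptions: the G-mean is taken to be the geometric mean of sensitivity and specificity (its usual definition, which the text does not spell out), the principal component analysis is computed on mean-centered features without further scaling, and all function names are hypothetical.

```python
import numpy as np

def class_metrics(t_true, t_pred):
    """Sensitivity, specificity and G-mean for 0/1 labels (1 = lesional)."""
    t_true = np.asarray(t_true)
    t_pred = np.asarray(t_pred)
    tp = np.sum((t_true == 1) & (t_pred == 1))
    fn = np.sum((t_true == 1) & (t_pred == 0))
    tn = np.sum((t_true == 0) & (t_pred == 0))
    fp = np.sum((t_true == 0) & (t_pred == 1))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Assumed definition: geometric mean of the two class-wise rates.
    g_mean = np.sqrt(sensitivity * specificity)
    return sensitivity, specificity, g_mean

def n_components_for_variance(X, threshold=0.99):
    """Smallest number of principal components whose cumulative explained
    variance reaches the threshold (assumption: centering only, no scaling)."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(explained, threshold) + 1)

# Toy labels: 3 of 4 lesional and 5 of 6 healthy vertices are correct.
sens, spec, gm = class_metrics([1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
                               [1, 1, 1, 0, 0, 0, 0, 0, 0, 1])

# G-mean implied by the sensitivity and specificity reported in this paper.
reported_g_mean = float(np.sqrt(0.9511 * 0.932))
```

Under this definition, the sensitivity (95.11%) and specificity (93.2%) reported for the proposed method give a G-mean of approximately 94.15%, which agrees with the reported value.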

4 Results and Discussion

Table 1 shows the results obtained for the studied methods: WUS, RUS, random bagging, CBUS, and the combination of CBUS and bagging. Without under-sampling, the classifier achieves a high specificity (99.7%), but the sensitivity is low (49.8%). This happens because the classifier is biased towards the healthy class due to the high class imbalance. When the input data are randomly under-sampled, the sensitivity improves but the specificity decreases. However, the G-mean value, which combines sensitivity and specificity information, increases significantly. The bagging approach improves the G-mean value with respect to the two previous methods. For CBUS this figure is even higher, because this technique selects the most relevant samples from the clusters where the data are concentrated. The combination of CBUS and bagging achieves a further improvement in the G-mean and sensitivity results. Figure 3 presents the ROC curves for the five methods. The ROC curves show that our approach performs better than the remaining methods.

Table 1. Methods results comparison
Fig. 3. ROC curves obtained for the different methods studied.

The study developed in [2] does not consider the class imbalance. Instead, the authors post-process the output probability maps from the classifier by grouping them into clusters of neighbor-connected vertices and discarding the smallest clusters. Their automated lesion detection method is considered successful if the resulting cluster overlaps the lesion mask. Under this criterion, their method was able to detect 16 out of 22 FCDs (73%). Unfortunately, we do not have the original MRIs, so we cannot validate our method using an analogous strategy. However, the results at the vertex level show that our methods outperform the state-of-the-art approaches that address the class imbalance for the automatic detection of FCDs. It is important to note that, although our method has a satisfactory performance, it is limited by its computational cost, since 5 neural networks must be trained.

5 Conclusions

In this study, we propose a novel strategy to address the class imbalance in the automatic detection of focal cortical dysplasias. Our method is based on measures of cerebral surface morphological/intensity features and uses Cluster-Based Under-Sampling combined with a bagging process. The relevant samples are then employed to train neural network classifiers. The method was tested on a matrix of FCD data from an online repository. At the vertex level, it achieves a G-mean of 94.15%, a sensitivity of 95.11%, and a specificity of 93.2%. Thus, our approach outperforms comparable methods that address the class imbalance in the detection of FCDs.

As future work, the authors plan to test the introduced method with a cluster-wise validation on MRIs. In addition, support vector machine classifiers will be coupled with our clustering-based and bagging strategy to reveal relevant samples using reproducing kernel Hilbert spaces.