Ultrasonographic feature selection and pattern classification for cervical lymph nodes using support vector machines

https://doi.org/10.1016/j.cmpb.2007.07.008

Abstract

A rough margin based support vector machine (RMSVM) classifier was proposed to improve the accuracy of ultrasound diagnosis of cervical lymph nodes. Thirty-six features belonging to 10 kinds of ultrasonographic characteristics were extracted for each of 110 lymph nodes in ultrasonograms. Three classifiers were compared: the classical support vector machine (SVM), the general regression neural network, and the proposed RMSVM, each with and without feature selection by the recursive feature elimination (RFE) algorithm based on either SVMs or the mean square error discriminant. The experimental results indicated that all classifiers benefited from feature selection. The best classification performance was obtained by the RMSVM using thirteen features selected by the RMSVM based RFE, which yielded a normalized area under the receiver operating characteristic curve (Az) of 0.859. Compared with the radiologist's Az of 0.787, the developed computer-aided diagnosis algorithm has the potential to improve diagnostic accuracy.

Introduction

Ultrasonography is the chief imaging modality for assessing cervical lymph nodes. Several gray scale and power Doppler sonographic features have been identified for differentiating benign from malignant cervical lymph nodes. The documented diagnostic features include size, shape, nodal border, margin, internal echo, vascular pattern, etc. [1], [2], [3], [4], [5], [6], [7]. In clinical practice, however, the interpretation of sonographic images is generally highly subjective, and interobserver variability is a concern in the characterization of cervical lymph nodes in sonograms. A computer-aided diagnostic (CAD) system is therefore needed to help radiologists assess the lymph nodes more objectively and to improve diagnostic accuracy. Several CAD systems have been developed as tools for disease diagnosis, such as for breast [8], [9], thyroid [10] and liver diseases [11]. However, few CAD systems have been reported for the diagnosis of cervical lymph nodes on sonographic images [7], [12].

Typically, feature extraction and classification are the two major stages in a CAD system. In a previous study, we extracted 10 kinds of sonographic features, comprising a total of 36 quantitative features, to characterize cervical lymph nodes in sonograms [12]. In the classification stage, overfitting may arise when the number of features is large and the number of training samples is relatively small (55 samples in our experiments). In such a case, even a linear decision hyper-plane can separate the training samples, but the performance on the test data will be very poor (i.e., low generalization performance). Therefore, feature selection is usually performed prior to classification to reduce the dimensionality of the feature space, which lowers the risk of overfitting and also reduces the computational burden of the classification.
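
The overfitting risk described above can be illustrated with a small synthetic experiment. The sketch below is not the study's data or method: it assumes 36 random Gaussian features, only 20 training samples (fewer than the number of features, so that separability is guaranteed), purely random labels, and a plain perceptron as the linear classifier. All names (`perceptron_fit`, `accuracy`, `random_set`) are hypothetical helpers introduced for this illustration.

```python
import random

def perceptron_fit(X, y, max_epochs=10000):
    """Train a perceptron with a bias term; stop after an epoch with no mistakes."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            # A non-positive signed score means the sample is misclassified.
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) <= 0:
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                mistakes += 1
        if mistakes == 0:
            break
    return w, b

def accuracy(w, b, X, y):
    """Fraction of samples on the correct side of the hyper-plane."""
    hits = sum(1 for xi, yi in zip(X, y)
               if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) > 0)
    return hits / len(y)

# Assumed toy setting: labels are pure noise, so nothing can be learned.
rng = random.Random(0)
n_features, n_train, n_test = 36, 20, 200

def random_set(n):
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n)]
    y = [rng.choice([-1, 1]) for _ in range(n)]
    return X, y

X_train, y_train = random_set(n_train)
X_test, y_test = random_set(n_test)
w, b = perceptron_fit(X_train, y_train)
train_acc = accuracy(w, b, X_train, y_train)
test_acc = accuracy(w, b, X_test, y_test)
print("train accuracy:", train_acc, "test accuracy:", test_acc)
```

Because the labels carry no information, a perfect fit on the training set here reflects memorization only, and the test accuracy should be expected to stay near chance level.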

As to the generalization performance, the support vector machine (SVM) [13], [14] is known as a powerful classification approach. The SVM tries to find a separating hyper-plane, based on the structural risk minimization principle, that maximizes the margin between two classes. Therefore, the SVM achieves a higher generalization performance than traditional classifiers that are based on the minimization of empirical risk. However, the final classifier obtained by the SVM depends only on a small part of the training samples (the support vectors), which makes the SVM sensitive to noise and outliers and may also result in an overfitting problem [15]. To reduce the effects of outliers in the learning process, the fuzzy SVM [16] associated a fuzzy membership with each training sample so that different samples could make different contributions to the learning of the separating hyper-plane. However, without any knowledge of the distribution of the training samples, it was hard to assign the fuzzy membership to each sample. Feng and Williams [17] proposed the scaled SVM, which utilized the means of the classes to reduce the generalization error of the SVM. In another approach called the center SVM [15], the classifier was built based on both the class center vectors and the support vectors to prevent the classifier from becoming sensitive to outliers. However, if the sample distribution is non-Gaussian and highly nonconvex, the mean or the center of a class may not be representative, or may fall outside its class. The total margin SVM [18] considered the distances between all data points and the separating hyper-plane, and the generalization error might be reduced since all data information was used in the learning process. In this study, we proposed a rough margin based SVM (RMSVM), which incorporated the rough set [19] notion into the SVM to overcome the overfitting problem due to outliers. In the RMSVM, the effects of outliers could be reduced since more data points were adaptively considered rather than the few extreme value points used in the classical SVM.
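
For reference, the margin maximization performed by the classical SVM can be written as the following optimization problem. This is the standard textbook soft-margin (C-SVM) form, given only as an illustration; it is not the ν-SVM or the RMSVM formulation used in this study, and the symbols (weight vector w, bias b, slack variables ξ_i, penalty C, sample count n, labels y_i in {-1, +1}) are assumed notation that does not come from the text above.

```latex
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}} \quad
  \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\qquad \text{subject to} \qquad
  y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1 - \xi_i,
  \quad \xi_i \ge 0, \quad i = 1, \dots, n
```

The margin between the two classes equals 2/||w||, so minimizing ||w||^2 maximizes the margin. Only the samples whose constraints are active (the support vectors) determine w and b, which is why a few outliers among them can shift the hyper-plane.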

Recently, it was recognized that the SVM could also be used for feature selection. Guyon et al. [20] showed that the SVM based recursive feature elimination (RFE) [21] yielded better features than RFE using other discriminant functions, such as the Fisher or the mean square error (MSE) discriminants [22].
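
The idea behind SVM based RFE can be sketched as follows: train a linear SVM, rank each feature by the square of its weight, drop the lowest-ranked feature, and repeat. The code below is a minimal illustration under stated assumptions, not the implementation used in this study: it uses a plain soft-margin linear SVM trained by batch subgradient descent (rather than the ν-SVM or RMSVM), removes one feature per iteration, and uses arbitrary hyperparameters. The function names `train_linear_svm` and `svm_rfe` are hypothetical.

```python
def train_linear_svm(X, y, c=1.0, epochs=200, lr=0.01):
    """Fit (w, b) by batch subgradient descent on 0.5*||w||^2 + c * sum(hinge loss).

    X is a list of feature rows, y a list of labels in {-1, +1}.
    The hyperparameters are arbitrary choices for this sketch.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)   # gradient of the 0.5*||w||^2 term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin: hinge term is active
                for j, xj in enumerate(xi):
                    grad_w[j] -= c * yi * xj
                grad_b -= c * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_rfe(X, y, n_keep):
    """Recursively eliminate the feature with the smallest squared SVM weight.

    Returns (kept, eliminated): the surviving feature indices and the
    eliminated indices in order of removal (least useful first).
    """
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        # Ranking criterion: w_j^2; the smallest value is removed.
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated
```

Calling `svm_rfe(X, y, n_keep=13)` on a 36-feature data set would, under these assumptions, return 13 surviving feature indices together with the order in which the other 23 were discarded. Retraining after every removal is what distinguishes RFE from a one-shot ranking: the weights of the remaining features are re-estimated each time a feature is dropped.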

In this paper, we applied the RMSVM based RFE to select features. The selected features were then used to train the RMSVM. For comparison, the classical SVM and the general regression neural network (GRNN) [23] were also implemented. The performances of the three classifiers, with and without prior RFE feature selection based on the SVMs or the MSE discriminant, were evaluated by the diagnostic accuracy, the normalized area under the receiver operating characteristic (ROC) curve (Az), sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). As shown by the experimental results, all classifiers benefited from feature selection, and the best classification performance was obtained with the RMSVM using features selected by the RMSVM based RFE.
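
The evaluation measures listed above follow from the counts of true and false positives and negatives. The sketch below shows one way to compute them, assuming labels coded as 1 (malignant) and 0 (benign). The area under the ROC curve is estimated here with the empirical rank (Mann-Whitney) statistic, which is a stand-in: the text does not say how Az was obtained, and a fitted ROC curve may give a different value. The function names are hypothetical.

```python
def _ratio(num, den):
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")

def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV and NPV for labels 1 (positive) / 0 (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "ppv": _ratio(tp, tp + fp),           # positive predictive value
        "npv": _ratio(tn, tn + fn),           # negative predictive value
    }

def roc_area(y_true, scores):
    """Empirical area under the ROC curve.

    Equals the probability that a randomly chosen positive sample receives
    a higher score than a randomly chosen negative one; ties count as 0.5.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))
```

Here `y_pred` holds hard class decisions, while `scores` holds the continuous classifier outputs (for an SVM, the signed distance to the hyper-plane would be a natural choice).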

Subjects and extracted features

The data set consists of 110 cervical lymph nodes, including 60 benign nodes (30 inflammatory, 4 idiopathic thrombocytopenic purpura and 26 normal nodes) and 50 malignant nodes (12 metastatic, 33 lymphomatous and 5 leukemic nodes), respectively, derived from 110 different out-patients (51 men and 59 women; age range, 17–78 years; mean age, 53 years) at Huashan Hospital of Fudan University, Shanghai, from October 2005 to June 2006. The histologic characteristics of each malignant node were

Experimental results and discussions

In the experiments, we evaluated the classification performances of three classifiers and the effectiveness of the RFE feature selection methods based on the SVMs or MSE. In all, five conditions were investigated, that is, the classical ν-SVM with the SVM-RFE and MSE-RFE, the proposed RMSVM with the RMSVM-RFE and MSE-RFE, and the GRNN with the MSE-RFE. All experiments were performed on a Pentium IV, 2.8 GHz computer. The software program was implemented in MATLAB 7.0.

The whole data set (110

Conclusions

For a more objective diagnosis of cervical lymph nodes in sonograms, a rough margin based SVM classification system was developed using the optimal features selected by the RMSVM based RFE algorithm, to classify a node as malignant or benign. In the experiments, three types of classifiers, the classical ν-SVM, RMSVM and GRNN were applied to the cervical lymph nodes data set, which consisted of 36 quantitative features belonging to 10 kinds of sonographic features. Prior to the classification,

Acknowledgements

This work was supported by the National Basic Research Program of China (No. 2005CB724303), Natural Science Foundation of China (No. 30570488), Shanghai Science & Technology Development Plan (No. 054119612), and Science Foundation of Education Department of Yunnan, China (No. 6Y0042D).

References (28)

  • Y.M. Kadah et al. Classification algorithms for quantitative tissue characterization of diffuse liver disease from ultrasound images. IEEE Trans. Med. Imaging (1996)
  • J.H. Zhang et al. Sonographic feature extraction of cervical lymph nodes and its relationship with segmentation methods. J. Ultrasound Med. (2006)
  • C. Cortes et al. Support-vector networks. Mach. Learn. (1995)
  • V. Vapnik. An overview of statistical learning theory. IEEE Trans. Neural Netw. (1999)