Leukemia diagnosis in blood slides using transfer learning in CNNs and SVM for classification

https://doi.org/10.1016/j.engappai.2018.04.024

Highlights

  • An approach for leukemia diagnosis is proposed using transfer learning and an SVM classifier.

  • We extracted features using transfer learning due to the lack of a large database.

  • We created three hybrid image databases for the method validation.

  • In our experiments, the SVM classifier outperformed KNN, MLP and Random Forest.

  • The achieved results surpass state-of-the-art methods.

Abstract

Leukemia is a pathology that affects young people and adults, causing premature death and several other symptoms. Computer-aided systems can be used to reduce the possibility of prescribing inappropriate treatments and to assist specialists in diagnosing this disease. Convolutional Neural Networks (CNNs) are increasingly used for classification and diagnosis in medical imaging. However, training CNNs requires a large set of images. To overcome this problem, we use transfer learning to extract image features for further classification. We tested three state-of-the-art CNN architectures; the extracted features were selected according to their gain ratios and used as input to a Support Vector Machine classifier. The proposed methodology aims to correctly classify images with different characteristics derived from different image databases and does not require a segmentation process. We built a new database from the union of three distinct databases presented in the literature to validate the proposed methodology. The proposed methodology achieved hit rates above 99% and outperformed nine methods found in the literature.

Introduction

Over the years, multiple medical aid systems have been proposed. Among the diseases aided by computer systems, leukemia is the one that has the highest number of fatalities among adolescents and children, and the risk of developing it is higher in children up to five years of age. Leukemia is a type of cancer that originates in the bone marrow (Fig. 1(a)) and causes abnormal proliferation of white blood cells (Fig. 1(b)). To diagnose leukemia, specialists can carry out various tests and exams, including physical examinations, blood tests, blood counts, myelograms, lumbar punctures and bone marrow biopsies. Microscopic analysis is the most economical method of carrying out the initial screening of patients with leukemia. This type of test is done manually, which may cause operator fatigue. Therefore, a low-cost, automatic and robust system is necessary to avoid the operator’s influence.

Many computer-aided diagnosis systems have been developed using image processing and computational intelligence techniques. These systems usually comprise steps such as preprocessing, segmentation, feature extraction, and classification. Feature extraction and classification are the steps that best define the diagnosis performed by computer-aided diagnosis systems. However, their results often depend on a proper segmentation, which supports adequate feature extraction and, consequently, a reasonable classification.

In this work, we propose a leukemia diagnosis system that does not require the segmentation process (commonly used in state-of-the-art techniques). The methodology uses pre-trained Convolutional Neural Network (CNN) models (AlexNet (Krizhevsky et al., 2012), Vgg-f (Chatfield et al., 2014) and CaffeNet (Jia et al., 2014)) to extract features directly from the images without any preprocessing. The extracted features are then classified with a Support Vector Machine (SVM) (Cortes and Vapnik, 1995). We used three hybrid datasets to evaluate the performance of the methodology: one with blood smears containing only one leukocyte per image, one with many leukocytes per image, and one with both types of images. To demonstrate the robustness of our approach, we compared the results obtained by our methodology with other state-of-the-art methods.
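
As a rough illustration of the classification stage only, the sketch below trains a linear SVM on fixed-length feature vectors. It rests on several assumptions that are not taken from the paper: the synthetic vectors produced by the hypothetical helper `make_toy_features` merely stand in for CNN activations, the SVM is trained with a Pegasos-style stochastic subgradient method rather than the WEKA implementation the authors used, and the bias term is omitted because the toy classes are symmetric about the origin.

```python
import random


def make_toy_features(n_per_class=20, dim=4, seed=1):
    """Hypothetical stand-in for CNN descriptors: two well-separated clusters."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label in (1, -1):  # +1 = pathological, -1 = healthy (arbitrary convention)
        for _ in range(n_per_class):
            samples.append([2.0 * label + rng.uniform(-0.5, 0.5) for _ in range(dim)])
            labels.append(label)
    return samples, labels


def train_linear_svm(samples, labels, reg=0.01, epochs=20, seed=0):
    """Minimise the regularised hinge loss with Pegasos-style updates (no bias term)."""
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    step = 0
    for _ in range(epochs):
        order = list(range(len(samples)))
        rng.shuffle(order)
        for i in order:
            step += 1
            eta = 1.0 / (reg * step)  # decaying learning rate
            margin = labels[i] * sum(w * x for w, x in zip(weights, samples[i]))
            # Shrink the weights (regularisation step).
            weights = [(1.0 - eta * reg) * w for w in weights]
            # Move towards the sample only if it violates the margin.
            if margin < 1.0:
                weights = [w + eta * labels[i] * x for w, x in zip(weights, samples[i])]
    return weights


def predict(weights, sample):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(w * x for w, x in zip(weights, sample))
    return 1 if score >= 0 else -1


train_x, train_y = make_toy_features(seed=1)
test_x, test_y = make_toy_features(seed=2)
model = train_linear_svm(train_x, train_y)
hits = sum(predict(model, x) == y for x, y in zip(test_x, test_y))
print(f"held-out accuracy on toy data: {hits / len(test_y):.2f}")
```

In practice the toy vectors would be replaced by descriptors read from a pre-trained network, and a library SVM with a tuned kernel and regularisation constant would normally be preferred over this hand-written trainer.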

The remainder of the paper is organized as follows: Related works are presented in Section 2, and the proposed method is introduced in Section 3. In Section 4, we present the experiments and also describe the image datasets used in the tests and the evaluation of the method. We discuss the results in Section 5. Finally, the conclusions and perspectives on future works are given in Section 6.

Related works

Several methods for leukemia detection have been proposed over the years, and some of these works presented solutions to the classification of the two most common types of leukemia: Acute Myeloid Leukemia (AML) and Acute Lymphoblastic Leukemia (ALL). Some state-of-the-art works provide a diagnosis only for images with one leukocyte per image (Mohapatra et al., 2011; Mohapatra et al., 2014; Neoh et al., 2015; Putzu et al., 2014; Singhal and Singh, 2016) and others that

Proposed methodology

The method proposed in this work aims to diagnose leukemia using blood smear images. Following the flowchart shown in Fig. 2, it is possible to observe that the system uses an image without any preprocessing or segmentation as input. This is the main difference between our method and the state-of-the-art methods. The CNNs are used to describe the input image and the features are selected and reduced. In the classification step, the SVM is used to classify the images as pathological or not.
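
The abstract states that features are selected according to their gain ratios. The sketch below shows one way such a ranking could be computed. It is a simplified illustration under stated assumptions, not the authors' WEKA configuration: continuous activations are discretised with a plain mean threshold (the hypothetical helper `binarize_at_mean`), whereas a production tool would typically use a more careful discretisation, and all function names are illustrative.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def binarize_at_mean(values):
    """Assumed discretisation: 'hi' above the column mean, 'lo' otherwise."""
    threshold = sum(values) / len(values)
    return ["hi" if v > threshold else "lo" for v in values]


def gain_ratio(feature_values, labels):
    """Information gain of a discrete feature divided by its split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # A feature with a single value carries no information about the class.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels):
    """Return feature indices ordered from highest to lowest gain ratio."""
    scores = [(gain_ratio(col, labels), idx) for idx, col in enumerate(columns)]
    return [idx for _, idx in sorted(scores, key=lambda pair: -pair[0])]


# Toy example: two continuous features for four images (1 = pathological).
labels = [1, 1, 0, 0]
raw_columns = [[0.9, 0.1, 0.8, 0.2], [0.9, 0.8, 0.1, 0.2]]
discrete_columns = [binarize_at_mean(col) for col in raw_columns]
print(rank_features(discrete_columns, labels))
```

Keeping only the top-ranked indices then yields the reduced feature vector that is passed to the classifier.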

Experiments

To evaluate the obtained results, we used two image databases, one containing only one leukocyte per image and the other with many leukocytes. Also, we used four classic metrics from the literature: precision, accuracy, recall, and the Kappa index. We implemented the feature extraction in MATLAB and the feature selection and classification using the WEKA tool (Hall et al., 2009).
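
For reference, the four metrics named above can all be derived from a binary confusion matrix. The following sketch uses the standard textbook definitions, with the Kappa index taken to be Cohen's kappa; the function name and the example counts are illustrative and are not results from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and Cohen's kappa from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Agreement expected by chance, from the row and column marginals.
    chance_pos = ((tp + fp) / total) * ((tp + fn) / total)
    chance_neg = ((fn + tn) / total) * ((fp + tn) / total)
    chance = chance_pos + chance_neg
    kappa = (accuracy - chance) / (1 - chance) if chance < 1 else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "kappa": kappa,
    }


# Made-up counts: 45 true positives, 5 false positives,
# 5 false negatives, 45 true negatives.
example = binary_metrics(tp=45, fp=5, fn=5, tn=45)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Kappa corrects accuracy for the agreement expected by chance, which makes it more informative than raw accuracy when the classes are imbalanced.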

Results and discussions

We followed two approaches during the development of the proposed methodology: using the outputs of each of the three architectures separately, and using the concatenation of all three feature vectors. We also evaluated their results individually, comparing them with existing state-of-the-art works. We made this evaluation because there are no studies that, like this work, classify both blood smears with multiple leukocytes per image and blood smears with only one.
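
The two configurations can be pictured as building one feature set per network plus one concatenated set. The sketch below is only an illustration: the dictionary keys, the function names and the tiny vector lengths are assumptions, and real descriptors would be far longer.

```python
def concatenate_descriptors(*descriptors):
    """Join the per-network feature vectors of one image into a single vector."""
    combined = []
    for vector in descriptors:
        combined.extend(vector)
    return combined


def build_feature_sets(per_network):
    """Return each network's features unchanged plus a 'concatenated' variant.

    per_network maps a network name to a list of feature vectors,
    one per image, with the images in the same order for every network.
    """
    names = list(per_network)
    n_images = len(per_network[names[0]])
    if any(len(per_network[name]) != n_images for name in names):
        raise ValueError("every network must describe the same images")
    feature_sets = {name: per_network[name] for name in names}
    feature_sets["concatenated"] = [
        concatenate_descriptors(*(per_network[name][i] for name in names))
        for i in range(n_images)
    ]
    return feature_sets


# Toy descriptors for two images; lengths are deliberately tiny.
toy = {
    "alexnet": [[0.1, 0.2], [0.3, 0.4]],
    "vgg_f": [[1.0], [2.0]],
    "caffenet": [[5.0, 6.0, 7.0], [8.0, 9.0, 10.0]],
}
sets = build_feature_sets(toy)
print(len(sets["concatenated"][0]))
```

Each of the resulting feature sets can then go through the same selection and classification steps, so that the separate and concatenated variants are compared under identical conditions.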

We carried out empirical tests

Conclusion

The work presented in this paper describes a new methodology for the diagnosis of leukemia in blood images using Convolutional Neural Networks (CNNs). Based on the results obtained by the proposed approach, it was possible to validate the robustness of pre-trained CNNs for extracting features in relation to classical state-of-the-art methods. Through the selection of attributes, we observed that more characteristics are required to classify the images with many leukocytes, while fewer features

Acknowledgments

The authors would like to thank the Brazilian National Council of Technological and Scientific Development (CNPq) (136240/2016-0) and the Federal University of Piauí (UFPI) for sponsoring our research.

References (40)

  • Patel, N., et al., 2015. Automated leukaemia detection using microscopic images. Procedia Comput. Sci.
  • Putzu, L., et al., 2014. Leucocyte classification for leukaemia detection using image processing techniques. Artif. Intell. Med.
  • Schwenker, F., et al., 2001. Three learning phases for radial-basis-function networks. Neural Netw.
  • Agaian, S., et al., 2018. A new acute leukaemia-automated classification system. Comput. Meth. Biomech. Biomed. Eng.: Imaging Vis.
  • Aha, D., et al., 1991. Instance-based learning algorithms. Mach. Learn.
  • Athiwaratkun, B., Kang, K., 2015. Feature representation in convolutional neural networks, CoRR abs/1507.02313....
  • Breiman, L., 2001. Random forests. Mach. Learn.
  • Castelluccio, M., Poggi, G., Sansone, C., Verdoliva, L., 2015. Land use classification in remote sensing images by...
  • Chatfield, K., Simonyan, K., Vedaldi, A., Zisserman, A., 2014. Return of the devil in the details: delving deep into...
  • He, D.-C., et al., 1990. Texture unit, texture spectrum, and texture analysis. IEEE Trans. Geosci. Remote Sens.
  • Cortes, C., et al., 1995. Support-vector networks. Mach. Learn.
  • Friedman, N., et al., 1997. Bayesian network classifiers. Mach. Learn.
  • Guyon, I., et al. An introduction to feature extraction.
  • Hall, M., et al., 2009. The WEKA data mining software: An update. SIGKDD Explor. Newslett.
  • Haralick, R.M., et al., 1973. Texture features for image classification. IEEE Trans. Syst. Man Cybern.
  • Jia, Y., et al. Caffe: Convolutional architecture for fast feature embedding.
  • Krizhevsky, A., et al. Imagenet classification with deep convolutional neural networks.
  • Kumar, A., et al., 2016. An ensemble of fine-tuned convolutional neural networks for medical image classification. IEEE J. Biomed. Health Inform.
  • Labati, R.D., Piuri, V., Scotti, F., 2011. All-idb: The acute lymphoblastic leukemia image database for image...
  • Lecun, Y., et al., 2015. Deep learning. Nature.