Applied Soft Computing

Volume 111, November 2021, 107707
A novel weight pruning strategy for light weight neural networks with application to the diagnosis of skin disease

https://doi.org/10.1016/j.asoc.2021.107707

Highlights

  • Utilize a variety of lightweight CNNs for automated skin cancer diagnosis.

  • Propose a weight pruning strategy for accuracy improvement in lightweight CNNs.

  • Study the relationship between weight values and classification accuracy.

  • Collect a dataset covering 11 different skin disease types gathered over the past ten years.

Abstract

In recent years, deep learning-based models have achieved significant advances in a wide range of computer vision problems, but tedious parameter tuning has complicated their application to computer-aided diagnostic (CAD) systems. As such, this study introduces a novel pruning strategy to improve the accuracy of five lightweight deep convolutional neural network (DCNN) architectures applied to the classification of skin disease. Unlike conventional pruning methods (such as optimal brain surgeon), the proposed technique does not change the model size yet improves performance after fine tuning. This training approach, intended to improve accuracy without increasing model complexity, is experimentally verified using 1167 pathological images. The clinical data include 11 different skin disease types collected over the past ten years, with varying image quantities in each category. A novel hierarchical pruning method, based on standard deviation, is then developed and used to prune parameters in each convolution layer according to its weight distribution. This training strategy achieves 83.5% Top-1 accuracy using a pruned MnasNet (12.5 MB), which is 1.8% higher than that of unpruned InceptionV3 (256 MB). Comparative experiments using other networks (MobileNetV2, SqueezeNet, ShuffleNetV2, Xception, ResNet50, DenseNet121) and the HAM10000 dataset also demonstrate consistent improvements when the proposed training technique is adopted. This robustness across various network types, together with simple deployment, demonstrates the potential of the methodology to generalize to other computer vision tasks.

Introduction

Although Asians exhibit relatively low occurrences of non-melanoma skin cancers (NMSCs), basal cell carcinoma (BCC) remains one of the most common cancers in people of Chinese descent. As the largest organ in the human body, the skin protects several other vital systems, which exposes it to a range of diseases. Skin cancers are a group of diseases that collectively constitute the most common malignancy in the United States. Due in part to increased occupational and recreational ultraviolet (UV) exposure, the incidence of both malignant and benign skin tumors is increasing, especially among those with lightly pigmented skin [1], [2]. Fortunately, early screening for skin cancer can significantly improve the prognosis. During the diagnosis of skin cancer, dermatologists typically make a preliminary judgment based on the location of the disease and the appearance of the lesion, using non-invasive inspection methods such as dermoscopy. However, false-positive and false-negative diagnoses often occur in the detection of tumors [3]. Given the large variety of skin diseases, pathological examination remains the ‘gold standard’ for determining whether a patient has cancer or an inflammatory skin disease.

Although pathological diagnosis offers higher accuracy, it is challenging to acquire valid information from histopathological images [4]. A routine examination requires physicians to combine a patient’s physical characteristics (such as irregular markings on the skin or hard lumps on the body) with high-resolution histopathological images for a final diagnosis. In addition, government statistics indicate a severe shortage of dermatologists in underdeveloped regions of China, as many smaller hospitals lack a professional dermatologist or pathologist. A recent study estimated the ratio of professional dermatologists to patients to be as low as 1:60,000 [5]. For this reason, the cost and treatment time for pathological diagnosis have increased significantly. Shortening the diagnosis cycle and reducing labor costs have therefore become urgent problems for many countries’ medical systems.

In recent years, the development of deep learning and the successful application of convolutional neural networks (CNNs) to various computer vision tasks [6], [7] has increased interest in medical image classification [8]. A reliable computer-aided diagnostic (CAD) system based on deep learning would be highly beneficial for histopathological image analysis [9]. A system of this type would not only reduce doctors’ workloads, mitigating the lack of medical resources in rural areas, but could also be used as a tool to train novice dermatologists and pathologists [10]. As a result, it could help to improve the accessibility of social medical care and the quality of life for skin cancer patients.

Multiple studies have investigated the use of machine learning for automated skin disease diagnosis in recent years. For example, in 2014, a multilayer feed-forward artificial neural network was developed and trained to diagnose erythemato-squamous diseases with an accuracy of 93.7% [11]. In addition to binary classification, deep neural networks have also been used for multi-class tasks. Liao et al. introduced lesion-targeted CNNs for skin disease classification in 2016 [12]. Sun et al. classified 128 skin diseases using CaffeNet and VGGNet [13]. Haenssle et al. demonstrated that a CNN provided with level-I or level-II dermoscopy images (embedded with clinical information) outperformed a control group of physicians (75.7% compared with 82.5%, P < 0.01) [14]. In 2019, the MobileNet model was used to classify seven classes of skin lesion images and achieved an overall accuracy of 83.1%, comparable to that of skin pathologists [15]. In the same year, Połap et al. introduced a set of artificial intelligence-based medical diagnosis models [16], including a series of image preprocessing methods, a key-point search algorithm, and modified CNNs; this research set a benchmark for the development of CAD. Subsequently, based on a genetic algorithm and a cascade of binary convolutional neural networks, Połap proposed a more efficient solution for the analysis of microscopy images and explored a new approach to training large networks [17].

The implementation of deep neural networks can be challenging for image processing tasks, as CNN-based models typically have a complex structure with potentially hundreds of layers and large quantities of weight parameters. In addition, the number of calculations involved requires large amounts of memory and powerful processors, which can be cost-prohibitive. These limitations make it difficult to deploy a CNN framework for CAD in smaller clinics. As such, researchers have developed a variety of lightweight CNNs for mobile and embedded devices, which can significantly reduce model size and improve computational efficiency [18]. However, this streamlined architecture means that smaller CNNs sacrifice accuracy and require increased manual tuning during training, which raises implementation costs. Thus, the development of an efficient training strategy could be a promising way to compensate for the loss of accuracy, allowing CAD systems to operate with high reliability and low inference cost.

Recently, improving network performance has become a popular research topic, and a number of approaches have explored new training strategies for this task. In 2018, a born-again network (BAN) was proposed for training a student CNN with the same or a similar architecture to its teacher model [19]. Unlike Hinton’s work on distillation, this strategy does not slim heavy CNNs, but it makes the student BAN more effective than the teacher model. On the CIFAR-10 and CIFAR-100 datasets, BAN-DenseNet reaches error rates of 3.5% and 15.5%, respectively. Furlanello et al. showed that small student networks have untapped learning potential, which opens a new door for improving the efficiency of lightweight CNNs. Based on this idea, a study in 2020 embedded pre-trained CNNs into a BAN framework and achieved state-of-the-art accuracy [20]. In that paper, a series of theoretical analyses and experiments showed that, for meta-learning and few-shot image classification, self-distillation training with a self-supervised method achieved performance similar to that of supervised learning. However, this self-distillation approach [19], [20] requires substantial computing resources, and it is hard to apply the same hyperparameter configuration to different datasets (described in detail in Section 4). Thus, designing a training strategy that is easy to deploy and at the same time generalizes well is a challenging issue, especially when training deep models with lightweight parameter budgets or small-scale datasets.
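
To make the distillation idea above concrete, the following is a minimal sketch of a BAN-style training step in PyTorch, assuming a frozen teacher and a student of the same architecture; the temperature, loss weighting, and helper names are illustrative assumptions rather than the configurations used in [19], [20].

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=4.0, alpha=0.5):
    """Born-again-style loss: cross-entropy on the labels plus a KL term
    matching the student's softened outputs to the teacher's.
    temperature and alpha are illustrative hyper-parameters."""
    hard = F.cross_entropy(student_logits, targets)
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * hard + (1.0 - alpha) * soft

def train_step(student, teacher, images, targets, optimizer):
    """One student update with a frozen teacher of the same architecture."""
    teacher.eval()
    with torch.no_grad():
        teacher_logits = teacher(images)
    student_logits = student(images)
    loss = distillation_loss(student_logits, teacher_logits, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```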

The objective of this study is to utilize a variety of lightweight CNNs for automated skin cancer diagnosis. We develop a novel weight-pruning training strategy to compensate for the loss of accuracy and further improve both model performance and reliability. Conventional lightweight CNNs that exhibit good performance on computer vision tasks are first introduced, including SqueezeNet [21], MnasNet [22], MobileNetV2 [23], ShuffleNetV2 [24] and Xception [25]. To compensate for the under-fitting and high bias exhibited by small CNNs, we include a series of improvements based on the dense–sparse–dense (DSD) training strategy proposed by Han [26]. Finally, a comprehensive analysis of weight parameter variations during training is used to develop a new pruning strategy, not only pruning connections with smaller or larger weights but also investigating a pruning mechanism based on the standard deviation of the weights. This novel pruning mechanism can adaptively remove larger or smaller weights according to the distribution in each layer. Results show that this new training strategy further optimizes the weight parameters and allows lightweight CNNs to achieve performance comparable to CNNs with many more layers. The primary contributions of this study can be summarized as follows:

  • Data used in the comparative experiments have been collected by our group during clinical diagnostic tests performed over the last ten years, in cooperation with a dermatosis hospital in Dalian, China. A series of 1167 pathological images covering 11 different skin disease types is selected for the study. Each image is verified pathologically by a professional dermatologist to ensure all labels are accurate and reliable. To verify the trainability of the models beyond our dataset, we also introduce the publicly available HAM10000 dataset from the ISIC 2018 Lesion Diagnosis Challenge [27] to test model inference on a larger dataset.

  • Five popular lightweight CNNs are used for the automated classification of multi-type skin tumors based on pathological images. The inclusion of linear bottlenecks (in the case of MobileNetV2 and MnasNet) and the replacement of 3×3 filters with 1×1 filters (SqueezeNet) significantly improve the accuracy-latency tradeoff observed in conventional CNNs. In addition, ShuffleNetV2 [24] changes the shortcut structure from “Add” to “Concat”, following DenseNet [28], and introduces an operator called “channel split”. These results indicate that parameter quantities can be reduced significantly with little loss of accuracy. Dong et al. used InceptionV3 for feature extraction on a cervical cell dataset and achieved an accuracy of more than 98% [29]. In this architecture, convolutions with larger spatial filters are factorized into two smaller layers, thereby avoiding representational bottlenecks and extreme compression. InceptionV3 [30], ResNet50 [31] and DenseNet121 [28] respectively represent three directions for improving accuracy: widening layers, deepening the network, and multi-scale feature combination. These three networks are representative conventional CNNs, so we employ them as baseline models alongside the five lightweight networks.

  • A novel weight parameter pruning strategy is then proposed to compensate for the loss of accuracy commonly found in lightweight CNNs. This approach involves not only pruning the unimportant weights, but also retraining the model to achieve improved performance. It is observed in this study that a portion of both the smallest and the largest weights has a negative effect on classification results, which suggests it is beneficial to simultaneously remove the maximum and minimum weights in the model. Compared with the DSD strategy [26], the primary advantage of this technique is its ability to adopt an appropriate pruning threshold for each layer based on the weight distribution.

  • The relationship between the relative size of weight values and the classification accuracy is also investigated. Experiments show that excessively small or large weights can degrade model performance. The mean and standard deviation of the weights are monitored and used to adjust the pruning threshold for connections in different layers: when the distribution is more concentrated and closer to 0, smaller weights are removed, and vice versa (a simplified sketch of this layer-wise rule is given immediately after this list). This approach is then implemented with four lightweight models and used to train a series of high-performance classifiers. Comparisons with deep CNNs suggest that the performance of the lightweight models (83.3% Top-1 accuracy with a pruned MnasNet) approaches or even surpasses that of common deep CNNs (81.5% Top-1 accuracy with unpruned InceptionV3). This strategy is shown to be highly robust to different hyper-parameter configurations and to have positive effects on both lightweight and heavier models.
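
As referenced in the last bullet, the following is a minimal sketch of how such a layer-wise, standard-deviation-based pruning rule could be written in PyTorch. The criterion for switching between removing the smallest or the largest weights, and the constants k_small, k_large, and concentration, are simplified assumptions for illustration, not the formulation given in Section 3.5.

```python
import torch.nn as nn

def prune_layer_by_std(conv: nn.Conv2d, k_small=0.25, k_large=2.5, concentration=0.05):
    """Prune one convolution layer using a threshold derived from its own
    weight statistics. If the weights are tightly concentrated around zero,
    the smallest-magnitude weights are removed; otherwise the extreme
    outliers are removed. All constants are illustrative assumptions."""
    w = conv.weight.data
    mean, std = w.mean(), w.std()
    if std < concentration:
        # Concentrated, near-zero distribution: drop the smallest-magnitude weights.
        keep = w.abs() > k_small * std
    else:
        # Wider distribution: drop weights far outside the bulk of the distribution.
        keep = (w - mean).abs() < k_large * std
    conv.weight.data.mul_(keep.to(w.dtype))
    return keep  # mask that can be re-applied during fine-tuning

def prune_model_by_std(model: nn.Module):
    """Apply the layer-wise rule to every convolution layer and collect the masks."""
    return {name: prune_layer_by_std(m)
            for name, m in model.named_modules() if isinstance(m, nn.Conv2d)}
```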

The remainder of this paper is organized as follows. Related works are briefly reviewed in Section 2. Details concerning the dataset and formulation of our methodology are presented in Section 3. Section 4 describes the testing configuration and results of a comparative experiment. A comprehensive discussion of these results is provided in Section 6, to demonstrate the effectiveness of the proposed technique. Finally, conclusions are summarized in Section 7.

Section snippets

Development of deep learning for dermatology diagnosis

Deep learning has been applied to multiple dermatological diagnosis scenarios. In 2003, Schmid-Saugeon et al. developed a CAD system for extracting pigmented lesion features, achieving higher accuracy than similar methods based on principal component decomposition [32]. In 2017, the InceptionV3 deep learning architecture, trained with a dataset of 129,450 clinical images (including 3374 dermoscopy images), was introduced for classifying keratinocyte carcinomas and malignant melanomas. The

Methodology

This section describes the details of our weight-pruning training strategy. We first briefly introduce four naive weight-pruning methods (Sections 3.1–3.4); by integrating the effective parts of these naive methods, our hierarchical pruning method based on standard deviation (HP-SD) is formulated and described in Section 3.5.
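
Before these methods are detailed, the overall workflow can be pictured as a train-prune-retrain loop in the spirit of DSD [26]. The sketch below is only an illustration of that loop under assumed epoch counts; prune_model_by_std is the illustrative layer-wise routine sketched earlier, and allowing the pruned connections to recover in a final dense phase is an assumption borrowed from DSD, not necessarily the procedure of Section 3.5.

```python
import torch

def train_one_epoch(model, loader, optimizer, criterion):
    model.train()
    for images, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()

def train_with_pruning(model, loader, optimizer, criterion,
                       dense_epochs=30, sparse_epochs=10, redense_epochs=10):
    # Phase 1: ordinary dense training.
    for _ in range(dense_epochs):
        train_one_epoch(model, loader, optimizer, criterion)

    # Phase 2: prune, then fine-tune while re-applying the masks after every
    # update so the removed connections stay at zero.
    masks = prune_model_by_std(model)
    modules = dict(model.named_modules())
    for _ in range(sparse_epochs):
        for images, targets in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), targets)
            loss.backward()
            optimizer.step()
            with torch.no_grad():
                for name, keep in masks.items():
                    w = modules[name].weight
                    w.data.mul_(keep.to(w.dtype))

    # Phase 3 (DSD-style): release the masks and retrain densely,
    # letting the pruned connections grow back from zero.
    for _ in range(redense_epochs):
        train_one_epoch(model, loader, optimizer, criterion)
    return model
```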

Datasets

Most deep learning models are designed on balanced datasets, such as ImageNet. In the field of medical diagnosis, the incidence rates of different diseases differ widely. For our case, we first build an in-house image set containing 1167 skin histopathological images covering 11 skin disease types (the SH-11 category) with high lethality or pervasiveness, selected from diagnostic samples collected over the past ten years. As far as we know, it is the largest skin

Results of the threshold-based method

In this part, the threshold method is used to prune five deep learning models, and the resulting improvements in loss and accuracy are listed in Table 3. An optimal threshold of 0.05 is identified by monitoring the improvement in accuracy for SqueezeNet, MnasNet, and MobileNetV2. Similarly, thresholds of 0.005 and 0.08 are selected for Xception and InceptionV3, respectively. It is difficult to find a suitable threshold range that holds across different models, and these thresholds are the optimal
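
For reference, a minimal sketch of this fixed-threshold pruning step is given below, assuming PyTorch; it simply zeroes every convolution weight whose magnitude falls below the chosen model-specific threshold (0.05, 0.005, or 0.08 above), after which the model would be fine-tuned. The model variable names in the usage comments are assumptions.

```python
import torch.nn as nn

def threshold_prune(model: nn.Module, threshold: float) -> float:
    """Zero every convolution weight with magnitude below `threshold`
    and return the fraction of weights removed."""
    pruned, total = 0, 0
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            keep = module.weight.data.abs() >= threshold
            module.weight.data.mul_(keep.to(module.weight.dtype))
            pruned += int((~keep).sum())
            total += keep.numel()
    return pruned / max(total, 1)

# Example usage with the thresholds reported in the text (model objects assumed):
# sparsity = threshold_prune(mnasnet_model, threshold=0.05)      # SqueezeNet, MnasNet, MobileNetV2
# sparsity = threshold_prune(xception_model, threshold=0.005)    # Xception
# sparsity = threshold_prune(inceptionv3_model, threshold=0.08)  # InceptionV3
```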

Weight-pruning helps to overcome saddle points

Though popular, random weight initialization exhibits some serious shortcomings. Without a carefully selected random distribution, neuron output tends rapidly to zero with an increasing number of layers, which results in a vanishing gradient problem. In addition, weights are mostly initialized at the beginning of a typical training strategy, making it difficult for the model to escape saddle points once it has stalled. Adaptable initialization can be promoted by selectively eliminating useless

Conclusion

This study applies a novel pruning strategy to machine learning-based classification of a color histopathological image dataset containing samples of 11 different skin diseases. The high-performance computing requirements of conventional deep learning-based CAD systems are addressed using five lightweight CNNs (SqueezeNet, MnasNet, MobileNetV2, ShuffleNetV2 and Xception), which are compared to three deep CNNs (InceptionV3, ResNet50 and DenseNet121). In addition, four different training

CRediT authorship contribution statement

Kun Xiang: Conceptualization, Methodology, Software, Validation, Writing – original draft. Linlin Peng: Data curation, Visualization, Writing – review & editing. Haiqiong Yang: Formal analysis, Investigation, Data curation. Mingxin Li: Writing – review & editing. Zhongfa Cao: Validation. Shancheng Jiang: Software, Data curation, Visualization, Writing – review & editing, Project administration, Funding acquisition, Resources. Gang Qu: Formal analysis, Investigation, Resources.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported in part by the National Natural Science Foundation of China, Grant No. 71801031, in part by the Guangdong Basic and Applied Basic Research Foundation project, China, “Research on dermatosis automatic diagnosis system based on multi-type of medical image”, No. 2019A1515011962, in part by the Fundamental Research Funds for the Central Universities, China, No. 20lgpy183, and in part by the Science and Technology Innovation Strategy Foundation of Guangdong Province, China,

References (48)

  • Nalepa, J., et al., Towards resource-frugal deep convolutional neural networks for hyperspectral image segmentation, Microprocess. Microsyst. (2020)

  • Gray-Schopfer, V., et al., Melanoma biology and new targeted therapy, Nature (2007)

  • Papageorgiou, V., The limitations of dermoscopy: false-positive and false-negative tumours, J. Eur. Acad. Dermatol. Venereol. (2018)

  • Masood, A., et al., Computer aided diagnostic support system for skin cancer: a review of techniques and algorithms, Int. J. Biomed. Imaging (2013)

  • Zhao, X.Y., The application of deep learning in the risk grading of skin tumors for patients using clinical images, J. Med. Syst. (2019)

  • Krizhevsky, A., et al., ImageNet classification with deep convolutional neural networks

  • Shen, D., et al., Deep learning in medical image analysis, Annu. Rev. Biomed. Eng. (2017)

  • Litjens, G., Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis, Sci. Rep. (2016)

  • Filimon, D., Albu, A., Skin diseases diagnosis using artificial neural networks, in: 2014 IEEE 9th IEEE International...

  • Liao, H., Li, Y., Luo, J., Skin disease classification versus skin lesion characterization: Achieving robust...

  • Sun, X., et al., A Benchmark for Automatic Visual Classification of Clinical Skin Disease Images (2016)

  • Chaturvedi, S.S., et al., Skin lesion analyser: An efficient seven-way multi-class skin cancer classification using MobileNet (2019)

  • Połap, D., Analysis of skin marks through the use of intelligent things, IEEE Access (2019)

  • Howard, A.G., MobileNets: Efficient convolutional neural networks for mobile vision applications (2017)
1 Contributed equally to this work.
