Class-driven content-based medical image retrieval using hash codes of deep features

https://doi.org/10.1016/j.bspc.2021.102601

Highlights

  • The proposed approach reduces the semantic gap effectively without disturbing the high-level feature extraction process.

  • It uses the semantic information in each layer, combining the properties of the FCL's last layer and its intermediate layer.

  • Thanks to the automatic feature selection process, shorter hash codes are produced for medical images.

  • It achieves higher retrieval performance than current state-of-the-art methods in the literature, and its retrieval speed is faster.

Abstract

Medical imaging helps physicians analyze disease by providing visual data of the body parts examined in clinical research and treatment. With advances in technology, a growing number of medical images are stored to support a better understanding of diseases and future diagnoses. Effective medical image indexing and retrieval systems are required to use these images from storage repositories in real time. To this end, this paper provides an effective indexing and retrieval framework that uses deep features to index and search MR and CT images. The proposed system aims to produce the most effective hash codes with the fewest parameters from image features. For this purpose, deep features are obtained from medical images using a convolutional neural network (CNN) architecture, which is the most effective automatic feature extraction method. The raw deep feature vectors acquired for an image are too long for fast retrieval. Feature reduction methods are therefore used to shorten the deep feature vector, and the most effective feature reduction algorithm is determined in this study. A reduced class-driven hash code is produced with feature selection algorithms mainly because of the drawbacks of medical image datasets, which prevent the CNN output from being used directly as a hash code. The performance of the proposed method is tested on the NEMA MRI and NEMA CT datasets. The proposed method outperforms the other state-of-the-art algorithms in terms of average precision.

Introduction

Medical imaging, which has become one of the most critical tools in diagnosis, contains a great deal of information about disease. This information is used in tasks such as classifying the disease, locating the diseased area, and determining the disease level [1]. Because medical images are used in many fields and for many purposes, vast numbers of images are produced by many medical devices [2]. Although these images are beneficial for early and accurate diagnosis, it is inefficient for doctors to manually annotate the images collected from hospitals and medical centers, because this process increases the workload of the experts and takes a lot of their time. To overcome this inefficiency, content-based medical image retrieval (CBMIR) systems are used. CBMIR systems work by indexing every image in a dataset and accessing it quickly when necessary [3]. In CBMIR systems, each image in a dataset is described by a unique numerical vector according to the same feature system. Each image added to the dataset automatically receives a binary vector according to this feature system, as does the image that needs to be analyzed (the query image). By comparing the query's binary vector with all binary vectors in the dataset, the most similar images are determined automatically. This system allows many medical images to be diagnosed automatically, based on the diagnostic experience in the dataset. It also significantly reduces the storage load. Thus, in addition to reducing the workload of doctors, it serves as a training system for inexperienced specialists and students and as a consultation system for specialists [4].
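The comparison step described above can be illustrated with a minimal sketch. It assumes that similarity between binary codes is measured by Hamming distance, a common choice for hash-based retrieval that this passage does not state explicitly. The function name `hamming_rank`, the 8-bit codes, and the toy database are hypothetical and chosen only for illustration.

```python
import numpy as np

def hamming_rank(query_code, database_codes, top_k=5):
    """Rank database codes by Hamming distance to the query (assumed metric)."""
    query_code = np.asarray(query_code, dtype=np.uint8)
    database_codes = np.asarray(database_codes, dtype=np.uint8)
    # Hamming distance: number of bit positions where the two codes differ.
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    # Stable sort so that ties keep their original database order.
    order = np.argsort(distances, kind="stable")[:top_k]
    return order, distances[order]

# Toy database of four 8-bit codes (illustrative values, not real image codes).
database = np.array([
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 0],
], dtype=np.uint8)
query = np.array([0, 0, 0, 0, 1, 1, 1, 0], dtype=np.uint8)

indices, dists = hamming_rank(query, database, top_k=2)
print(indices, dists)
```

Because each comparison touches every bit of every stored code, the cost of this brute-force search grows with both the number of images and the code length, which is why shorter codes retrieve faster.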

Computer-aided diagnosis (CAD) methods are widely used for the automatic analysis of medical images. CAD systems extract features from the images and interpret these features via computer algorithms [5]. In all image processing tasks in CAD methods, such as classification, segmentation, detection, and CBMIR, image features play one of the most important roles, and choosing suitable features affects the performance of these tasks. According to their feature extraction approach, the methods proposed for CBMIR systems can be divided into three main groups: hand-crafted features, deep features, and hybrid features. This division reflects how artificial intelligence and machine learning studies have changed over the years, with CBMIR researchers keeping up with this change [6].

In the early days of image processing, low-level features such as histograms, edge detection, and gray-level information were used [7]. These features had significant problems capturing spatial information. For this reason, text-based image retrieval was more useful in the early years of retrieval systems. Although text-based retrieval systems were beneficial, they were insufficient to reduce the workload on experts, so the introduction of more automated systems was inevitable. The success of CBIR systems has been on the rise thanks to the development of machine learning algorithms and the introduction of hand-crafted and multiple feature extraction algorithms that capture spatial information. Methods such as Fourier descriptors [8], curvelet features [9], Gaussian models [10], color and intensity features [11], and Radon barcodes [12] are used to extract features from images. These methods are generally based on the researcher's experience and may not be suitable for every dataset. Another obstacle with hand-crafted features is that they produce mid-level or low-level features. In image retrieval problems, a semantic gap forms between the high-level information defined by humans and the information produced by the image processing program. High-level image features are required to eliminate this gap. Today, although CNN algorithms address this problem, hand-crafted methods are still being developed; bag-of-words models are especially popular [13]. For these models, the most preferred feature extraction algorithms are the gray level co-occurrence matrix (GLCM) [14], local binary patterns (LBP) [15], speeded up robust features (SURF) [16], scale-invariant feature transform (SIFT), and histogram of oriented gradients (HOG) [17]. Although developments in hand-crafted features are admirable, their performance lags behind that of deep learning algorithms.

Currently, the most popular feature extraction method for CBMIR is the CNN (deep features). The CNN architecture is an automated method recommended for solving image processing problems. It is the most striking architecture of recent times thanks to its special layers, such as convolution, pooling, ReLU, fully connected (FCL), softmax, and dropout [18]. The convolution layers of the CNN architecture serve as spatial filters, enabling automatic learning of features, while the pooling layers reduce the spatial size. In this way, high-level features are learned and the processing load is reduced as the spatial size decreases. This architecture, which is very useful for reducing the semantic gap in CBMIR systems, has been used in almost every recent study. Bressan et al. [19] used the transfer learning approach for content-based mammographic image retrieval and analyzed the role of deep features in CBMIR systems. Owais et al. [20] presented an enhanced residual network (ResNet) model for CBMIR. Ayyachamy et al. [21] used the ResNet-18 model for the CBMIR task, retrieving images from computed tomography (CT), magnetic resonance imaging (MRI), mammogram (MG), and positron emission tomography (PET) datasets. Cai et al. [22] proposed a CBMIR framework with a CNN and a hash coding technique. Trained in a Siamese-network style without label information, this model is compatible with different loss functions. Dan et al. [23] proposed a VLAD layer in the CNN architecture to retrieve mammary cancer images. Bootwala et al. [24] presented a CBMIR method for diabetic retinopathy images using VGG-19 trained from scratch. For CNN-based methods to produce characteristic features, they must be trained with a sufficient number of labeled samples. Although finding labeled data is a big problem for many tasks, networks trained on other datasets can be reused through transfer learning. Transfer learning produces very satisfactory results for images containing general objects [25].

However, the objects and points of interest in medical image datasets are quite different from those in general-purpose datasets. In this case, it is inappropriate to use the transfer learning approach. The current trend in medical domain studies is the combination of the CNN method and hand-crafted methods [26]. These methods are generally based on the principle of extracting hand-crafted features in addition to CNN features and generating a code by classifying them together. Another approach additionally narrows the feature space using hand-crafted techniques. These methods have some drawbacks in reducing the semantic gap in medical image retrieval tasks. During the evaluation of high-level and mid-level features, the code generation process can only be attached to one group. Computational complexity increases considerably, and such methods can be cumbersome for real-time retrieval systems. One of the most important issues in the analysis of medical images is dealing with noise. To create an effective hash code, the image features must have a strong image representation capacity, and extracting robust image features requires dealing with the noise [27]. Kumar and Diwakar [28] proposed a locally adaptive shrinkage rule in the tetrolet domain for CT image denoising. Diwakar and Kumar [29] used a non-local means (NLM) filter and correlation-based wavelet packet thresholding to eliminate noise in medical images.
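To make the roles of the convolution and pooling layers discussed in this section more concrete, the following is a minimal NumPy sketch of one convolution, ReLU, and max-pooling pass over a toy image. The kernel values, image size, and helper names (`conv2d_valid`, `relu`, `max_pool2d`) are illustrative assumptions; in a trained CNN the kernels are learned, and this is not the network used in the paper.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Keep positive responses and zero out the rest."""
    return np.maximum(x, 0.0)

def max_pool2d(x, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fit are dropped."""
    h = (x.shape[0] // size) * size
    w = (x.shape[1] // size) * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Toy 6x6 image with a vertical edge between columns 2 and 3.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked kernel that responds to a left-to-right intensity increase.
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])

feature_map = relu(conv2d_valid(image, kernel))  # spatial filter response, 5x5
pooled = max_pool2d(feature_map, size=2)         # reduced spatial size, 2x2
print(feature_map.shape, pooled.shape)
```

The filter responds only where the edge is, and pooling keeps that response while shrinking the map, which is the sense in which spatial size and processing load decrease together.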

In this study, an effective framework based on deep features is presented for CBMIR systems, and the semantic gap is reduced. The proposed framework is useful for real-time applications thanks to the efficient narrowing of high-level features. In the first step, image features are extracted automatically from the medical images using the CNN architecture. Although these features are quite useful for the semantic gap problem, they have the drawbacks mentioned above. To eliminate these drawbacks, the CNN architecture used with the transfer learning approach is retrained with labeled medical images. After the CNN architecture is adequately trained for the classification task, feature vectors are obtained for each image from the FCL. These feature vectors are too long for a fast, real-time retrieval process. For this reason, the robust but long feature vectors at the layers of the FCL are shortened by feature reduction algorithms. In the last step, the 10-parameter feature vector in the last layer of the FCL is combined with the reduced features with the help of the feature selection algorithm. Finally, these feature vectors are converted into binary form. The major contributions of this paper can be summarized as follows:

  • The proposed approach reduces the semantic gap effectively without disturbing the high-level feature extraction process based on deep features.

  • It combines the classifier information and the retrieval information by merging the features of the FCL's last layer and the intermediate layer.

  • Thanks to the feature selection, the most effective features are automatically selected. In this way, shorter hash codes are obtained.

  • It has higher retrieval performance than other methods in the literature. Additionally, the retrieval process is faster, thanks to shorter hash codes.
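The steps outlined above (reduce the long intermediate FCL features, combine them with the 10-parameter last-layer vector, and binarize) can be sketched as follows. This is only an illustration under stated assumptions: the random arrays stand in for real CNN activations, variance-based selection stands in for the feature selection algorithm (which is not specified in this excerpt), and the per-dimension median threshold and the 32-bit code length are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

n_images, n_intermediate, n_classes = 8, 64, 10
# Stand-ins for CNN activations (assumption: real values would come from the trained network).
intermediate = rng.normal(size=(n_images, n_intermediate))  # intermediate FCL features
class_scores = rng.random(size=(n_images, n_classes))       # 10-parameter last-layer output

def select_top_variance(features, n_keep):
    """Keep the n_keep columns with the highest variance (stand-in selection criterion)."""
    variances = features.var(axis=0)
    keep = np.sort(np.argsort(variances)[-n_keep:])
    return features[:, keep], keep

def binarize(features):
    """Threshold each dimension at its median across images (assumed rule)."""
    thresholds = np.median(features, axis=0)
    return (features > thresholds).astype(np.uint8)

# Reduce 64 intermediate features to 22, then append the 10 class scores: 32 bits in total.
reduced, kept = select_top_variance(intermediate, n_keep=22)
combined = np.hstack([class_scores, reduced])
hash_codes = binarize(combined)
print(hash_codes.shape)
```

Placing the class scores in the code is what makes the hash class-driven in this sketch: images predicted to belong to the same class share bits regardless of which intermediate features survive the reduction.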

The rest of this paper is organized as follows. Section 2 introduces the difficulties addressed and the parameters of the proposed method. Details of the datasets, the experimental results, and a discussion of the results are given in Section 3. Finally, conclusions are presented in Section 4.

Overview of the proposed framework

Access speed and retrieval accuracy are the two most essential parameters for CBMIR methods, and ongoing research focuses on improving both. A longer access code is the most obvious way to increase retrieval accuracy. However, such a code is not appropriate for real-time systems, since the access speed slows down considerably. Besides, access speed in large datasets is a genuine problem. Image access codes should be as

Experiments and experimental results

The proposed method is trained on a computer with an Intel Core i7-7700K CPU (4.2 GHz), 32 GB DDR4 RAM, and an NVIDIA GeForce GTX 1080 graphics card.

Discussion

A system's retrieval time usually depends on the length of the feature vector; in other words, it depends on the number of bits in the hash code. Retrieval with longer hash codes yields a higher precision score, but the access speed is slower. Shorter hash codes reach results relatively quickly, but their average precision may not be satisfactory. Comparing retrieval speeds using hash codes of the same length is inefficient. Because almost the same

Conclusion

In this study, a framework based on high-level deep features is presented for CBMIR. Due to the difficulty of finding a medical dataset containing a sufficient number of tagged images, it is often necessary to retrieve such datasets with hand-crafted features. But hand-crafted features do not reduce the semantic gap. High-level features must be obtained for the reduction of the semantic gap. The most critical obstacle for this is the small number of images. A class-driven retrieval approach is

CRediT authorship contribution statement

Şaban ÖZTÜRK: Conceptualization, Software, Methodology, Resources, Data Curation, Validation, Formal analysis, Investigation, Writing, Visualization.

Human and animal rights

The paper does not contain any studies with human participants or animals performed by any of the authors.

Acknowledgment

This research was funded by the Scientific and Technological Research Council of Turkey (TÜBİTAK) under grant number 120E018.

Declaration of Competing Interest

The authors declare that they have no conflicts of interest.

References (42)

  • J. Tang et al., Supervised deep hashing for scalable face image retrieval, Pattern Recognit. (2018)
  • R. Hatibaruah et al., Local bit plane adjacent neighborhood dissimilarity pattern for medical CT image retrieval, Proced. Comput. Sci. (2019)
  • A. Aggarwal et al., A new approach for effective retrieval and indexing of medical images, Biomed. Signal Process. Control (2019)
  • M. Kumar et al., A new exponentially directional weighted function based CT image denoising using total variation, J. King Saud Univ. Comput. Inform. Sci. (2019)
  • M. Diwakar et al., CT image denoising using multivariate model and its method noise thresholding in non-subsampled shearlet domain, Biomed. Signal Process. Control (2020)
  • M.K. Alsmadi, Content-based image retrieval using color, shape and texture descriptors and features, Arab. J. Sci. Eng. (2020)
  • A. Latif, Content-based image retrieval and feature extraction: a comprehensive review, Math. Probl. Eng. (2019)
  • P. Das et al., An overview of approaches for content-based medical image retrieval, Int. J. Multimed. Inf. Retr. (2017)
  • X. Jianhua et al., Segmentation of magnetic resonance brain image: integrating region growing and edge detection, Presented at the Proceedings., International Conference on Image Processing (1995)
  • G. Zhang et al., Shape feature extraction using Fourier descriptors with brightness in content-based medical image retrieval, Presented at the 2008 International Conference on Intelligent Information Hiding and Multimedia Signal Processing (2008)
  • P.N.R.L.C. Chandra Chandra et al., Image retrieval with rotation invariance, Presented at the 2011 3rd International Conference on Electronics Computer Technology (2011)