Class-driven content-based medical image retrieval using hash codes of deep features
Introduction
Medical imaging, which has become one of the most critical tools in diagnosis, contains a great deal of information about disease. This information is used in tasks such as classifying the disease, locating the diseased area, and determining the disease level [1]. Because medical images serve many fields and purposes, medical devices produce them in vast numbers [2]. Although this abundance is beneficial for early and accurate diagnosis, manually annotating the images collected from hospitals and medical centers is inefficient, because it increases the workload of experts and consumes much of their time. Content-based medical image retrieval (CBMIR) systems are used to overcome this inefficiency. CBMIR systems index every image in a dataset so that it can be accessed quickly when necessary [3]. In a CBMIR system, each image in the dataset is defined by a unique numerical vector computed according to a common feature system. Each image added to the dataset automatically receives a binary vector according to the same feature system, and so does the image to be analyzed (the query image). By comparing the query's binary vector with all binary vectors in the dataset, the most similar images are determined automatically. This allows many medical images to be diagnosed automatically, based on the diagnostic experience stored in the dataset. It also significantly reduces the storage load. Thus, in addition to reducing the workload of doctors, such a system can serve as a training tool for inexperienced specialists and students and as a consultation system for specialists [4].
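The comparison step described above can be sketched as a ranking by Hamming distance. This is only a minimal illustration under stated assumptions, not the implementation used in this study: the function name `hamming_rank`, the 8-bit code length, and the toy database are all hypothetical.

```python
import numpy as np

def hamming_rank(query_code, database_codes, top_k=5):
    """Return indices and distances of the top_k database codes closest to the query."""
    query_code = np.asarray(query_code, dtype=np.uint8)
    database_codes = np.asarray(database_codes, dtype=np.uint8)
    # Hamming distance = number of bit positions in which two codes differ.
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    # A stable sort keeps the database order among equally distant codes.
    order = np.argsort(distances, kind="stable")[:top_k]
    return order, distances[order]

# Toy database: four images, each with a hypothetical 8-bit code.
database = np.array([
    [0, 1, 1, 0, 1, 0, 0, 1],
    [1, 1, 1, 0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 0, 1, 1],
], dtype=np.uint8)
query = np.array([0, 1, 1, 0, 1, 0, 0, 1], dtype=np.uint8)

indices, dists = hamming_rank(query, database, top_k=3)
# indices -> [0, 1, 3], dists -> [0, 1, 1]
```

The query matches image 0 exactly and differs from images 1 and 3 in one bit each, so those three are returned first.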
Computer-aided diagnosis (CAD) methods are widely used for the automatic analysis of medical images. CAD systems extract features from images and interpret them with computer algorithms [5]. In every image processing task within CAD, including classification, segmentation, detection, and CBMIR, image features play one of the most important roles, and choosing suitable features directly affects task performance. Methods proposed for CBMIR systems can be divided into three main groups according to how they extract features: hand-crafted features, deep features, and hybrid features. This division reflects how artificial intelligence and machine learning research has changed over the years, with CBMIR researchers keeping pace with that change [6].
In the early days of image processing, low-level features such as histograms, edge detection, and gray-level information were used [7]. These features had significant difficulty capturing spatial information, so text-based image retrieval was more useful in the early years of retrieval systems. Although text-based retrieval systems were beneficial, they did not sufficiently reduce the workload on experts, and more automated systems became inevitable. The success of CBIR systems has risen thanks to the development of machine learning algorithms and the introduction of hand-crafted and multiple-feature extraction algorithms that capture spatial information. Methods such as Fourier descriptors [8], curvelet features [9], Gaussian models [10], color and intensity features [11], and Radon barcodes [12] are used to extract features from images. These methods generally rely on the researcher's experience and may not suit every dataset. Another obstacle is that hand-crafted methods produce only mid-level or low-level features. In image retrieval problems, a semantic gap forms between the high-level information defined by humans and the information produced by the image processing program, and high-level image features are required to close this gap. Although CNN algorithms now address this problem, hand-crafted methods are still being developed, and bag-of-words models are especially popular [13]. For these models, the most preferred feature extraction algorithms are the gray-level co-occurrence matrix (GLCM) [14], local binary patterns (LBP) [15], speeded-up robust features (SURF) [16], the scale-invariant feature transform (SIFT), and the histogram of oriented gradients (HOG) [17]. Although developments in hand-crafted features are admirable, their performance lags behind that of deep learning algorithms.
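As an illustration of what a hand-crafted descriptor looks like in practice, the following sketch computes a basic 3x3 local binary pattern (LBP) histogram. It is a simplified textbook variant, not the specific formulation of [15]; the function name `lbp_histogram` and the clockwise neighbour ordering are assumptions made for this example.

```python
import numpy as np

def lbp_histogram(image):
    """Normalized 256-bin histogram of basic 3x3 local binary patterns."""
    image = np.asarray(image, dtype=float)
    center = image[1:-1, 1:-1]
    rows, cols = center.shape
    # Eight neighbours, visited clockwise starting from the top-left pixel.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = image[dy:dy + rows, dx:dx + cols]
        # Set this bit wherever the neighbour is at least as bright as the center.
        codes += (neighbour >= center) * (1 << bit)
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

# A flat image has a single pattern: every neighbour equals the center (code 255).
flat_hist = lbp_histogram(np.ones((5, 5)))
```

The descriptor is fixed by design rather than learned, which is why such features may not transfer well across datasets.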
The most popular feature extraction method for CBMIR today is the convolutional neural network (CNN), which produces deep features. The CNN architecture is an automated method for solving image processing problems. It is the most striking architecture of recent times thanks to its specialized layers, such as convolution, pooling, ReLU, fully connected (FCL), softmax, and dropout layers [18]. The convolution layers serve as spatial filters that learn features automatically, while the pooling layers reduce the spatial size. In this way, high-level features are learned and the processing load falls as the spatial size decreases. This architecture, which is very useful for reducing the semantic gap in CBMIR systems, has been used in almost every recent study. Bressan et al. [19] used the transfer learning approach for content-based mammographic image retrieval and analyzed the role of deep features in CBMIR systems. Owais et al. [20] presented an enhanced residual network (ResNet) model for CBMIR. Ayyachamy et al. [21] used the ResNet-18 model for the CBMIR task, retrieving from computed tomography (CT), magnetic resonance imaging (MRI), mammogram (MG), and positron emission tomography (PET) image datasets. Cai et al. [22] proposed a CBMIR framework that combines a CNN with a hash coding technique; trained in a Siamese-network style without label information, the model is compatible with different loss functions. Dan et al. [23] proposed a VLAD layer in the CNN architecture to retrieve mammary cancer images. Bootwala et al. [24] presented a CBMIR method for diabetic retinopathy images using VGG-19 trained from scratch. For CNN-based methods to produce characteristic features, they must be trained with a sufficient number of labeled samples. Although labeled data is hard to find for many tasks, networks trained on other datasets can be reused through transfer learning. Transfer learning produces very satisfactory results for images containing general objects [25].
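The roles of the convolution and pooling layers described above can be illustrated with a small NumPy sketch. It assumes a single fixed, hand-chosen kernel purely for demonstration; in a real CNN the kernel weights are learned during training, and the function names here are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Valid-mode 2-D cross-correlation, the operation CNN convolution layers apply."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Rectified linear unit: negative responses are set to zero."""
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    """Non-overlapping 2x2 max pooling; odd trailing rows or columns are dropped."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 "image"
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)           # fixed horizontal-gradient filter (assumption)

response = conv2d_valid(image, kernel)              # 6x6 -> 4x4 spatial filter response
feature_map = max_pool2x2(relu(response))           # 4x4 -> 2x2 after pooling
```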
However, the objects and points of interest in medical images are quite different from those in general-purpose datasets, so it is inappropriate to apply transfer learning directly. The current trend in medical-domain studies is to combine CNN methods with hand-crafted methods [26]. These hybrid methods generally extract hand-crafted features in addition to CNN features and generate the code by classifying them together. Another approach additionally narrows the feature space using hand-crafted techniques. These methods have drawbacks in reducing the semantic gap in medical image retrieval tasks. When high-level and mid-level features are evaluated together, the code generation process can be attached to only one group. Computational complexity also increases considerably, which can make them cumbersome for real-time retrieval systems. One of the most important issues in the analysis of medical images is dealing with noise. An effective hash code requires image features with a strong representation capacity, and extracting such robust features requires dealing with the noise [27]. Kumar and Diwakar [28] proposed a locally adaptive shrinkage rule in the tetrolet domain for CT image denoising. Diwakar and Kumar [29] used a non-local means (NLM) filter and correlation-based wavelet packet thresholding to remove noise from medical images.
In this study, an effective framework based on deep features is presented for CBMIR systems, and the semantic gap is reduced. The proposed framework is suitable for real-time applications thanks to the efficient narrowing of high-level features. In the first step, image features are extracted automatically from the medical images using the CNN architecture. Although these features are quite useful for the semantic gap problem, they have the drawbacks mentioned above. To eliminate these drawbacks, the CNN architecture used with the transfer learning approach is retrained with labeled medical images. After the CNN is adequately trained for the classification task, feature vectors are obtained for each image from the FCL. These feature vectors are too long for a fast, real-time retrieval process. For this reason, the robust but long feature vectors at the FCLs are shortened by feature reduction algorithms. In the last step, the 10-parameter feature vector in the last FCL is combined with the reduced features with the help of a feature selection algorithm. Finally, these feature vectors are converted into binary form. The major contributions of this paper can be summarized as follows:
• The proposed approach reduces the semantic gap effectively without disturbing the high-level feature extraction process based on deep features.
• It combines both the classifier information and the retrieval information, combining the features of the FCL's last layer and the intermediate layer.
• Thanks to the feature selection, the most effective features are automatically selected. In this way, shorter hash codes are obtained.
• It has higher retrieval performance than other methods in the literature. Additionally, the retrieval process is faster, thanks to shorter hash codes.
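The steps outlined before the contribution list (reduce the long FCL features, combine them with the 10-value last-layer output, then binarize) can be sketched roughly as follows. This is a hedged illustration, not the method of this paper: random arrays stand in for real CNN activations, PCA is a stand-in for the feature reduction algorithm, plain concatenation replaces the feature selection step, and median thresholding is an assumed binarization rule. The sizes (256 input features, 22 components, 32 bits) are arbitrary.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project features onto their top principal components (stand-in reducer)."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def binarize(features):
    """Threshold each dimension at its median over the dataset (assumed rule)."""
    thresholds = np.median(features, axis=0)
    return (features > thresholds).astype(np.uint8)

rng = np.random.default_rng(0)
n_images = 50
fc_features = rng.normal(size=(n_images, 256))   # placeholder for intermediate FCL activations
class_scores = rng.random(size=(n_images, 10))   # placeholder for the 10-value last-layer output

reduced = pca_reduce(fc_features, n_components=22)
combined = np.hstack([reduced, class_scores])    # 22 + 10 = 32 values per image
hash_codes = binarize(combined)                  # one 32-bit code per image
```

In a working system, the projection and thresholds computed on the dataset would have to be stored and applied unchanged to each query image so that query and database codes are comparable.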
The rest of this paper is organized as follows: Section 2 introduces the difficulties addressed by the proposed method and its parameters. Details of the datasets, the experimental results, and a discussion of the results are given in Section 3. Finally, the conclusions are presented in Section 4.
Overview of the proposed framework
Access speed and retrieval accuracy are the two most essential parameters for CBMIR methods. Ongoing research focuses on improving these two parameters. A longer access code is the most obvious way to increase retrieval accuracy. However, such a code is not appropriate for real-time systems, since the access speed slows down considerably. Besides, access speed in large datasets is a genuine problem. Image access codes should be as
Experiments and experimental results
The proposed method is trained on a computer with an Intel Core i7-7700K CPU (4.2 GHz), 32 GB of DDR4 RAM, and an NVIDIA GeForce GTX 1080 graphics card.
Discussion
A system's retrieval time usually depends on the length of the feature vector, in other words, on the number of bits in the hash code. Retrieval with longer hash codes yields higher precision scores, but access is slower. Shorter hash codes reach results relatively quickly, but their average precision may not be satisfactory. Comparing retrieval speeds using the same length hash codes is inefficient. Because almost the same
Conclusion
In this study, a framework based on high-level deep features is presented for CBMIR. Because medical datasets with a sufficient number of labeled images are hard to find, retrieval on such datasets often has to rely on hand-crafted features. However, hand-crafted features do not reduce the semantic gap; high-level features are needed for that. The most critical obstacle to obtaining them is the small number of images. A class-driven retrieval approach is
CRediT authorship contribution statement
Şaban ÖZTÜRK: Conceptualization, Software, Methodology, Resources, Data Curation, Validation, Formal analysis, Investigation, Writing, Visualization.
Human and animal rights
The paper does not contain any studies with human participants or animals performed by any of the authors.
Acknowledgment
This research is funded by the Scientific and Technological Research Council of Turkey (TÜBİTAK) under grant number 120E018.
Declaration of Competing Interest
The authors declare that they have no conflicts of interest.
References (42)
Clinical applications of nuclear medicine in the diagnosis and evaluation of musculoskeletal sports injuries. Rev. Española de Med. Nucl. e Imagen Mol. (English Edition) (2020)
et al. A novel biomedical image indexing and retrieval system via deep preference learning. Comput. Methods Programs Biomed. (2018)
Stacked auto-encoder based tagging with deep features for content-based medical image retrieval. Expert Syst. Appl. (2020)
et al. Image moment invariants as local features for content based image retrieval using the Bag-of-Visual-Words model. Pattern Recognit. Lett. (2015)
et al. Mammogram classification using two dimensional discrete wavelet transform and gray-level co-occurrence matrix for detection of breast cancer. Neurocomputing (2015)
et al. A sequential search-space shrinking using CNN transfer learning and a Radon projection pool for medical image retrieval. Expert Syst. Appl. (2018)
et al. A review on CT image noise and its denoising. Biomed. Signal Process. Control (2018)
et al. CT image denoising using locally adaptive shrinkage rule in tetrolet domain. J. King Saud Univ. Comput. Inform. Sci. (2018)
et al. Independent component analysis: algorithms and applications. Neural Netw. (2000)
et al. Transfer learning with stacked reconstruction independent component analysis. Knowledge Based Syst. (2018)
Supervised deep hashing for scalable face image retrieval. Pattern Recognit.
Local bit plane adjacent neighborhood dissimilarity pattern for medical CT image retrieval. Proced. Comput. Sci.
A new approach for effective retrieval and indexing of medical images. Biomed. Signal Process. Control
A new exponentially directional weighted function based CT image denoising using total variation. J. King Saud Univ. Comput. Inform. Sci.
CT image denoising using multivariate model and its method noise thresholding in non-subsampled shearlet domain. Biomed. Signal Process. Control
Content-based image retrieval using color, shape and texture descriptors and features. Arab. J. Sci. Eng.
Content-based image retrieval and feature extraction: a comprehensive review. Math. Probl. Eng.
An overview of approaches for content-based medical image retrieval. Int. J. Multimed. Inf. Retr.
Segmentation of magnetic resonance brain image: integrating region growing and edge detection. Presented at the Proceedings, International Conference on Image Processing
Shape feature extraction using Fourier descriptors with brightness in content-based medical image retrieval. Presented at the 2008 International Conference on Intelligent Information Hiding and Multimedia Signal Processing
Image retrieval with rotation invariance. Presented at the 2011 3rd International Conference on Electronics Computer Technology