Elsevier

Computers in Biology and Medicine

Volume 90, 1 November 2017, Pages 68-75

No-reference quality index for color retinal images

https://doi.org/10.1016/j.compbiomed.2017.09.012

Highlights

  • No-reference wavelet-based sharpness index to assess retinal image quality.

  • Index modified by homogeneity parameter to account for illumination nonuniformity.

  • Evaluation performed using two datasets of different resolutions and degree of blur.

  • Multi-class classification showing high micro average F-measure of 0.84 and 0.95.

  • Strong and highly significant correlation between quality index and experts' scores.

Abstract

Retinal image quality assessment (RIQA) is essential to ensure that the images investigated by ophthalmologists or automatic systems are suitable for reliable medical diagnosis. Measure-based RIQA techniques have several advantages over the more commonly used binary classification-based RIQA methods. Numeric quality measures can help ophthalmologists associate a degree of confidence with a diagnosis made from a given retinal image. Moreover, a numeric quality index can provide a means of identifying the degree of enhancement required as well as of evaluating and comparing the improvement achieved by enhancement techniques. In this work, a no-reference numeric sharpness quality index for retinal images is introduced that is computed from the wavelet decomposition of the images. In order to account for the obscured retinal structures in unevenly illuminated image regions, the quality index is modified by a homogeneity parameter calculated from the previously introduced retinal image saturation channel. The proposed quality index was validated and tested on two datasets having different resolutions and quality grades. A strong (Spearman's coefficient > 0.8) and statistically highly significant (p-value < 0.001) correlation was found between the introduced quality index and the subjective human scores for the two different datasets. Moreover, multiclass classification using solely the devised retinal image quality index as a feature resulted in a micro average F-measure of 0.84 and 0.95 using the high and low resolution datasets, respectively. Several comparisons with other retinal image quality measures demonstrated the superiority of the proposed quality index in both performance and speed.

Introduction

Retinal images are being increasingly used by ophthalmologists as well as in automatic systems for medical diagnosis and follow-up of retinal diseases such as diabetic retinopathy, glaucoma, and age-related macular degeneration. However, some retinal images can be unsuitable for reliable medical analysis and diagnosis due to their insufficient quality, most commonly due to reduced sharpness and/or uneven illumination. Poor quality retinal images can result from several factors including inadequate imaging conditions (e.g., insufficient illumination, poor focus) or patient related issues (e.g., pupil dilation, patient fixation, media opacity) [1], [2]. Several studies have shown that practical datasets can include a large proportion of poor quality retinal images, as high as 60% of the total images [3], [4].

Employing poor quality retinal images in medical investigations has several drawbacks. If a poor quality retinal image is not immediately identified, the ophthalmologist would have to request a recapture, costing both the ophthalmologist and the patient time and money, especially when the imaging procedure and the medical facility are in distant locations. A worse scenario can occur if automatic screening systems are used for the medical diagnosis. Automatic retinal screening systems capture and process retinal images without any human intervention. Based on this analysis, the patient is advised whether or not further physical investigation is needed, depending on whether early disease symptoms were detected in the processed images. If poor quality retinal images are used within the automatic systems, misdiagnosis could occur, leading to delayed treatment. As a result, the disease can progress further, causing irreversible visual impairment that could lead to blindness.

No-reference retinal image quality assessment (RIQA) algorithms are being increasingly integrated as a preliminary preprocessing step in medical retinal image analysis in order to assure reliability of the performed diagnosis. RIQA algorithms automatically determine whether the captured images are suitable for reliable medical analysis in the absence of a gold standard reference image. Good quality retinal images are then stored and passed on for further processing, whereas poor quality images are either enhanced, if possible, to improve their quality or discarded altogether and an image recapture is performed. Generally, RIQA algorithms can be categorized into classification- and measure-based methods [5].

Classification-based RIQA algorithms rely on supervised learning methods to classify images into a specific quality class. Binary classification-based RIQA methods are very widely implemented in the literature [3], [4], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], where retinal images are considered to be either of excellent quality, making them readily suitable for direct medical investigation, or of severely poor quality, rendering them unsuitable for medical diagnosis. However, captured retinal images may be of adequate quality, which would require the application of proper image enhancement techniques (e.g., sharpening, luminosity improvement) before they are suitable for medical analysis. In this study, a practical dataset manually graded by human experts into good, adequate, and poor quality was found to have ∼45% of its images of adequate quality. In a previous work, Katuwal et al. [16] presented a multi-class RIQA algorithm that categorized images into one of five different quality groups using several features related to the symmetry of the wavelet segmented blood vessels. The micro-averaged F-measure was found to be 0.6. The relatively low F-measure was attributed to the close similarity between the neighboring quality classes, leading to increased misclassifications.
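For readers unfamiliar with the micro-averaged F-measure quoted above, the following is a minimal Python sketch of how it is commonly computed for a multi-class problem: true positives, false positives, and false negatives are pooled over all classes before precision and recall are formed. The function name and the example labels are hypothetical and are not taken from the paper or from Katuwal et al. [16].

```python
def micro_f_measure(y_true, y_pred):
    """Micro-averaged F-measure: pool TP/FP/FN over all classes first."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1          # correctly assigned to this class
            elif pred == label:
                fp += 1          # assigned to this class by mistake
            elif truth == label:
                fn += 1          # belongs to this class but was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical grades for six images (illustrative only).
y_true = ["good", "good", "adequate", "adequate", "poor", "poor"]
y_pred = ["good", "adequate", "adequate", "adequate", "poor", "good"]
print(micro_f_measure(y_true, y_pred))
```

When every image receives exactly one label, as here, each misclassification counts once as a false positive and once as a false negative, so the micro-averaged F-measure coincides with plain accuracy (4 of 6 correct in this toy example).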

Measure-based RIQA algorithms compute a numeric quality index that is related to the quality of the retinal image. Numeric measures can help ophthalmologists associate a degree of certainty with a diagnosis made by inspecting a given retinal image, based on the value of the image's quality index [17]. Furthermore, numeric quality measures can facilitate evaluating the necessity and effectiveness of image enhancement methods applied to improve an image's quality, making it more suitable for reliable medical diagnosis [17]. As a result, more retinal images of adequate quality can be used efficiently in medical analysis and diagnosis after proper enhancement, thus reducing the need for image recaptures. Numeric quality metrics have been commonly implemented in several fields including quality assessment of satellite images [18], stereoscopic images [19], [20], [21], underwater images [22], [23], generic images [24], [25], as well as for the evaluation of retinal image registration algorithms [26]. However, only a few measure-based RIQA approaches exist in the literature.

In the early work of Lee and Wang [17], a quality index was assigned to a test retinal image based on the similarity between its histogram and a template histogram created from a group of excellent quality retinal images. Although it was mentioned that their quality measure agreed with human perception, no correlation measure was given to quantify the degree of this agreement. In a more recent work, Bartling et al. [27] introduced a retinal image quality measure computed as the product of wavelet-based sharpness and illumination features. Retinal images were then classified into one of three quality categories based on intervals of the presented quality measure. The kappa value between the automatic evaluation and six human graders was found to be in the range of [0.52, 0.68]. Köhler et al. [5] presented a numeric RIQA index based on an adaptation of the general approach in which the retinal vessel tree was taken as guidance to determine a global quality score from local estimates in anisotropic patches [28]. Spearman's rank correlation coefficient between their proposed measure and each of the peak-signal-to-noise ratio (PSNR) and the structural similarity (SSIM) [29] full reference metrics was found to be 0.89 and 0.91, respectively. However, the PSNR has been reported to be inefficient in matching human judgement of image quality [30], [31] whereas the SSIM index was shown to be less competitive in assessing image quality related to its sharpness [32]. Measure-based retinal image quality techniques are thus still in their early stages which is indicated by the limited research and the relatively low statistical correlation, if any was given, between the existing measures and human experts.
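As background for the full-reference metrics mentioned above, the sketch below computes the PSNR between a reference image and a test image using the standard definition, 10·log10(MAX² / MSE). It is only meant to show why PSNR needs a reference image, unlike the no-reference index proposed in this work; the function name and the 8-bit default for `max_value` are assumptions made for this illustration.

```python
import math

import numpy as np


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    ref = np.asarray(reference, dtype=float)
    tst = np.asarray(test, dtype=float)
    if ref.shape != tst.shape:
        raise ValueError("images must have the same shape")
    mse = float(np.mean((ref - tst) ** 2))  # mean squared error
    if mse == 0.0:
        return math.inf                     # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: a flat image and a copy offset by one gray level.
reference = np.full((4, 4), 100.0)
print(psnr(reference, reference + 1.0))
```

Because the score depends only on pixel-wise differences from the reference, it says nothing about whether the differences affect clinically relevant structures, which is consistent with the reported mismatch with human judgement [30], [31].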

In this work, a no-reference wavelet-based quality index is introduced to assess the quality of color retinal images based on their overall sharpness. The wavelet transform separates the sharpness and background information of an image into its detail and approximation subbands, respectively. It thus has the advantage of being consistent with the theory of human visual processing, which indicates that the eye has different optical paths for high and low frequencies [33]. Furthermore, wavelet multiresolution analysis tends to bring out finer image details in the subsequent wavelet levels, which were shown in previous work by Nirmala et al. [34] to be related to the different retinal structures. Recently, wavelet-based RIQA algorithms have been introduced to overcome the limitations of the more commonly used generic and segmentation-based features by considering information related to the retinal structures and by being computationally inexpensive while giving superior results [35]. Moreover, it was shown that transform-based retinal image quality features can be adapted to maintain reliable performance in practical scenarios in which the train and test datasets had significantly different image resolutions [36].
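The separation of sharpness and background information into detail and approximation subbands can be illustrated with a short Python sketch. It uses a hand-written orthonormal Haar transform and reports the share of image energy that falls in the detail subbands. This is not the quality index defined in the paper (the preview does not give its formula); the Haar wavelet, the three decomposition levels, and the function names are assumptions chosen only to keep the example self-contained.

```python
import numpy as np


def haar_dwt2(image):
    """One level of the 2-D orthonormal Haar transform.

    Returns the approximation subband and a tuple of three detail subbands.
    """
    x = np.asarray(image, dtype=float)
    x = x[: x.shape[0] // 2 * 2, : x.shape[1] // 2 * 2]  # crop to even size
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    approx = (a + b + c + d) / 2.0       # low-pass in both directions
    detail_rows = (a + b - c - d) / 2.0  # differences between rows
    detail_cols = (a - b + c - d) / 2.0  # differences between columns
    detail_diag = (a - b - c + d) / 2.0  # diagonal differences
    return approx, (detail_rows, detail_cols, detail_diag)


def detail_energy_ratio(image, levels=3):
    """Fraction of total energy held in the detail subbands."""
    approx = np.asarray(image, dtype=float)
    detail_energy = 0.0
    for _ in range(levels):
        approx, details = haar_dwt2(approx)
        detail_energy += sum(float(np.sum(band ** 2)) for band in details)
    total = detail_energy + float(np.sum(approx ** 2))
    return detail_energy / total if total > 0.0 else 0.0


# A one-pixel checkerboard (maximally "sharp") versus a flat image.
checkerboard = np.indices((64, 64)).sum(axis=0) % 2
flat = np.ones((64, 64))
print(detail_energy_ratio(checkerboard), detail_energy_ratio(flat))
```

The flat image has no detail energy at all, whereas the checkerboard concentrates half of its energy in the first-level detail subbands, which is the behavior a sharpness-oriented measure relies on.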

The introduced wavelet-based quality index is modified using a homogeneity parameter to account for reduced retinal image quality due to hidden structures in unevenly illuminated regions, thus increasing the reliability of the introduced measure. The homogeneity parameter was computed from the retinal saturation channel (Sretina) which was previously introduced by the authors to assess retinal image homogeneity [37].

In order to validate and test the proposed retinal image quality index, several analyses were performed. Two manually graded retinal image quality datasets having different resolutions, numbers of quality groups, and degrees of blurring of the bad quality images were included in these analyses. Initially, Spearman's rank correlation coefficient between the introduced quality index and the human graders was computed for the different datasets. Next, the retinal image quality index was input to a classifier to categorize the images into good, adequate, and bad quality retinal images. Finally, the presented quality index was employed to compare the performance of contrast limited adaptive histogram equalization (CLAHE), which is commonly implemented for retinal image contrast enhancement [38], [39], when it is applied to different color models. Several comparisons with other retinal image quality measures from the literature are also presented, showing the superiority of the proposed index in reliability, performance, and computational time.
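The first analysis, Spearman's rank correlation between a numeric index and human grades, can be sketched as follows. Human grades contain many ties (only a few quality classes), so the sketch assigns average ranks to tied values and then takes the Pearson correlation of the ranks. The function names and the toy data are hypothetical; the significance test (p-value) reported in the paper is not reproduced here.

```python
import numpy as np


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    sorted_values = values[order]
    ranks = np.empty(len(values), dtype=float)
    i = 0
    while i < len(values):
        j = i
        # Extend j to the end of the run of equal values starting at i.
        while j + 1 < len(values) and sorted_values[j + 1] == sorted_values[i]:
            j += 1
        ranks[order[i : j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's coefficient: Pearson correlation of the average ranks."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


# Hypothetical quality index values and expert grades (1 = bad, 3 = good).
quality_index = [0.10, 0.40, 0.35, 0.80]
expert_grade = [1, 2, 2, 3]
print(spearman_rho(quality_index, expert_grade))
```

Because only ranks are used, the coefficient is insensitive to the scale of the index and reflects only how consistently higher index values go with higher expert grades. If either input is constant, the correlation is undefined and `np.corrcoef` returns `nan`.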

The rest of the paper is organized as follows: Section 2 summarizes the datasets used for devising and validating the proposed retinal image quality index. Section 3 presents the details of the retinal image quality measure and the theory behind its formation. Section 4 includes the results and discussion of the statistical tests and classification experiments performed to validate the introduced quality index. Moreover, several comparisons with other quality measures from literature are presented. Finally, Section 5 summarizes the conclusions of the presented work.

Section snippets

Materials

Several quality graded datasets were included in this study. The details of these datasets are as follows:

Dataset-1 (DS1) [40]: consists of 301 optic disc centered retinal images acquired with a Kowa nonmyd fundus camera and having a resolution of 1600 × 1212 pixels. Three human graders evaluated the quality of the images, then a majority vote determined their final quality class resulting in 236 good and 65 bad quality retinal images.

Dataset-2 (DS2): includes 190 optic disc centered retinal

Methods

Good quality retinal images should be sharp and evenly illuminated to facilitate the detection of retinal structures and possible disease lesions by both ophthalmologists and automatic systems. Wavelet transform has the ability to separate the sharpness information related to different retinal structures within the detail subbands whereas the low frequency information resides in the approximation subband. The introduced retinal image quality measure is hence computed from the wavelet

Results & discussion

Several experiments were performed to validate the efficiency of the introduced retinal image quality measure. Initially, the Spearman's rank correlation coefficient was computed to estimate the strength of association between the introduced quality index (Qrm) and the human graders. The proposed quality index was then used to classify retinal images into different quality groups and classification results are reported. Next, Qrm was calculated to compare the improvement induced in retinal

Conclusions

Numeric-based retinal image quality measures have several advantages over the more widely implemented classification-based approaches. Ophthalmologists can benefit from numeric quality measures in asserting a degree of confidence to the diagnosis performed based on examining a certain retinal image. Furthermore, automatic screening systems as well as imaging technicians can use the numeric quality index as a guidance to determine the degree of required image enhancement as well as to measure

Conflict of interest

None declared.

References (49)

  • W. Kusakunniran, Automatic quality assessment and segmentation of diabetic retinopathy images

  • D. Mahapatra, Retinal image quality classification using saliency maps and CNNs

  • B. Remeseiro et al., Objective quality assessment of retinal images based on texture features

  • J.M.P. Dias et al., Retinal image quality assessment using generic image quality indicators, Inf. Fusion (2014)

  • U. Şevik, Identification of suitable fundus images using automated quality assessment methods, J. Biomed. Opt. (2014)

  • D. Veiga, Quality evaluation of digital fundus images through combined measures, J. Med. Imaging (2014)

  • M. Fasih, Retinal image quality assessment using generic features

  • S. Wang, Human visual system-based fundus image quality assessment of portable fundus camera photographs, IEEE Eng. Med. Biol. Soc. (2016)

  • G.J. Katuwal, Automatic fundus image field detection and quality assessment

  • S. Lee et al., Automatic retinal image quality assessment and enhancement

  • A. Samani et al., No-reference quality metrics for satellite weather images and sky images

  • Y. Lin, Quality index for stereoscopic images by jointly evaluating cyclopean amplitude and cyclopean phase, IEEE J. Sel. Top. Signal Process. (2017)

  • H. Wang, No-reference stereoscopic image-quality metric accounting for left and right similarity map and spatial structure degradation, Opt. Lett. (2016)

  • K. Panetta et al., Human-visual-system-inspired underwater image quality measures, J. Ocean. Eng. (2016)