1 Introduction

The prevalence of diabetes is increasing globally at an accelerating rate. Diabetic Retinopathy (DR) is one of the main causes of vision loss if it is not diagnosed and managed properly. To minimize the risk of blindness caused by diabetic retinopathy, diabetes patients should control their blood sugar levels, blood pressure and cholesterol, in addition to undergoing regular eye screening.

Diabetic retinopathy and maculopathy screening helps identify individuals at high risk of sight impairment. Effective screening is therefore essential for early action, as well as for the preventive management of diabetic complications. Moreover, screening can detect eye problems before they start to interfere with vision, and treatment can help prevent or reduce vision loss if the problems are caught early. Retinal screening also provides information about the progression of the condition and helps determine the type of treatment if signs of diabetic retinopathy or maculopathy are detected.

A timely and complete eye examination, comprising dilated ophthalmoscopy or the assessment of high-quality fundus images in patients without previous treatment for DR or other eye disease, is the accepted screening method [1]. Eye fundus photography is the technique most frequently used in clinical studies, and it is also widely used for telemedicine and patient education. Moreover, fundus photography offers a colour or red-free image and provides many advantages compared with its predecessor, colour photographic film [1]. Nowadays, digital retinal imaging is widely used as it provides high-resolution images, faster acquisition and easy image enhancement. In diabetic retinopathy screening, the features of DR are characterized in order to reveal more precise details than clinical examination does. Modifications of conventional imaging techniques and new technological developments in the area, such as automated image interpretation, the use of large data sets and mobile applications, are expected to improve the understanding of the pathogenesis of DR [1]. In addition, Baumal et al. [1] suggest that task management and longitudinal treatment help prevent vision loss and recover a significant amount of vision in patients.

Diabetic retinopathy is a complex disease with diverse clinical findings. Its signs include microaneurysms, haemorrhages, exudates and neovascularisation. This paper focuses on the detection of diabetic retinopathy and maculopathy. Maculopathy, a disease of the macula, is the term used when yellow lesions are found near the macula. The macula is the centre of the retina and is responsible for central vision. It is a very sensitive area, and its centre, called the fovea, is a tiny region responsible for both detailed and colour vision [2]. The detection of maculopathy is therefore vital, as vision loss at the fovea causes blindness. The presence or absence of maculopathy determines whether treatment or referral is required: the patient is referred to an ophthalmologist if maculopathy is detected, whereas if it is not present, referral is unnecessary and screening is repeated after one year. Combining the detection of diabetic retinopathy and maculopathy is therefore important for supporting the management of diabetic retinopathy screening. Figure 1 shows a colour eye fundus image with diabetic retinopathy and maculopathy.

Fig. 1. Maculopathy representation in colour image [1]

Researchers working on maculopathy detection have proposed a variety of solutions to detect and classify fundus images into different stages of maculopathy, such as mild, moderate and severe [34,35,36,37,38,39]. In this paper, however, we propose a different way of combining diabetic retinopathy and maculopathy, in which the detection of both is based on the discovery of diabetic retinopathy signs and follows ophthalmologists’ practice. The severity level reported in the literature is based on the detection of diabetic retinopathy features, rather than on maculopathy severity. The proposed classification refers only to whether maculopathy is present or absent, i.e., with maculopathy and without maculopathy. As a result, the new cases are: No Diabetic Retinopathy (DR), Mild DR with/without maculopathy, Moderate DR with/without maculopathy, Severe DR with/without maculopathy, Proliferative DR with/without maculopathy and Advanced Diabetic Eye Disease (ADED). This categorization is beneficial because two important conditions can be identified in a single screening process. Furthermore, the urgent referral, which should happen within four weeks as proposed by the National Institute for Clinical Excellence [3], applies to anyone with any form of maculopathy, whether mild, moderate or severe. The severity of the maculopathy is therefore not significant, provided that its presence or absence has been determined. This paper thus presents a novel development of diabetic retinopathy detection alongside maculopathy detection, introducing effective image pre-processing techniques in conjunction with deep learning for classification. The new system has been tested on a newly developed database collected from Melaka Hospital, Malaysia.

The paper is organized as follows. Section 2 presents previous related work on automated methods for the detection of diabetic retinopathy, comprising diabetic retinopathy detection systems based on deep learning as well as maculopathy detection systems. Section 3 explains the proposed system for the detection of diabetic retinopathy alongside maculopathy in eye fundus images, which implements effective image pre-processing and deep learning techniques. Finally, Sect. 4 presents some conclusions and future work.

2 Related Previous Work

Several automated systems for detecting and diagnosing diabetic retinopathy have been reported in the literature. In addition, some researchers have proposed the detection of maculopathy to support the management of diabetic retinopathy. Overall, various techniques and methods have been proposed for the image pre-processing, feature extraction and classification phases in order to produce reliable diabetic retinopathy detection systems.

2.1 Diabetic Retinopathy Detection

For the detection of diabetic retinopathy, various machine learning techniques and methods have been proposed, used and reported in the literature [4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39]. Among them, deep learning has been used for diabetic retinopathy detection in [10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]. However, the reported systems concentrated on the general detection of diabetic retinopathy and on the detection of its individual signs.

In our earlier work, a basic automated system for general diabetic retinopathy detection, employing a combination of non-fuzzy techniques, was initially proposed [26]. Following this, we investigated the capability of different fuzzy image processing techniques for the detection of diabetic retinopathy and maculopathy in retinal images [27], which was enhanced in [28] with the segmentation of retinal structures. Different machine learning techniques were used for the classification part to categorize the images into more detailed classes of the disease. The results show that employing fuzzy image processing in addition to retinal structure localization and extraction can help produce a more reliable diabetic retinopathy screening system. Therefore, the system proposed in this paper introduces a combination of techniques for image pre-processing, together with deep learning, for better classification of diabetic retinopathy and maculopathy.

Data augmentation artificially enlarges a dataset to overcome the shortcomings of small image datasets, reducing overfitting on the image data and increasing algorithm performance. It has been used in deep learning systems for the automated detection of diabetic retinopathy. Lam et al. [10] implemented random augmentation of images in order to improve the network’s localization capability and reduce overfitting; the techniques implemented included random zero padding, zoom, rolling and rotation. Meanwhile, Xu et al. [12] proposed five types of transformation (rotation, flipping, shearing, rescaling and translation) for the automatic classification of diabetic retinopathy using deep convolutional neural networks. Other augmentation methods, namely duplication and rotation by several angles, were implemented in [15]. Rakhlin [13] applied pre-processing techniques, including normalization, scaling, centering and cropping, to retinal images for diabetic retinopathy detection with deep learning classification. Pratt et al. [16] implemented data augmentation on pre-processed images for diabetic retinopathy detection and classification, where each image was randomly rotated by 0 to 90°, randomly flipped horizontally and vertically, and randomly shifted horizontally and vertically. Besides these basic transformations, pre-processing based augmentations for colour enhancement can also be used. Ghosh et al. [17] proposed adjusting the image brightness, followed by rotations of 90 and 180°, which increased the class size six times and helped the model adapt to different orientations and lighting conditions.

2.2 Maculopathy Detection

The localisation and detection of both the macula and the fovea are essential in identifying maculopathy. Lesions in the macula region give rise to maculopathy, while the fovea is located at the centre of the macula. Automatic localization and detection of the macula in digital eye fundus images has been proposed in [30,31,32,33,34]. Meanwhile, the detection of diabetic maculopathy in retinal images has been investigated in [34,35,36,37,38,39].

Tariq et al. [36] developed an automated detection and grading system for diabetic maculopathy using digital eye fundus images. The proposed system involves pre-processing, exudate and macula detection, feature extraction, and finally a classification stage using a Gaussian Mixture Model classifier, where the input image is graded into three categories: healthy, non-clinically significant macular edema, and clinically significant macular edema. The same diabetic maculopathy classification (normal, non-clinically significant macular edema, and clinically significant macular edema) was also used by Chowriappa et al. [39]. Their system extracted textural features and classified the eye fundus images into their classes of disease severity using four ensemble classifiers, employing the tree-based J48, naive Bayes, sequential minimal optimization and hidden naive Bayes classifiers.

Meanwhile, a computer system for detecting diabetic maculopathy in human eye fundus images, employing morphological operations, was proposed by Vimala and Kajamohideen [34]. The green component of the colour input image was extracted, followed by median filtering and contrast limited adaptive histogram equalization. The macula was detected by employing top-hat and bottom-hat transforms. Colour and texture features were then extracted to grade the pre-processed image into two classes, exudates present or exudates absent, using a Support Vector Machine as the classifier. Punnolil [35] presents a system for diagnosing diabetic maculopathy severity that employs image pre-processing (colour normalization), optic disc detection, macula and fovea localization, and then the detection of exudates and haemorrhages. After these steps, several features were extracted and classified into maculopathy severity grades (normal, mild, moderate and severe) using a Support Vector Machine as the classifier.

Siddalingaswamy and Prabhu [37] proposed a system for the automatic grading of diabetic maculopathy severity. The system first performs green component extraction, optic disc detection, and fovea and macular region detection, and then detects hard exudate lesions using mathematical morphology and clustering techniques. The maculopathy severity is classified as normal, mild, moderate or severe, based on the location of the exudates in the marked macular region. Another automated computer-based system for maculopathy diagnosis in diabetic retinopathy screening was presented by Hunter et al. [38], in which the detection and filtering of candidate lesions, the extraction of features and classification by a multilayer perceptron were implemented.

In summary, the detection of maculopathy is important because an untreated affected macula will eventually contribute to the loss of vision. Researchers are therefore proposing solutions to this challenging problem of detecting maculopathy in retinal images. However, in the previously reported maculopathy detection systems, image augmentation based techniques have not been implemented during the pre-processing stage in conjunction with deep learning for the classification stage. This paper therefore presents a novel diabetic retinopathy and maculopathy detection system based on such a combination of techniques.

3 Proposed Approach

In this paper, a deep learning approach that utilizes “on-the-fly” data augmentation techniques is proposed. The combination of normal and diabetic retinopathy eye fundus images from a novel dataset, which was collected from the Eye Clinic, Department of Ophthalmology, Melaka Hospital, Malaysia, is used to evaluate the model. The new dataset, with a total of 600 colour eye fundus images, contains images of size 3872 × 2592 pixels saved in JPEG format. The dataset is presented in detail in [29].

In the proposed approach, input retina images are first pre-processed to facilitate the classification process. The processed images are then fed into a deep convolutional neural network (DCNN) for classification. This DCNN is trained using augmented images of the retina with varying levels of retinopathy and maculopathy.

3.1 Image Pre-processing

All input images, from both the training and testing sets, are reduced in size from their original 3872 × 2592 pixels to 242 × 162 pixels. This reduction is intended to maximize the performance of the model while preserving as many features as possible from the original image. The aspect ratio of the images is also preserved to maintain the shapes and spatial features contained in the original images.
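The stated sizes imply a uniform reduction by a factor of 16 on each side, although the text does not name the factor explicitly. The following minimal sketch only checks that arithmetic; the helper name `scaled_size` and the factor of 16 are assumptions inferred from the two resolutions, not part of the described system.

```python
from fractions import Fraction


def scaled_size(width, height, factor):
    """Divide both sides by the same integer factor (hypothetical helper)."""
    if width % factor or height % factor:
        raise ValueError("factor must divide both dimensions exactly")
    return width // factor, height // factor


original = (3872, 2592)                     # width x height, as stated in the text
resized = scaled_size(*original, factor=16)  # factor inferred, not stated

# Equal reduced fractions mean the aspect ratio is unchanged by the resize.
same_ratio = Fraction(*original) == Fraction(*resized)
print(resized, same_ratio)
```

Because both dimensions are divided by the same integer, the width-to-height ratio is exactly preserved, which is the property the paragraph above relies on.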

3.2 Deep Convolutional Neural Network

A DCNN model was designed to classify the input images into their respective classes. The model starts with an image input layer, where the image size is specified, which in this case is 3 × 162 × 242. Following that, four convolutional layers are implemented, each followed by a 2-dimensional batch normalization layer, a rectified linear unit (ReLU) and a max-pooling layer. The filter sizes are gradually reduced through the four layers, which reduce the input (3 × 162 × 242) to feature maps of size (128 × 9 × 14). The specific parameters used for each layer can be found in Fig. 2. Following the last convolutional layer, the features are flattened into a fully connected layer, which also implements a ReLU activation function:

$$ f(x) = \max(0, x) $$
(1)

Fig. 2. Convolutional neural network structure used
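To make the size reduction concrete, the sketch below traces feature-map shapes through four convolution and max-pooling blocks. The layer parameters are given only in Fig. 2, so everything here is an assumption: a 3 × 3 unpadded convolution and a 2 × 2 ceiling-mode pooling in every block, and the intermediate channel counts 16, 32 and 64. This is merely one hypothetical configuration that happens to arrive at 9 × 14 with 128 channels; the actual network may use different settings.

```python
import math


def conv_out(size, kernel, padding=0, stride=1):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2, stride=2, ceil_mode=True):
    """Output length of max-pooling along one spatial axis."""
    steps = (size - window) / stride
    return (math.ceil(steps) if ceil_mode else math.floor(steps)) + 1


def trace_shapes(height, width, channels_per_block, kernel=3):
    """Return (channels, height, width) after each conv + pool block.

    Batch normalization and ReLU do not change the shape, so they are omitted.
    """
    shapes = []
    for channels in channels_per_block:
        height, width = conv_out(height, kernel), conv_out(width, kernel)
        height, width = pool_out(height), pool_out(width)
        shapes.append((channels, height, width))
    return shapes


# Channel counts 16, 32, 64 are hypothetical; only the final 128 is in the text.
shapes = trace_shapes(162, 242, channels_per_block=(16, 32, 64, 128))
for shape in shapes:
    print(shape)

channels, height, width = shapes[-1]
print("flattened features:", channels * height * width)
```

Under these assumptions the flattened vector passed to the fully connected layer would have 128 · 9 · 14 = 16128 entries.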

This layer is then connected to an output layer whose number of neurons depends on the categorization (shown in Table 1). The last classification layer implements a softmax activation function, which produces the probability of each image belonging to each class.
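As a small illustration of the two activation functions mentioned, the sketch below implements ReLU and a numerically stable softmax on plain Python lists. The score values are made up for the example and do not come from the trained model.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive values, clamps the rest to zero."""
    return max(0.0, x)


def softmax(scores):
    """Convert raw class scores into probabilities that sum to one."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical output-layer scores for a two-class categorization.
probs = softmax([2.0, 0.5])
predicted = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted)
```

The predicted class is simply the index with the highest probability, so the output layer needs one neuron per class in the chosen categorization.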

Table 1. Different categorizations

3.3 Training

Following the image pre-processing stage, the model was trained using the resized images. During training, on-the-fly data augmentation was implemented to increase the number of training examples in the dataset. This helps prevent the model from overfitting, while also helping to “calibrate” the high number of parameters in deep models such as the one used in this work. The implemented data augmentation is tailored to the nature of retina images, which, unlike natural scene images, are more standardized in terms of contrast and angles. Therefore, random rotation was the only data augmentation technique implemented, with the resized images rotated randomly by an angle between −20 and 20°.
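A minimal sketch of this augmentation step is given below, assuming a uniformly sampled angle and nearest-neighbour resampling on a single-channel image stored as a list of rows. The sampling distribution, the interpolation method, the fill value for uncovered corners and both function names are assumptions; the text specifies only the angle range.

```python
import math
import random


def random_angle(rng, limit=20.0):
    """Sample a rotation angle in degrees from [-limit, limit]."""
    return rng.uniform(-limit, limit)


def rotate_nearest(image, degrees, fill=0):
    """Rotate a 2-D list of pixels about its centre (nearest-neighbour).

    Each output pixel is mapped back to a source pixel; positions that fall
    outside the source image receive the fill value. The visual direction of
    rotation depends on the image axis convention.
    """
    rows, cols = len(image), len(image[0])
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    cos_a = math.cos(math.radians(degrees))
    sin_a = math.sin(math.radians(degrees))
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dy, dx = r - cy, c - cx
            src_x = cos_a * dx + sin_a * dy + cx
            src_y = -sin_a * dx + cos_a * dy + cy
            sr, sc = round(src_y), round(src_x)
            if 0 <= sr < rows and 0 <= sc < cols:
                out[r][c] = image[sr][sc]
    return out


rng = random.Random(0)  # fixed seed only to make the example repeatable
image = [[r * 4 + c for c in range(4)] for r in range(4)]
angle = random_angle(rng)
augmented = rotate_nearest(image, angle)
print(round(angle, 2), augmented)
```

In an on-the-fly setting a fresh angle would be drawn every time an image is presented to the network, so the model rarely sees exactly the same input twice.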

The model was trained separately for each of the three categorizations shown in Table 1. The first categorization splits the images into two main cases: “no diabetic retinopathy” and “diabetic retinopathy”. The second categorization is based on maculopathy detection and classifies the images into two other cases: “maculopathy detected” and “maculopathy not detected”. The third categorization, representing the experts’ original classification, provides more detail and involves ten stages of retinopathy.
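The relationship between the three categorizations can be sketched as a label mapping. The ten detailed classes below are reconstructed from the case list given in the Introduction; treating "No DR" as having no maculopathy, counting ADED as diabetic retinopathy, and leaving the maculopathy status of ADED undefined are all assumptions, since Table 1 is the authoritative definition.

```python
# The four DR grades that are split by maculopathy status in the Introduction.
GRADES = ("Mild DR", "Moderate DR", "Severe DR", "Proliferative DR")


def build_ten_classes():
    """List (grade, has_maculopathy) pairs for the detailed categorization."""
    classes = [("No DR", False)]  # assumption: no DR implies no maculopathy
    for grade in GRADES:
        classes.append((grade, False))
        classes.append((grade, True))
    classes.append(("ADED", None))  # maculopathy status not stated in the text
    return classes


def to_categorization_one(label):
    """Collapse a detailed label into the DR / no-DR categorization."""
    grade, _ = label
    return "no diabetic retinopathy" if grade == "No DR" else "diabetic retinopathy"


def to_categorization_two(label):
    """Collapse a detailed label into the maculopathy categorization."""
    _, has_maculopathy = label
    if has_maculopathy is None:
        return None  # left undefined rather than guessed
    return "maculopathy detected" if has_maculopathy else "maculopathy not detected"


classes = build_ten_classes()
for label in classes:
    print(label, "->", to_categorization_one(label), "|", to_categorization_two(label))
```

Both two-class categorizations are therefore coarser views of the same expert labels, which is why one dataset can be used to train all three models.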

All three models were trained using the Cross Entropy Criterion to calculate the error at the last classification layer. The final error for the model is calculated using the cross entropy function (C):

$$ C = -\sum_{i = 1}^{N} \sum_{j = 1}^{K} t_{ij} \ln y_{ij} $$
(2)

where \( N \) is the total number of images, \( K \) is the number of classes, \( t_{ij} \) is the indicator that sample \( i \) belongs to class \( j \), and \( y_{ij} \) is the model’s output for sample \( i \) for class \( j \). Stochastic gradient descent with momentum (SGDM) was used for optimization, with an initial learning rate of 0.0001 and a momentum of 0.9. Each model was trained for a total of 100 epochs using a random 70% split of the dataset as the training set. For each of the three models, two variants were trained, one with and one without data augmentation.
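The loss and the optimizer update can be illustrated on toy numbers. The sketch below computes the cross-entropy in its usual negated form (so that the loss is non-negative) and applies one common formulation of the momentum update; the exact update rule used by the training framework is not given in the text, so `sgdm_step` is an assumption, as are the example targets, outputs and gradient.

```python
import math


def cross_entropy(targets, outputs):
    """Negated sum over samples and classes of t_ij * ln(y_ij).

    Terms with t_ij == 0 are skipped, which also avoids evaluating ln(0).
    """
    return -sum(
        t * math.log(y)
        for t_row, y_row in zip(targets, outputs)
        for t, y in zip(t_row, y_row)
        if t
    )


def sgdm_step(weights, grads, velocity, lr=0.0001, momentum=0.9):
    """One SGD-with-momentum update (classical form, assumed here)."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


# Two samples, two classes: one-hot targets and made-up softmax outputs.
targets = [[1, 0], [0, 1]]
outputs = [[0.8, 0.2], [0.4, 0.6]]
loss = cross_entropy(targets, outputs)

# A single parameter with a made-up gradient, using the stated hyperparameters.
weights, velocity = sgdm_step([1.0], [2.0], [0.0])
print(loss, weights, velocity)
```

With one-hot targets only the probability assigned to the true class contributes, so the loss falls toward zero as those probabilities approach one.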

3.4 Results

All trained models were tested using the same test set without any alteration other than the resizing. A summary of the classification accuracies for the different categories is shown in Table 2. The effect of the proposed data augmentation technique is apparent in the enhanced performance in all categories. This means that, even with the small number of training samples in the dataset, the proposed classification models were able to achieve reasonable accuracies. It is also clear that the classifiers trained for two classes achieved the highest accuracy, because each class has a comparably higher number of samples than in the other categorization.

Table 2. Summary of results

Meanwhile, for more clarity, the generated confusion matrices are presented in Fig. 3(a) and (b) to show the relative performance of the classifier. The confusion matrices for both model variants of categorization I and categorization II show that the sensitivity (the percentage of abnormal images classified as abnormal) is higher than the specificity (the percentage of normal images classified as normal). The sensitivity and specificity values for both variants of the second categorization are similar. For the first and second categorizations, the classification accuracy of the model with data augmentation is higher than or similar to that of the model without data augmentation. However, for the third categorization, the classification accuracy of the model without data augmentation is higher than that of the model with data augmentation. This is due to the fact that categorization III has a hugely imbalanced number of images for some cases, particularly for the severe cases of DR. Although categorization II (maculopathy detection) is also imbalanced between its two main cases, its classification accuracy is the highest among the three categorizations. The model was able to detect the presence of maculopathy well, as the maculopathy can be seen clearly in the quality images provided, and the model was therefore capable of distinguishing maculopathy lesions in the eye fundus images.

Fig. 3. (a, b) Confusion matrices
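The sensitivity and specificity discussed above follow directly from the four cells of a two-class confusion matrix. The sketch below shows the calculation; the counts are invented purely for illustration and are not the values reported in Fig. 3 or Table 2.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: abnormal images classified as abnormal
    fn: abnormal images classified as normal
    tn: normal images classified as normal
    fp: normal images classified as abnormal
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def accuracy(tp, fn, tn, fp):
    """Fraction of all images that were classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


# Illustrative counts only; not taken from the reported experiments.
counts = dict(tp=45, fn=5, tn=32, fp=8)
sens, spec = sensitivity_specificity(**counts)
acc = accuracy(**counts)
print(sens, spec, acc)
```

On an imbalanced test set the accuracy is dominated by the larger class, which is why sensitivity and specificity are worth reporting alongside it.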

It can be concluded that using balanced or nearly balanced datasets, and suitable data augmentation otherwise, helps increase the classification accuracy. These two factors should be considered in the development of better detection and classification models.

4 Conclusions and Future Work

An approach for the detection of diabetic retinopathy and maculopathy in colour eye fundus images, implementing data augmentation techniques and deep learning, has been proposed in this paper. In summary, it is challenging to detect diabetic retinopathy and maculopathy, particularly when using a small and imbalanced dataset for classification. The use of image pre-processing techniques for data augmentation helped improve the classification performance. The classification models can be further enhanced by employing different image augmentation techniques or different combinations of pre-processing techniques, including fuzzy techniques such as fuzzy transform, fuzzy histogram equalization and fuzzy filtering, as in our previous work [27,28,29]. In addition, the segmentation of retinal structures, such as the extraction of blood vessels and the localization of the optic disc, can be implemented in order to increase the maculopathy detection performance.

Deep learning models have been deployed for the classification of diabetic retinopathy and maculopathy. Problem-specific data augmentation was implemented to overcome the challenges presented by the classification task. The proposed models classify the input images according to three different taxonomies of classes, for the purpose of generating a diversity of results and performance analysis. The three types of classification consist of two 2-class classifications and one 10-class classification. The classification can be enhanced by using another categorization involving four cases: a no retinopathy class, a non-proliferative diabetic retinopathy class (mild, moderate and severe cases), a proliferative diabetic retinopathy class and an advanced diabetic eye disease class. Pre-trained image classification networks should also be considered, as they have already been trained to identify specific visual features that could be useful when generalized to different tasks such as this one. Additionally, different parameters and deep learning architectures should be explored in order to generate better classification and eventually yield a more reliable and accurate detection of diabetic retinopathy and maculopathy. A further aspect to investigate is interpretability: deep learning methods are difficult for humans to interpret, which creates serious challenges, including that of interpreting a predictive result when it turns out to be incorrect [40].