Glaucoma Detection from Raw SD-OCT Volumes: A Novel Approach Focused on Spatial Dependencies

https://doi.org/10.1016/j.cmpb.2020.105855

Highlights

  • Implementation of a CNN-LSTM-based approach for glaucoma detection from raw spectral-domain optical coherence tomography (SD-OCT) volumes.

  • A new algorithm based on morphological operations to extract the region of interest of each B-scan of the SD-OCT cube.

  • A new slide-level feature extractor (RAGNet) composed of two innovative modules: (i) convolutional blocks included via residual connections in parallel with fine-tuned state-of-the-art architectures, and (ii) an attention module with skip-connections intended to refine the features embedded in the latent space.

  • Development of a sequential-weighting module (SWM) that operates on the LSTM outputs to provide a holistic feature vector before the top model, which enables more stable and higher-quality learning.

  • Design of two new metrics (OVFT and QLTY) to measure the overfitting and the quality of the validation learning curves, respectively.

  • Computation of the class activation maps to extract slide-level heat maps to elucidate the most interesting regions of the B-scans from a volume-based prediction.

  • The proposed method, based on a combination of CNN and LSTM networks, outperforms the 3D architectures of state-of-the-art studies.

Abstract

Background and objective: Glaucoma is the leading cause of blindness worldwide. Many studies based on fundus images and optical coherence tomography (OCT) imaging have been developed in the literature to help ophthalmologists through artificial-intelligence techniques. Currently, 3D spectral-domain optical coherence tomography (SD-OCT) samples have gained importance because they may contain valuable information for glaucoma detection.

To analyse the hidden knowledge of the 3D scans for glaucoma detection, we have proposed, for the first time, a deep-learning methodology based on leveraging the spatial dependencies of the features extracted from the B-scans.

Methods: The experiments were performed on a database composed of 176 healthy and 144 glaucomatous SD-OCT volumes centred on the optic nerve head (ONH). The proposed methodology consists of two well-differentiated training stages: a slide-level feature extractor and a volume-based predictive model. The slide-level discriminator is characterised by two new convolutional modules (residual and attention), which are combined via skip-connections with other fine-tuned architectures. Regarding the second stage, we first carried out a data-volume conditioning before extracting the features from the slides of the SD-OCT volumes. Then, Long Short-Term Memory (LSTM) networks were used to combine the recurrent dependencies embedded in the latent space to provide a holistic feature vector, which was generated by the proposed sequential-weighting module (SWM).

Results: The feature extractor reports AUC values higher than 0.93 on both the primary and external test sets. Moreover, the proposed end-to-end system based on a combination of CNN and LSTM networks achieves an AUC of 0.8847 in the prediction stage, which outperforms other state-of-the-art approaches intended for glaucoma detection. Additionally, Class Activation Maps (CAMs) were computed to highlight the most interesting regions per B-scan when discerning between healthy and glaucomatous eyes from raw SD-OCT volumes.

Conclusions: The proposed model is able to extract the features from the B-scans of the volumes and combine the information of the latent space to perform a volume-level glaucoma prediction. Our model, which combines residual and attention blocks with a sequential-weighting module to refine the LSTM outputs, surpasses the results achieved by current state-of-the-art methods focused on 3D deep-learning architectures.

Introduction

Glaucoma is a group of progressive optic neuropathies that affect the optic nerve, causing visual field defects and structural changes [1]. This chronic disease is currently the leading cause of blindness worldwide [2], with an estimated 111.8 million cases by 2040 [3]. Early diagnosis of glaucoma is essential for timely treatment to avoid irreversible vision loss [2]. Currently, no single test can accurately confirm a glaucoma diagnosis, so the procedure involves many laborious tests such as pachymetry (to measure the thickness of the cornea), tonometry (to assess the intraocular pressure), visual field tests, and a subjective examination and interpretation of optical features by different experts, who often disagree [4]. In this context, image-analysis techniques such as fundus imaging and optical coherence tomography (OCT) have become very important for the diagnosis and management of this degenerative disease. In particular, OCT [5] is a non-contact and non-invasive imaging modality able to quantify several retinal structures by generating high-resolution 2D and 3D images of the retina. Ophthalmologists usually use 2D-OCT images centred on the optic disc to analyse structural changes in the retinal nerve fibre layer (RNFL) and in the ganglion cell inner plexiform layer (GCIPL). Both structures are reported as useful biomarkers of glaucoma progression [6]. Fundus image analysis, on the other hand, is regarded as a highly cost-effective technique that has reported promising results in the detection of several eye diseases [7], [8], [9]. However, although fundus imaging is cheaper, OCT is the quintessential imaging technique for evaluating glaucomatous damage [10]. This is because fundus photography is colour-dependent on the training data set and its interpretation remains subjective [11], [12], whereas OCT can provide reproducible and objective measurements of optic nerve head (ONH) and RNFL thickness [13]. Besides, glaucoma is evident in the deterioration of the cell layer around the optic disc, which is very hard to distinguish in the 2D projection of fundus images. Therefore, since OCT allows focusing on the depth axis to identify structural retinal changes, glaucoma can be detected more easily via OCT than via fundus images. Furthermore, OCT systems can provide high-resolution three-dimensional images of the macula and ONH in the spectral domain (SD), which emerges as a powerful tool for detecting glaucoma [10]. However, because around 30 million OCT scans are acquired each year, experts rarely scroll through the entire cube, as doing so entails a workload that is difficult to manage [14]. For this reason, in this paper, we propose a volume-based predictive model to demonstrate the added value that SD-OCT volumes can provide for glaucoma diagnosis.

Many state-of-the-art studies, focusing on OCT techniques, have been proposed to address the automatic detection of glaucoma with the aim of reducing the workload and the rate of discordance between experts.

Hand-driven learning on 2D-OCT projection. Most glaucoma diagnosis studies use 2D-OCT scans centred on the optic disc, also known as circumpapillary images, because of their known diagnostic potential [15]. To the best of the authors’ knowledge, all circumpapillary-based studies for glaucoma detection applied hand-driven learning methods, such as [16], [17], which required hand-crafted encoding phases before the classification stage, e.g. segmentation of regions of interest and hand-crafted feature extraction.

Deep learning on 2D-OCT projection. Another way to address glaucoma identification from circumpapillary images is deep learning, which allows operating directly on the 2D-OCT scans without defining biomarkers beforehand, as we did in our previous study [18]. However, all the studies found in the literature that apply deep-learning techniques to 2D scans were based on fundus images [9], [19] or on RNFL probability maps [20], [21] combining fundus images and OCT B-scans; no previous studies relied on circumpapillary images alone. This could be explained by the fact that researchers focused their efforts on identifying useful patterns (e.g. RNFL and GCIPL) capable of providing a tangible interpretation for clinicians. For the same reason, many other studies were carried out for the sole purpose of segmenting the retinal layers of interest [22], [23].

Going deeper into glaucoma detection, the real challenge today lies in analysing the untapped potential of 3D-OCT scans, since specialists postulate that SD-OCT volumes contain key knowledge that is currently not exploited because of the large workload involved. Therefore, we propose here a clinical decision support system based only on the analysis of ONH-centred cubes to demonstrate the importance of 3D cross-sectional information for glaucoma diagnosis.

Hand-driven learning on 3D-OCT approach. Similarly to the 2D approach, some studies in the literature applied hand-crafted algorithms to 3D scans to discriminate glaucoma [24], [25], [26]. In particular, both [24] and [25] manually extracted features related to the RNFL and the optic nerve throughout the cube. The authors proposed a similar methodology in both studies, but tested the models on different databases. In [24], the best AUC reported was 0.877 using a random forest classifier on a database composed of 46 healthy and 57 glaucomatous patients, whereas in [25] the same researchers reported an AUC of 0.818 by applying bagging methods on a database of 48 healthy and 62 glaucomatous patients. Another creative approach was proposed in [26], where the authors used a superpixel segmentation technique before the feature extraction stage. They combined the features extracted from the superpixel maps with other common RNFL measurements to feed an adaptive boosting classifier. The researchers obtained an AUC of 0.855 on a database of 44 healthy and 89 glaucomatous eyes.

Deep learning on 3D-OCT approach. The use of deep-learning methods for glaucoma detection from SD-OCT volumes has increased in recent times. In fact, most studies have been published during the last two years, which reflects the current interest in OCT volumes for glaucoma diagnosis [27], [28], [29]. A research group from Hong Kong deserves special mention because most of the contributions in this field come from their work. In particular, they carried out two closely related studies, [28] and [29], to detect glaucoma by means of 3D convolutional neural networks (3D-CNNs). The main differences between them lay in the database and the inclusion/exclusion criteria, as the authors concluded in [28]. Noury et al. [28] used a private database composed of 316 glaucomatous and 247 healthy eyes from people of different ethnicities. They developed an end-to-end classification model based on the network proposed in [30]. The researchers achieved an AUC of 0.8883 on the primary test set, and this value was lower on external data sets. In contrast, the authors in [29] applied similar techniques to a homogeneous database composed only of Chinese Asian people, specifically 2926 glaucomatous and 1961 healthy eyes. The work demonstrated good performance, with an AUC of 0.969, a sensitivity of 0.89, a specificity of 0.96 and an accuracy of 0.91 on the primary data set. However, the results fell when the researchers assessed their network on an external database from Stanford, reaching an AUC, sensitivity, specificity and accuracy of 0.893, 0.78, 0.79 and 0.80, respectively. More recent works from the same authors [31], [32] used a multi-output architecture that includes other well-known measures for glaucoma diagnosis, such as Visual Field Index (VFI), Mean Deviation (MD) and Pattern Standard Deviation (PSD). Specifically, one branch of the network was responsible for classifying normal and glaucomatous cases, whereas the other branch performed regression to predict the VFI, MD and PSD values. In this way, the VFI, MD and PSD metrics informed the model during the backward propagation step, so that the weights were updated in each epoch taking into account parameters associated with glaucoma. However, these last two studies are not comparable with our work because they used additional information besides the raw OCT volumes, unlike the works [28], [29] by the same research group. Another interesting study was carried out by an IBM team in [27], where the authors compared hand-driven and data-learning approaches. They proposed a 3D-CNN architecture trained from scratch and achieved an AUC of 0.94 on the test set. However, it should be noted that these experiments were performed on a significantly imbalanced database, whose test set was composed of 17 healthy and 93 glaucomatous patients.
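
The figures quoted above (AUC, sensitivity, specificity and accuracy) and the caveat about class imbalance can be made concrete with a short sketch. The Python helpers below are illustrative only and are not taken from any of the cited works; `binary_metrics` and `auc` are hypothetical names. Using only the class counts reported for the test set of [27] (17 healthy, 93 glaucomatous), a classifier that labels every volume as glaucomatous already reaches an accuracy of 93/110 ≈ 0.85 with zero specificity, whereas an uninformative constant score yields an AUC of 0.5.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy; label 1 = glaucoma, 0 = healthy."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Class balance of the test set described for [27]: 17 healthy, 93 glaucomatous.
y_true = [0] * 17 + [1] * 93

# A trivial classifier that calls every volume glaucomatous.
always_glaucoma = [1] * len(y_true)
trivial = binary_metrics(y_true, always_glaucoma)

# A constant score carries no ranking information.
constant_scores_auc = auc(y_true, [0.5] * len(y_true))

print(trivial, constant_scores_auc)
```

This is why threshold-independent measures such as AUC, reported together with sensitivity and specificity, are more informative than accuracy alone when the test set is heavily skewed towards one class.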

In this context, other works are worth mentioning because they also applied deep-learning techniques to SD-OCT volumes, although for other purposes. For example, in [33] the researchers from Hong Kong developed a deep-learning algorithm for discriminating ungradable OCT optic disc scans. Meanwhile, the authors in [14] implemented deep-learning techniques to detect specific Age-Related Macular Degeneration (AMD) patterns in the B-scans of the three-dimensional cubes. Also, De Fauw et al. [34] applied artificial-intelligence algorithms to OCT volumes to diagnose several retinal injuries via tissue segmentation.

This paper documents several key contributions concerning glaucoma detection from SD-OCT volumes. Unlike the previous studies that addressed the problem using 3D CNNs, we present a new approach characterised by extracting features from the B-scans with an innovative 2D-CNN and preserving the feature dependencies embedded in the latent space by means of LSTM networks [35] along with an additional proposed module. The combination of CNNs and LSTM networks has been applied successfully in recent studies to identify pathological biomarkers associated with AMD and diabetic macular edema (DME) [14], as well as to predict the progression of ophthalmic diseases from different slit-lamp images [36]. However, to the best of the authors’ knowledge, we are the first to suggest the use of CNN-LSTM for glaucoma detection, by treating each spatial slide of the volume as a temporal instance. As a novelty, for the feature-extraction stage, we propose a new slide-level discriminator based on a pre-trained 2D-CNN model able to discern between healthy and glaucomatous cases just from raw circumpapillary OCT images. The proposed 2D-CNN feature extractor is composed of a novel combination of pre-trained convolutional blocks in parallel with residual modules trained from scratch. Additionally, an attention block was included via skip-connection to focus on local glaucoma-related areas during the training phase. Moreover, we propose an innovative way of encoding the LSTM outputs by implementing a sequential-weighting module (SWM) before the final classification stage. The flowchart of the designed end-to-end system is shown in Fig. 1, where we represent how the pre-trained circumpapillary base model extracts the features from the SD-OCT slides and how the three-dimensional information is analysed by LSTM networks to finally predict the class of each specific ONH-centred cube.
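
The idea of treating each slide as one step of a sequence can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the per-slide features are random stand-ins for the 2D-CNN embeddings, all sizes and weights are hypothetical and untrained, and, because this excerpt does not specify the internals of the SWM, the softmax-weighted sum in `sequential_weighting` is only an assumed, attention-style stand-in for it.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_forward(seq, W, U, b):
    """Run a single-layer LSTM over seq of shape (T, D).

    W: (4H, D), U: (4H, H), b: (4H,), with the gates stacked in the
    order [input, forget, candidate, output]. Returns all hidden
    states, shape (T, H), one per slide.
    """
    T = seq.shape[0]
    H = U.shape[1]
    h = np.zeros(H)
    c = np.zeros(H)
    outputs = np.zeros((T, H))
    for t in range(T):
        z = W @ seq[t] + U @ h + b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # cell state update
        h = sigmoid(o) * np.tanh(c)                    # hidden state
        outputs[t] = h
    return outputs


def sequential_weighting(outputs, v):
    """ASSUMPTION: softmax scores over slides, then a weighted sum of the
    LSTM outputs. This is a stand-in, not the SWM defined by the paper."""
    scores = outputs @ v                    # one score per slide, shape (T,)
    scores = scores - scores.max()          # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ outputs, weights       # shapes (H,) and (T,)


rng = np.random.default_rng(0)
T, D, H = 8, 16, 4   # hypothetical: 8 B-scans, 16 features per slide, 4 hidden units

slice_features = rng.normal(size=(T, D))        # stand-in for 2D-CNN embeddings
W = rng.normal(scale=0.1, size=(4 * H, D))      # random, untrained parameters
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)
v = rng.normal(size=H)                          # slide-scoring vector (assumed)
w_out = rng.normal(size=H)                      # top-model weights (assumed)

hidden = lstm_forward(slice_features, W, U, b)
volume_vector, slice_weights = sequential_weighting(hidden, v)
p_glaucoma = float(sigmoid(w_out @ volume_vector))

print(hidden.shape, volume_vector.shape, round(p_glaucoma, 3))
```

The point of the sketch is the data flow: one feature vector per slide enters the recurrent layer in spatial order, every hidden state is retained, and a single volume-level vector is formed from all of them before the final classifier.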

In the recent study [27], the authors claimed that they used 3D convolutions in order to compute 3D Class Activation Maps (CAMs), because otherwise the resulting CAM would be 2D and the depth information would be lost. Contrary to the statement in [27], our LSTM-based model is capable of leveraging the spatial dependencies extracted from the SD-OCT slides to compute the 2D-CAMs sequentially. We thereby enable an interpretation of SD-OCT volumes based not only on identifying the regions of interest (ROIs) of each slide, but also on identifying the most relevant B-scans of the volume for glaucoma classification. At this point, it is important to note that we also replicate several architectures proposed in the literature to make a direct comparison between different methods. In particular, we test on our database the models of the state-of-the-art studies intended for glaucoma detection just from SD-OCT volumes, i.e. the best-performing work from the Hong Kong and Stanford collaboration [29], and the work carried out by the IBM research group [27].
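
As a rough illustration of slide-wise 2D CAMs, the sketch below uses the standard CAM formulation, in which the map for a class is the sum over channels of each feature map multiplied by that class's weight. It is a generic example and not necessarily the procedure followed in this work: the activations and class weights are random stand-ins, the function names and array sizes are hypothetical, clipping at zero and scaling to [0, 1] are display choices, and ranking slides by the mean of the unnormalised map is an assumed heuristic.

```python
import numpy as np


def raw_cam(feature_maps, class_weights):
    """Unnormalised CAM for one slide.

    feature_maps: (K, H, W) activations of the last convolutional layer.
    class_weights: (K,) weights linking each channel to the target class.
    Returns the (H, W) map sum_k w_k * f_k.
    """
    return np.tensordot(class_weights, feature_maps, axes=([0], [0]))


def normalised_cam(feature_maps, class_weights):
    """CAM clipped at zero and scaled to [0, 1] for display as a heat map."""
    cam = np.maximum(raw_cam(feature_maps, class_weights), 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam


rng = np.random.default_rng(0)
n_slices, K, Hh, Ww = 5, 6, 4, 4      # hypothetical sizes

# Stand-ins for per-slide activations and for the glaucoma-class weights.
volume_features = rng.random(size=(n_slices, K, Hh, Ww))
class_weights = rng.random(size=K)

# One 2D heat map per B-scan, computed slide by slide.
cams = np.stack([normalised_cam(f, class_weights) for f in volume_features])

# Assumed heuristic: rank slides by the mean of their unnormalised CAM.
slice_relevance = np.array(
    [raw_cam(f, class_weights).mean() for f in volume_features]
)
most_relevant_slice = int(slice_relevance.argmax())

print(cams.shape, most_relevant_slice)
```

Computing the maps per slide keeps each heat map aligned with its own B-scan, while a per-slide relevance score gives a simple way to order the B-scans of a volume.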

Section snippets

Material

Three different and independent databases were employed to accomplish this study, as indicated in Table 1. Two of them are related to circumpapillary OCT images and they were used to train and validate the proposed slide-level feature extractor. The third database is composed of the SD-OCT volumes from which we built the predictive models for glaucoma detection. Both the circumpapillary and SD-OCT volumes databases are centred around the optic nerve head (ONH) of the retina to extract the

Slide-level feature extractor design

The objective in this stage is to build a 2D-CNN architecture able to extract discriminatory features from the slides of the SD-OCT volumes. So, in our previous work [18], we carried out a validation of different architectures making use of the raw circumpapillary OCT samples. Specifically, the most common state-of-the-art architectures, as well as other CNNs trained from scratch, were considered. As detailed in [18], we proposed shallow networks from scratch due to the small amount of data,

Results

In this stage, we describe separately the experiments carried out to develop the 2D CNN-based feature extractor, and those performed to achieve the volume-based predictive model. For both stages, we detail three well-differentiated sections: data partitioning, validation phase and prediction stage. Additionally, in the case of the slide-level discriminator, we also report the results from an external validation to demonstrate that the proposed feature extractor can generalise to other databases

Discussion about the feature extractor

In contrast to the state-of-the-art studies, which performed the glaucoma detection from SD-OCT volumes through 3D architectures, in this paper, we propose a new way of addressing this task by using the spatial dependencies between 2D images, instead of operating in the three-dimensional space. Thereby, we have developed a new slide-level discriminator able to extract the features from the slides of the SD-OCT volumes. At this point, it should be remarked the importance of using pre-trained

Conclusion

In this paper, we have proposed an artificial-intelligence predictive model based on a new deep-learning strategy to address the glaucoma detection just from raw SD-OCT volumes. Specifically, the proposed model consists of a novel combination of CNN and LSTM networks that allows taking into account spatial dependencies between the B-scans of the volumes. For the first time, we have combined fine-tuning techniques with other convolutional blocks in parallel to build a slide-level feature

Funding

This work has been funded by GALAHAD project [H2020-ICT-2016-2017, 732613], SICAP project (DPI2016-77869-C2-1-R) and GVA through project PROMETEO/2019/109. The work of Gabriel García has been supported by the State Research Spanish Agency PTA2017-14610-I.

Declaration of Competing Interest

Manuscript title: Glaucoma Detection on Raw SD-OCT Volumes: a Novel Approach Focused on Spatial Dependencies.

The authors whose names are listed immediately below certify that they have NO affiliations with or involvement in any organization or entity with any financial interest (such as honoraria, educational grants; participation in speakers’ bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or

Acknowledgments

The authors gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan V GPU used here.

References (44)

  • G.A.U. National. Glaucoma: diagnosis and management (2017)

  • D. Huang et al. Optical coherence tomography. Science (1991)

  • F.A. Medeiros et al. Detection of glaucoma progression with Stratus OCT retinal nerve fiber layer, optic nerve head, and macular thickness measurements. Investigative Ophthalmology & Visual Science (2009)

  • C. Sinthanayothin et al. Automated detection of diabetic retinopathy on digital fundus images. Diabetic Medicine (2002)

  • A. Diaz-Pinto et al. Retinal image synthesis and semi-supervised learning for glaucoma assessment. IEEE Transactions on Medical Imaging (2019)

  • I.I. Bussel et al. OCT for glaucoma diagnosis, screening and detection of glaucoma progression. British Journal of Ophthalmology (2014)

  • P.R. Lichter. Variability of expert observers in evaluating the optic disc. Transactions of the American Ophthalmological Society (1976)

  • T. Kurmann et al. Fused detection of retinal biomarkers in OCT volumes. International Conference on Medical Image Computing and Computer-Assisted Intervention (2019)

  • D.C. Hood et al. On improving the use of OCT imaging for detecting glaucomatous damage. British Journal of Ophthalmology (2014)

  • D. Bizios et al. Machine learning classifiers for glaucoma diagnosis based on classification of retinal nerve fibre layer thickness parameters measured by Stratus OCT. Acta Ophthalmologica (2010)

  • S.J. Kim et al. Development of machine learning models for diagnosis of glaucoma. PLoS One (2017)

  • G. García et al. Glaucoma detection from raw circumpapillary OCT images using fully convolutional neural networks. 2020 IEEE International Conference on Image Processing (ICIP) (2020)