Systematic Review

Data Augmentation with Generative Methods for Inherited Retinal Diseases: A Systematic Review

1 Department of Sciences and Technologies, Universidade Aberta, 1269-001 Lisboa, Portugal
2 School of Sciences and Technologies, Universidade de Trás-os-Montes e Alto Douro, 5000-801 Vila Real, Portugal
3 Department of Ophthalmology, Unidade Local de Saúde de Santo António, 4099-001 Porto, Portugal
4 Instituto de Ciências Biomédicas Abel Salazar (ICBAS), University of Porto, 4050-313 Porto, Portugal
5 Stirling College, Chengdu University, Chengdu 610106, China
6 ALGORITMI Research Centre, University of Minho, 4800-058 Guimarães, Portugal
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(6), 3084; https://doi.org/10.3390/app15063084
Submission received: 24 February 2025 / Revised: 5 March 2025 / Accepted: 11 March 2025 / Published: 12 March 2025

Abstract

Inherited retinal diseases (IRDs) are rare and genetically diverse disorders that cause progressive vision loss and affect about 1 in 3000 individuals worldwide. Their rarity and genetic variability pose a challenge for deep learning models due to the limited amount of available data. Generative models offer a promising solution by creating synthetic data to enlarge training datasets. This study carried out a systematic literature review to investigate the use of generative models for data augmentation in IRDs and to assess their impact on the performance of classifiers for these diseases. Following the PRISMA 2020 guidelines, searches in four databases identified 32 relevant studies, 2 focused on IRDs and the rest on other retinal diseases. The results indicate that generative models effectively augment small datasets. Among the techniques identified, Deep Convolutional Generative Adversarial Networks (DCGAN) and StyleGAN2, the second generation of the style-based GAN generator architecture, were the most widely used. These architectures generated highly realistic and diverse synthetic data, often indistinguishable from real data even for experts. The results highlight the need for more research into data generation for IRDs, both to develop robust diagnostic tools and to improve genetic studies by creating more comprehensive genetic repositories.

1. Introduction

Inherited retinal diseases (IRDs) are a genetically heterogeneous group of rare diseases that lead to loss of vision through progressive degeneration of the retina. This loss can result from dysfunction or degeneration of the retinal photoreceptors or the retinal pigment epithelium [1]. IRDs are mostly caused by Mendelian mutations in 1 of at least 300 different genes. The clinical symptoms associated with these diseases vary not only with the different subtypes of IRDs but also with the different disease genes [2]. IRDs have a prevalence of approximately 1 in 3000 individuals, affecting more than 2 million people worldwide [3,4]. They have a profound impact not only on patients but also on society, as they are the most frequent hereditary form of human visual impairment [5].
Artificial intelligence (AI) based on deep learning has shown great potential in various medical fields, especially for well-defined clinical tasks where imaging data, such as fundus autofluorescence (FAF) and optical coherence tomography (OCT), contain most of the information relevant to the specific analysis [6].
Like radiology, pathology, and dermatology, ophthalmology relies heavily on diagnostic imaging, which makes it one of the most prominent applications of AI in healthcare [7]. AI helps manage the complexity of 21st-century ophthalmology: efficient algorithms can not only detect pathologies but also, by “learning” characteristics from large volumes of image data, help reduce diagnostic and therapeutic errors and promote personalized medicine. In addition, AI can recognize specific disease patterns and yield innovative scientific insights by correlating new characteristics [6].
The use of deep learning in IRDs could be a valuable support tool for healthcare professionals, helping not only to detect these pathologies, but also to classify the gene responsible for them. This could accelerate diagnosis and enable the application of the appropriate treatment for each specific gene.
Despite the great advances that artificial intelligence has made in medicine and, consequently, in ophthalmology, numerous authors working on IRDs have described the same difficulty: while developing models for classifying and detecting these pathologies, they faced the limitation of small datasets for training deep learning models [8,9].
So, despite the success of AI in other retinal diseases, it has not yet been possible to exploit its full potential in IRDs. This is due to the scarcity of data on these diseases for training deep learning models, and this limitation affects the effectiveness and reliability of classifiers for these diseases, as identified by some authors [8,9]. Deep learning models need large amounts of data to train effectively and to generalize well. In addition to the scarcity of training data, efficiently labeling data in the medical field is expensive, as it requires years of specialized training [10].
One way of mitigating this limitation is traditional data augmentation. These techniques are already used to balance and augment training datasets, relying on resampling approaches and random image transformations such as rotation and blurring. However, they have been shown to have only a moderate impact on model generalization [11]. Other data augmentation techniques enlarge the training dataset by generating new synthetic data for under-represented classes using generative models, for example, generative adversarial networks (GANs), a special type of deep learning model [12,13].
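To make the contrast concrete, traditional augmentation of a retinal image reduces to simple, label-preserving array transforms. The sketch below is illustrative and not taken from any reviewed study; it uses only NumPy, a random array as a stand-in for a grayscale FAF image, and an arbitrary 3 × 3 mean blur:

```python
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply simple, label-preserving transforms: flip, rotation, blur."""
    out = image
    if rng.random() < 0.5:                      # random horizontal flip
        out = out[:, ::-1]
    out = np.rot90(out, k=rng.integers(0, 4))   # random 90-degree rotation
    if rng.random() < 0.5:                      # simple 3x3 mean blur
        padded = np.pad(out, 1, mode="edge")
        out = sum(padded[i:i + out.shape[0], j:j + out.shape[1]]
                  for i in range(3) for j in range(3)) / 9.0
    return out

rng = np.random.default_rng(0)
fundus = rng.random((64, 64))                   # stand-in for a FAF image
batch = [augment(fundus, rng) for _ in range(8)]
```

Each output is a plausible variant of the same underlying image, which is exactly why such transforms add only moderate diversity compared with generative sampling.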
Although these generative techniques are still rarely explored in the context of IRDs, especially using FAF examinations [14], they have already been explored in other retinal pathologies. Studies in these areas demonstrate the effectiveness of these approaches in generating synthetic data and using them to train deep learning models [15,16]. These studies show that using synthetic data to train models on small datasets generally improves their accuracy.
Deep generative models (DGMs) have gained significant attention in artificial intelligence due to their ability to approximate high-dimensional complex probability distributions and generate realistic synthetic data. These models—including generative adversarial networks (GANs), variational autoencoders (VAEs), and normalizing flows—have been successfully applied in various domains, such as image synthesis, speech generation, and deepfake technology [17]. Despite these advancements, challenges remain in optimizing their design and training for specific applications [17].
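The adversarial principle behind GANs [12,13,17] can be summarized by the original minimax objective of Goodfellow et al., in which a generator G and a discriminator D are trained against each other:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] +
  \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]
```

D is trained to distinguish real samples x from generated samples G(z), while G is trained to fool D; at the theoretical optimum, the distribution of G's samples matches the data distribution, which is what makes GAN outputs useful as synthetic training data.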
In the medical field, particularly in ophthalmology, DGMs have shown great potential for generating synthetic retinal images. These images can be used to augment small datasets, addressing data scarcity and enhancing deep learning models for detecting IRDs.
Despite promising results in retinal diseases, the evaluation of synthetic data remains a significant challenge. Most medical and GAN-based studies use a combination of metrics: qualitative metrics, which involve visual examination and rely on feedback or description [18,19] but can be inaccurate because they depend on the experience of the evaluator; and quantitative metrics, which evaluate images in terms of visual fidelity, diversity, and generalizability [14].
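As an illustration of the quantitative side, one widely used fidelity metric is SSIM [55]. The sketch below is a simplified global variant computed over the whole image rather than the standard sliding window (that simplification is an assumption made here for brevity); the images are random stand-ins:

```python
import numpy as np

def global_ssim(x: np.ndarray, y: np.ndarray, data_range: float = 1.0) -> float:
    """Simplified SSIM over whole images (no sliding window), after Wang et al."""
    c1 = (0.01 * data_range) ** 2        # stabilizing constants from the paper
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()          # luminance terms
    vx, vy = x.var(), y.var()            # contrast terms
    cov = ((x - mx) * (y - my)).mean()   # structure term
    return float(((2 * mx * my + c1) * (2 * cov + c2)) /
                 ((mx**2 + my**2 + c1) * (vx + vy + c2)))

rng = np.random.default_rng(1)
real = rng.random((64, 64))
noisy = np.clip(real + 0.1 * rng.standard_normal(real.shape), 0, 1)
score = global_ssim(real, noisy)         # degrades below 1.0 as noise grows
```

Identical images score exactly 1.0, and the score falls as structural similarity degrades, which is why SSIM is often reported alongside distribution-level metrics such as FID.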
The purpose of this literature review is to investigate the use of generative models in IRDs to generate synthetic fundus autofluorescence images for these diseases. We intend to identify which techniques and architectures are most used in other retinal diseases, as well as which metrics are used to evaluate the results, so that they can be applied to IRDs in the future. Finally, we also review the literature for cases in which the use of synthetic data, particularly FAF, in small training datasets has improved classifier efficiency.

2. Materials and Methods

This systematic review was conducted following the PRISMA guidelines [20]. Exploratory research was conducted on PROSPERO [21] to see whether any existing systematic review answered our research question. PROSPERO is a database created by the Centre for Reviews and Dissemination (CRD) and funded by the National Institute for Health and Care Research (NIHR). It currently holds more than 325,000 reviews on health-related outcomes and aims to promote transparency and open science by providing easy access to essential information about reviews that are planned, underway, or completed, thereby helping to avoid unintentional duplication and research waste. This search found no studies covering the topic of our systematic review. However, the review protocol was not registered in PROSPERO due to time constraints, as drawing up the detailed protocol and waiting for approval make registration a time-consuming process.

2.1. Related Work

With the growth of artificial intelligence, some studies are beginning to emerge involving deep learning techniques in the context of IRDs, including classifiers to detect these diseases from ophthalmic examinations [22]. However, the use of deep learning as a diagnostic aid for IRDs is still an under-explored area. One reason is the small size of the datasets available to train deep learning classifiers efficiently, which is the main limitation reported by some authors in their studies [8,9,23].
In other areas of retinal research, studies involving generative models for creating synthetic eye exams to enhance small datasets have been considered a viable solution to overcome this limitation identified by some authors [8,9,23,24]. A preliminary search was conducted to determine what had already been studied in this area within the context of IRDs. However, no systematic reviews were found, and only two studies specifically addressing the use of these techniques in IRDs were identified. Thus, the application of synthetic data to augment small datasets and train classifiers has not yet been widely explored for these diseases.
Given these results, and the great scarcity of studies already observed, it was necessary to broaden the scope of this systematic review to retinal diseases in general, not just IRDs.
These findings show that there is great, largely unexplored potential for research on generative models for data augmentation in IRDs, and that much work in this direction remains to be done.
The lack of systematic reviews on this subject reinforces the need for a review that collects the existing knowledge on the topic. By consolidating research on generative models in retinal diseases and evaluating their potential application in IRDs, this study constitutes a reference for future research and clinical applications. It allows future researchers to know what has already been done and to apply these techniques in their own studies, encouraging work that uses generative models to generate synthetic data and augment small training datasets in IRDs. The aim, ultimately, is to remove the limitation mentioned in the literature, the scarcity of data for training classifiers, and thereby improve their efficiency [8,9].

2.2. Search Question

This systematic review aimed to address the following research questions:
  • Can generative methods be used to generate ocular fundus autofluorescence data in inherited retinal diseases?
  • Which generative models are most efficiently used to generate synthetic ocular fundus autofluorescence data in inherited retinal diseases?
  • How should synthetic ocular fundus autofluorescence data generated by generative models be evaluated, so that it can be used to train classifiers in inherited retinal diseases?
In this way, we can understand what has already been done in the field of generative models for data augmentation in the context of IRDs and how future contributions can be made to the subject. However, due to the lack of studies specifically focused on inherited retinal diseases in this context, it was necessary to broaden the scope to include retinal diseases in general. This broader approach allowed us to find more studies relevant to our objective, although they address other retinal conditions.

2.3. Search Strategy

The search was conducted on 1 October 2024, which was the cut-off date for the inclusion of studies. No systematic reviews were found that met our objective. The databases included in this study, and the reasons they were chosen, can be seen in Table 1. They were selected for their complementary characteristics, to guarantee balanced coverage of the themes of this review.
In the searches, controlled terms (Medical Subject Headings—MeSH terms) and uncontrolled terms (free-text words) were combined with Boolean operators to create a comprehensive and effective search. The MeSH terms used were “Retinal Diseases” (MeSH) and “Retinal Degeneration” (MeSH); the remaining terms were free-text expressions. The full search strings used in this study can be found in Appendix B, Table A2, which lists all the steps of the search.
Initially, the search was limited to IRDs, but as it returned no results, the context was broadened to include all retinal diseases, after which relevant literature emerged for analysis.
To ensure methodological rigor, we adopted the PICo method for systematic reviews (Population, Phenomena of Interest, Context), proposed by the Joanna Briggs Institute (JBI) manual [25].
The inclusion and exclusion criteria for this study are detailed in Table 2 and Table 3. One of the inclusion criteria is articles published from 2019 onward, as no relevant studies were found before this period.

2.4. Data Extraction and Synthesis

The studies retrieved by the search were imported into Rayyan (https://www.rayyan.ai/), a web platform that supports systematic reviews. Using this software, duplicate articles were identified and removed. Next, titles and abstracts were carefully analyzed against the inclusion and exclusion criteria, allowing articles to be selected for the next stage. Then, the articles selected on the basis of title and abstract were read in full to determine whether they would be included in this study.
To keep the selection rigorous and faithful to the criteria, these steps were carried out by two researchers, who then checked and compared their selections. Any disagreements encountered during the selection process were analyzed and discussed with the working group in order to reach a better decision.
After selecting the articles, we extracted the information essential to this systematic review. No dedicated software was used for this process; Microsoft Excel (version 2501, Build 16.0.18429.20132) was used to organize the extracted content into tables. The following items were defined as important: year of publication, architectures used, evaluation metrics, disease to which the study applies, datasets used, and main results (how the architectures performed, described qualitatively by the authors rather than as numerical scores). These tables are presented in a reduced version in Appendix C, Table A3.

3. Results

This section will present the research results, the excluded articles, and the reasons for their exclusion (as shown in Figure 1), as well as the architectures, metrics, and main results reported in the included studies.

Characteristics of the Included Studies

Thirty-two articles published between 2019 and 2024 were obtained as material for this systematic review. The publications are distributed across these years, with the number of publications increasing over time, as can be seen in Table 4.
Table 5 shows the architectures covered in the studies analyzed. We identified 20 different generative model architectures across the 32 articles. Some publications use more than one architecture, and some architectures appear in more than one article.
Based on the studies analyzed, we assessed the strengths and weaknesses of the most frequently used architectures in this review. We selected base architectures and examined articles that implemented these architectures, where the authors discussed their advantages and limitations in generating data for various retinal diseases. The results of this analysis are summarized in Appendix A Table A1. Other architectures were not included in the table, as they are modifications or enhancements of the base architectures already listed.
These studies also provided insights into the metrics used to evaluate the results obtained. Many of these metrics are commonly employed across various studies due to their specific relevance to this type of analysis. A comprehensive list of these metrics is presented in Table 6.
The studies were categorized into four different contexts: two groups covering generic disease categories—Hereditary Retinal Diseases and Retinopathy—where the specific disease was not specified; studies focused on six specific diseases (e.g., Diabetic Retinopathy, Retinopathy of Prematurity); studies centered on retinal disease indicators (e.g., Hard Exudates), which signal the presence of an underlying condition; and studies on retinal blood vessel analysis, which, while not a disease itself, is a key retinal characteristic. This distribution is presented in Table 7.

4. Discussion

This study found some important points about the work being carried out in the field of Artificial Intelligence in the context of inherited retinal diseases, in particular, data augmentation using generative methods.
A search of the literature revealed that although there is a great shortage of data in IRDs as a group [8,9], there are few studies addressing these techniques. As of the date of this review, only two studies on IRDs using generative methods for data augmentation had been found.
In the first of these two studies, Yoo et al. [26] were able to generate new images for unbalanced classes, augmenting the data from a few real images. Using the synthetic images alongside the real ones in the training set improved classifier performance. Similarly, Veturi et al. [14] managed to generate images with such a high level of realism that clinical experts in ophthalmology misclassified some synthetic images as real and vice versa. Using these data for training did not worsen the classifier’s performance, which remained comparable to training on real images alone.
In Age-related Macular Degeneration, Oliveira et al. [27] (StyleGAN2-ADA), Wang et al. [28] (StyleGAN2), Kim et al. [29] (StyleGAN), and P. M. Burlina et al. [30] (ProGAN) were able to generate high-resolution synthetic images, similar to real ones, from just a few images. These new data made it difficult for ophthalmologists to distinguish between real and synthetic images. He et al. [31], using an architecture composed of DCGAN and WGAN-GP, generated high-quality images and, like Oliveira et al. [27], found that using the generated data to train classifiers improved their accuracy.
In the case of Geographic Atrophy, which is a subtype of Age-related Macular Degeneration, Wu et al. [32] used a VR-GAN architecture that generated high-quality images, some of which were better than real images.
In Crystalline Retinopathy disease, Young Choi et al. [33], using CycleGAN architecture, managed to generate realistic images of the pathology. Using these data in training as data augmentation improved the model’s accuracy.
By using the RV-GAN architecture for degenerative retinal diseases, Kamran et al. [34] were able to generate new, higher quality data for monitoring and tracking this type of disease.
Diabetic Retinopathy is the disease with the most studies and the widest range of architectures. P. Burlina et al. [35] (StyleGAN), Magister and Arandjelovic [36] (WGAN), and Kabilan et al. [37] (DCGAN) were able to generate realistic, varied images for data augmentation and for balancing classes in the datasets. Shoaib et al. [38] (DiaGAN), Y. Chen et al. [15] (RF-GAN), J. Liu et al. [39] (VSG-GAN), and Zhou et al. [40] (DR-GAN) generated realistic, high-fidelity images; using these images to train classifiers improved their performance and generalization capacity.
Using VAE in Diabetic Macular Edema disease, which is a more advanced manifestation of Diabetic Retinopathy, Tajmirriahi et al. [41] generated more realistic and high-resolution images than other GAN architectures. This made it possible to increase the dataset with the images generated, for training the classifier, and its efficiency improved. On the other hand, Tripathi et al. [42] used StyleGAN2 architecture, which made it possible to generate realistic images that were very similar to real images.
In Epiretinal Membrane disease, Choi et al. [43], using a StyleGAN2 architecture, were able to generate more realistic images with better quality than other methods. Using these generated data to train the model also improved its performance in detecting this disease. Additionally, J. Guo et al. [44], using a joint WGAN and CGAN architecture, managed to generate high-quality images with great diversity to identify signs of hard exudates, which are retinal indicators of disease.
In retinal blood vessel analysis, Alsayat et al. [45] used an LDM to generate images with high quality and diversity in order to build a more comprehensive and diverse training set. Using a DCGAN architecture, HaoQi and Ogawara [46] were able to generate images that maintain the statistical distribution of the original dataset, obtaining better results than other methods (Tub-GAN, DGAN, MI-GAN).
In the retinopathy group, where no specific disease is identified, L. Liu et al. [47], using a CycleGAN architecture, were able to generate images with realistic features and better quality, and classifier accuracy improved when these data were used for training.
In Retinopathy of Prematurity, J. S. Chen et al. [48], using the Pix2Pix HD architecture, generated realistic images that made it difficult for experts to distinguish between real and synthetic images. Using an ROP-GAN, Hou et al. [16] managed to generate images for classes with little data and at various stages of the disease. Using these synthetic data to train the classifiers improved their accuracy.
In studies that do not specify a retinal disease but address the group of retinal diseases as a whole, Beji et al. [49] used a DCGAN architecture to generate high-quality images for segmentation problems; these synthetic data improved segmentation performance. Using a StyleGAN2-ADA architecture, Sun et al. [50] generated images to balance the dataset, and accuracy improved when the models were trained with the synthetic data and the balanced dataset. Using a DCGAN, Lei et al. [51] generated images with reasonable detail; using them for training improved the model’s performance by adding diversity and increasing its generalization capacity. Using CycleGAN, Li et al. [52] first generated low-quality images and then synthesized high-quality, high-resolution images from them. Additionally, Han et al. [53] generated OCT images that are very similar to real images; according to the metrics obtained, these images are comparable to real ones.
Most of the studies reviewed focus on the generation of synthetic FAF or OCT images based on the type of ophthalmic examination data used for training. Although the datasets (whether public or private) often come from several hospitals, few studies explicitly discuss the impact of domain shift, or the variability introduced by different imaging devices. This aspect was not widely explored in the studies reviewed, leaving open questions about how generative models deal with variations between datasets from different institutions and imaging modalities.
The analyzed studies used several metrics to evaluate their results, the most frequent being FID [54], ROC/AUC, ACC, SSIM [55], sensitivity, and specificity, as well as qualitative evaluations conducted by retinal specialists. Combining multiple metrics reflects a more robust and reliable approach. Qualitative assessment by retinal specialists is the most noteworthy because, in many cases, it is difficult to distinguish real images from synthetic ones; it inspires confidence because it is carried out by specialists in the field, but it can also introduce error because it depends heavily on the individual knowledge and experience of each specialist. A deep understanding of the metrics used to evaluate synthetic data is essential, both in terms of how they work and how their results are interpreted. Combining quantitative and qualitative metrics, including evaluation by retinal experts, can increase the reliability of the analyses, and understanding the criteria experts use to differentiate real from synthetic data can offer a more comprehensive perspective and strengthen the robustness of the conclusions.
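To make the most frequent of those metrics concrete: FID [54] is the Fréchet distance between Gaussian fits of real and synthetic feature embeddings. The sketch below assumes diagonal covariances for simplicity (real FID uses full covariances of Inception-v3 features), and the feature arrays are random stand-ins rather than actual embeddings:

```python
import numpy as np

def diagonal_fid(real_feats: np.ndarray, fake_feats: np.ndarray) -> float:
    """Frechet distance between two Gaussians with diagonal covariances.
    Inputs are (n_samples, n_features) feature embeddings; lower is better."""
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    var_r, var_f = real_feats.var(axis=0), fake_feats.var(axis=0)
    mean_term = np.sum((mu_r - mu_f) ** 2)                    # ||mu_r - mu_f||^2
    cov_term = np.sum((np.sqrt(var_r) - np.sqrt(var_f)) ** 2) # trace term, diagonal case
    return float(mean_term + cov_term)

rng = np.random.default_rng(2)
real = rng.standard_normal((500, 16))
close = real + 0.05 * rng.standard_normal((500, 16))   # good generator: low FID
far = 3.0 + rng.standard_normal((500, 16))             # shifted distribution: high FID
```

Because FID compares distributions rather than individual image pairs, it captures both fidelity and diversity, which is why it is so widely reported alongside per-image metrics such as SSIM.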
Although the metrics analyzed in the studies are widely used, they may not be sufficient for a complete assessment of the quality of synthetic images. The inclusion of additional metrics, such as IS and PSNR, could provide more information, despite the increased complexity of the analysis. Still, expert evaluation remains essential, especially considering that GANs are not yet fully optimized. While quantitative metrics offer objective data on the realism of images, complementary methods—such as perceptual analysis by experts or validation based on clinical tasks (e.g., impact on the performance of classifiers trained with synthetic data)—can increase confidence in the clinical applicability of these images.
The results of this systematic review reinforce the ability of generative models to create synthetic data with realistic characteristics, often indistinguishable from real images, as stated by several authors. In some cases, the synthetic data presented have an even higher resolution and degree of realism compared to the original data.
Notably, in evaluations conducted by ophthalmic specialists, synthetic and real data were frequently indistinguishable.
These results were consistent across various retinal diseases, including inherited retinal disorders, with no clear evidence suggesting that specific conditions yielded superior results. However, data availability remains a critical factor. More prevalent diseases, such as Diabetic Retinopathy, benefit from larger datasets, whereas rare IRDs often face challenges due to data scarcity. Despite this limitation, several studies have highlighted the ability of generative models to produce realistic, high-quality FAF images, even when trained on small datasets, reinforcing their potential to mitigate data scarcity in these conditions.
Furthermore, studies that integrated synthetic data with real data to train classifiers reported improved model accuracy and generalization capabilities. This enhancement is largely attributed to the increased diversity of training data, suggesting that synthetic FAF images could play a crucial role in optimizing deep learning models for diagnosing hereditary retinal diseases.
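The augmentation strategy those studies describe, mixing real and GAN-generated images in one training set, reduces in code terms to concatenating and shuffling the two sources. This sketch uses random arrays as stand-ins for images and labels; the sizes and the choice of class 1 as the under-represented class are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-ins: a small real dataset and synthetic samples for a rare class (label 1).
real_x = rng.random((100, 64, 64))
real_y = rng.integers(0, 2, 100)
synth_x = rng.random((40, 64, 64))                  # would come from a trained GAN
synth_y = np.ones(40, dtype=real_y.dtype)           # all minority-class labels

# Concatenate sources, then shuffle so batches mix real and synthetic samples.
train_x = np.concatenate([real_x, synth_x])
train_y = np.concatenate([real_y, synth_y])
perm = rng.permutation(len(train_x))
train_x, train_y = train_x[perm], train_y[perm]
```

The resulting set is both larger and better balanced, which is the mechanism behind the accuracy and generalization gains reported in the reviewed studies.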
An important aspect to consider when using generative models for data augmentation is their computational efficiency. Among the 32 studies analyzed, different GAN architectures were used, each with different computational requirements for generating synthetic retinal images. Some studies only describe the execution environment of the architectures, without reporting or comparing the computational cost involved. In general, more complex models, such as StyleGAN, usually require high-performance GPUs and long training times to produce high-fidelity images, while simpler architectures, such as DCGAN, may be more efficient computationally but potentially generate lower-quality samples. To apply these methods in practice, to generate synthetic data for training deep learning classifiers, the need for and quality of the data must be weighed against the resources available to generate them. It is therefore important to know the computational requirements of each model so that it can be used, in accordance with the available resources, to augment IRD data and, in turn, train the classifiers.
Based on this review, DCGAN and StyleGAN2 were the most widely used architectures, reflecting their popularity in generating synthetic data. The metrics most used to evaluate the architectures include FID, ROC/AUC, ACC, and SSIM. However, the results show significant variability between studies, indicating that the best architecture depends on the specific context of the application. Considering the cases analyzed, StyleGAN2 stands out as the most recommended option, especially in scenarios with limited training data, due to its ability to generate highly realistic samples. DCGAN remains a robust alternative due to its simplicity and training stability, which makes it a viable choice in applications that require less computational complexity. Thus, the selection of the most suitable architecture must take into account the volume of data available and the specific requirements of the study.
This study has limitations. One of the limitations identified in this systematic review is the small number of studies dealing with generative methods for data augmentation in IRDs. This scarcity makes it difficult to explore the potential of generative models for data augmentation in these diseases, which has a direct impact on the development of robust and reliable classifiers. The lack of studies involving these techniques for data augmentation contributes to keeping datasets small, with unbalanced classes that compromise the efficiency of classifiers and their ability to generalize.
As future work, continued research in this area has great potential for advances, not only in generative methods but also in IRDs, through the development of methods and tools that help professionals detect and classify these diseases efficiently and reliably. The techniques identified in the literature for other retinal diseases could be applied to IRDs, boosting progress and development in this area. In addition, future studies could standardize architectures and metrics to allow more direct comparison between results and thus promote consistent advances in the field.

5. Conclusions

This review found that much work remains to be done with artificial intelligence in this context. Regarding the difficulty most often pointed out by the authors, the scarcity of data for training deep learning classifiers, few studies propose credible and effective solutions. Few studies address data augmentation for IRDs, and there are still no significant efforts to balance datasets, for example by generating synthetic data to equalize classes. Such efforts would improve the effectiveness not only of classifiers for these diseases but also of methods for identifying the genes responsible for them.
Approaches developed for other retinal diseases could be adapted to IRDs to obtain realistic, high-resolution data that would allow the training datasets to be balanced and enlarged, increasing data diversity and the generalization capacity of the models. These future approaches aim to address the difficulty reported in the literature that motivated this review.
This review contributes a collection of information on the use of generative models for data augmentation in retinal diseases, which future studies can build on by applying the identified techniques to IRDs. Given the scarcity of studies on these diseases and the need for generative approaches to address their limitations, it is crucial that future research develops generative methods that can be effectively applied to these conditions. Applying the techniques surveyed here to IRDs would be a significant first step toward the detection and classification of these diseases, paving the way for AI tools that are more capable in the diagnosis and treatment of IRDs.

Author Contributions

Conceptualization, J.M.; methodology, J.M.; software, J.M.; validation, A.M., J.M.B., P.M. and A.C.; formal analysis, A.M.; investigation, J.M.; resources, A.M.; writing—original draft preparation, J.M.; writing—review and editing, A.C., A.M. and P.M.; visualization, A.M.; supervision, A.C.; project administration, A.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
IRD: Inherited Retinal Diseases
FAF: Fundus Autofluorescence
OCT: Optical Coherence Tomography
AI: Artificial Intelligence
GAN: Generative Adversarial Network
PRISMA: Preferred Reporting Items for Systematic reviews and Meta-Analyses
MeSH: Medical Subject Headings
DME: Diabetic Macular Edema
DR: Diabetic Retinopathy
AMD: Age-related Macular Degeneration

Appendix A

Analysis of the Architectures

Table A1. Strengths and weaknesses of base architecture.
DCGAN [37] (DR)
Strengths:
- The DCGAN architecture is successful in generating synthetic images, including medical images such as FAF images.
- DCGAN allows processing to be focused on specific regions of the images, which is useful for tasks such as analyzing small areas of interest.
Weaknesses:
- DCGAN has difficulties with unbalanced datasets, especially when images of different severities of the condition are poorly represented.
- To generate realistic images, DCGAN usually needs a large amount of data to learn the probability distribution of the real data.
- DCGAN also suffers from instability problems, such as non-convergence and mode collapse.
- While the images generated by DCGAN are structurally similar to real images, they can present notable distortions in the frequency domain.

StyleGAN2 [14] (IRD)
Strengths:
- StyleGAN2 generates synthetic images of inherited retinal diseases with high visual quality, which experts classify as realistic.
- The generated images show diversity similar to the real ones and are not exact copies, making them suitable for further analysis.
- Models trained with synthetic data offer classification performance similar to that obtained with real data.
- StyleGAN2 is chosen for its image quality, short training time, and relatively low computational cost.
Weaknesses:
- The quality of synthetic images can be compromised by problems such as low exposure and background leakage.
- The qualitative evaluation of images is subjective, with wide variation in scores and possible confusion caused by overlapping and atypical phenotypes.
- Disease classes overlap in the feature space, which can cause confusion when generating synthetic images.
- GANs can memorize images or capture subtle attributes, especially if the dataset is small or the training long.
- The quality of synthetic images is lower than that of real data.

CycleGAN [26] (IRD)
Strengths:
- Effective in generalizing OCT images of rare diseases with few examples, avoiding overfitting.
- Maintains the structures of the choroid and peripheral retina when translating normal images into pathological ones.
- Creates new samples from normal OCT images to increase variation in rare disease classes.
- The technique is extensible to segmentation with small datasets.
- Generates synthetic images with transformed morphological characteristics.
Weaknesses:
- Synthetic images can have artifacts, requiring careful selection when building deep learning models.
- Generated OCT images have a resolution of 256×256, which can affect classification.
- High computational cost to train high-resolution models.
- Some rare diseases, such as Stargardt disease and retinitis pigmentosa, have high rejection rates.

CGAN [53] (Retinal Diseases)
Strengths:
- Generates images conditioned on labels or other auxiliary information, making it possible to create specific synthetic datasets.
- High quality and realism of the generated images, which closely resemble real images.
- The ability to generate a large number of synthetic images demonstrates CGAN's potential to augment existing medical datasets.
Weaknesses:
- CGAN is an opaque model, so it can be difficult to fully understand the internal image-generation process.
- The images generated by CGAN may contain artifacts or be of variable quality.
- CGAN's performance depends heavily on the quality and diversity of the real training data.
- Training GANs, including CGANs, can be challenging and unstable, requiring careful selection of hyperparameters and training strategies to avoid problems such as mode collapse.

WGAN [36] (DR)
Strengths:
- WGAN is effective at generating high-quality synthetic images with great diversity and realism.
- The architecture is robust and successful in challenging imaging tasks.
- WGAN addresses one of the main problems of traditional GANs (such as DCGAN), mode collapse, allowing a wider variety of samples to be generated.
- Using the Wasserstein distance improves convergence and image quality, helping the generator produce more realistic samples.
- The generated images have a high degree of realism, with variation in the characteristics of real images, such as the optic disc and veins.
- The generated synthetic images are almost indistinguishable from the real ones, with a slight difference in noise.
Weaknesses:
- Training can be unstable, with oscillations in the error.
- On close examination, small artifacts remain, and the images are slightly more pixelated than the originals.
- The WGAN discriminator estimates the parameter w of a continuous K-Lipschitz function rather than the probability of the image being real, making it unsuitable for inpainting.
- The automatic inpainting methodology is limited by the size of the image produced by the WGAN.

StyleGAN [29] (AMD)
Strengths:
- StyleGAN is effective at creating high-resolution images, such as retinal images that are almost indistinguishable from real ones.
- StyleGAN can generate retinal images with specific diseases, even with little training data, and can balance unbalanced datasets, improving the performance of diagnostic models.
- The intermediate latent space allows visual attributes of the images to be modified, providing more flexibility than other GANs.
- StyleGAN successfully preserves vascular structures in retinal images, which was not a priority in previous studies.
- StyleGAN has great potential in medical applications, such as data augmentation and protection of patient privacy.
Weaknesses:
- Training StyleGAN to generate high-quality synthetic images is still challenging due to the need for large amounts of data.
- The generated images need improvements in microstructures, such as the representation of the optic disc and vascularization.
- Some atypical features, such as abnormal structures in the optic disc, irregular vessels, and atypical reflections, were observed in the generated images.

StyleGAN-ADA [27] (AMD)
Strengths:
- Experts were unable to accurately differentiate real from synthetic images, demonstrating the realism of the generated images.
- The model was trained on three public datasets and performed well on an external dataset, demonstrating its ability to generalize in AMD detection.
- It allows synthetic images to be created in which macular degeneration is evident or absent.
- The use of ADA improves the quality and diversity of the images, making the model more efficient, especially when training data are scarce.
Weaknesses:
- If the synthetic data are not sufficiently representative or diverse, they can introduce noise into the training of deep learning models.
- Validation with a small dataset may not be enough to guarantee robust generalization.

LDM [45] (Retinal Blood Vessel)
Strengths:
- LDM is effective for data augmentation.
- It combines diffusion models with autoencoders to generate smaller latent representations, making processing more efficient.
- LDM uses L1 loss, perceptual loss, and patch-based adversarial objectives to optimize autoencoder training.
- LDM outperforms other GAN architectures in efficiency, with better FID (lower value, higher quality) and IS (higher value, more diversity in the images).
- LDM generated high-quality data, with a high Inception Score and low FID, indicating that the augmented data closely resemble the original dataset.
Weaknesses:
- LDM still has difficulty generating high-resolution images during the data augmentation process.

VAE [41] (DME)
Strengths:
- VAE networks are naturally stable, lending greater stability to the GAN when the two are used together.
- Integrating the VAE improves GAN stability, providing faster convergence and avoiding mode collapse.
- The structure of the pre-trained VAE decoder is transferred to the generator, improving initialization and performance.
Weaknesses:
- VAEs in general can generate less sharp and realistic images than GANs, often resulting in blurrier images. The article focuses more on how the VAE improves the GAN, without detailing other specific limitations.

Appendix B

Research Specification

Table A2. Specification of the research sentence and results obtained.
Database | Search Sentence | Filters | Number of Studies
PubMed | (Generative Model* [Text word] OR Generative adversarial network [Text word] OR GAN [Text word] OR Data Augmentation [Text word] OR Synthetic Data Generation [Text word] OR Augmentation Techniques [Text word]) AND (Retinal Diseases [MESH] OR Retinal Degeneration [MESH] OR Retinitis Pigmentosa [MESH] OR Inherited retinal disease* [Text word]) | No filters | 85
IEEE Xplore | (Generative Model* OR Generative adversarial network OR GAN OR Data Augmentation OR Synthetic Data Generation OR Augmentation Techniques) AND (Retinal Diseases OR Retinal Degeneration OR Retinitis Pigmentosa OR Inherited retinal disease*) | Search by: "All Metadata and Mesh_Terms (Retinal Diseases OR Retinal Degeneration OR Retinitis Pigmentosa)" | 187
Web of Science | (Generative Model* OR Generative adversarial network OR GAN OR Data Augmentation OR Synthetic Data Generation OR Augmentation Techniques) AND (Retinal Diseases OR Retinal Degeneration OR Retinitis Pigmentosa OR Inherited retinal disease*) | Search by: "Topic", no filters | 356
Scopus | ("Generative Model*" OR "Generative adversarial network" OR "GAN" OR "Data Augmentation" OR "Synthetic Data Generation" OR "Augmentation Techniques") AND ("Retinal Diseases" OR "Retinal Degeneration" OR "Retinitis Pigmentosa" OR "Inherited retinal disease*") | Search by: Article title, abstract, keywords; no filters | 114
* Legend: The asterisk is used in the search sentence as a wildcard, indicating that the terms may also appear in the plural (e.g., Models).

Appendix C

Results of Included Studies

Table A3. Summary of the main results obtained in the studies included in this review.
Articles | Authors | Publication Date | Architectures | Evaluation Metrics | Diseases | Results
Assessment of Deep Generative Models for High-Resolution Synthetic Retinal Image Generation of Age-Related Macular Degeneration | Burlina, Philippe M., et al. | 10 January 2019 | ProGAN | Retinal specialists | Age-related Macular Degeneration | Greater equality in the results.
Geographic atrophy segmentation in SD-OCT images using synthesized fundus autofluorescence imaging | Wu, et al. | 27 October 2019 | RA-CGAN | PSNR, SSIM | Geographic Atrophy | Generated high-quality images, some with better quality than the real images.
CGAN-based Synthetic Medical Image Augmentation between Retinal Fundus Images and Vessel Segmented Images | HaoQi, et al. | 1 January 2020 | DCGAN | ROC/AUC, PR, Dice Coefficient, Sensitivity, Specificity | Retinal Vessel Analysis | Generates realistic images that maintain the statistical distribution of the original dataset.
Retinal optical coherence tomography image classification with label smoothing generative adversarial network | He, et al. | 12 May 2020 | DCGAN, WGAN-GP | Precision, Sensitivity, Specificity, F1 | Age-related Macular Degeneration | Generated high-quality images; improved classifier performance by using the generated data in training.
Study on the Method of Fundus Image Generation Based on Improved GAN | Guo, et al. | 12 June 2020 | WGAN, CGAN | SSIM, SD, IS, FID | Hard Exudate | Generates high-quality images with great diversity.
Feasibility study to improve deep learning in OCT diagnosis of rare retinal diseases with few-shot classification | Yoo, et al. | 25 January 2021 | CycleGAN | Retinal specialists | Inherited Retinal Disease | Improved model performance by using these data.
Addressing Artificial Intelligence Bias in Retinal Diagnostics | Burlina, Philippe, et al. | 1 February 2021 | StyleGAN | ACC, ROC/AUC | Diabetic Retinopathy | Generated data used to balance the dataset; improved metric values; greater equality in the results.
RV-GAN: Segmenting Retinal Vascular Structure in Fundus Photographs Using a Novel Multi-scale Generative Adversarial Network | Kamran, et al. | 14 May 2021 | RV-GAN | ROC/AUC, ACC, Sensitivity, Specificity, F1, Mean-IoU, SSIM | Degenerative Retinal Diseases | Generates new data with higher quality.
Deepfakes in Ophthalmology Applications and Realism of Synthetic Retinal Images from Generative Adversarial Networks | Chen, et al. | 29 October 2021 | Pix2Pix HD | ACC, Retinal specialists | Retinopathy of Prematurity | Generates realistic images; experts found it difficult to distinguish real from synthetic images.
Generative Image Inpainting for Retinal Images using Generative Adversarial Networks | Magister, et al. | 4 November 2021 | WGAN | ACC, SNR, Evaluation of the coherence of the inpainted image | Diabetic Retinopathy | Generated realistic images with great variety.
RF-GANs: A Method to Synthesize Retinal Fundus Images Based on Generative Adversarial Network | Chen, et al. | 10 November 2021 | CGAN | FID, SWD | Diabetic Retinopathy | Generated high-fidelity images; improved the performance of diabetic retinopathy classifiers using the generated data; improved generalization as the generated data added diversity.
Synthesizing realistic high-resolution retina image by style-based generative adversarial network and its utilization | Kim, et al. | 1 January 2022 | StyleGAN | Retinal specialists, SNR, ROC/AUC, Sensitivity, Specificity, ACC | Age-related Macular Degeneration | Generates highly realistic images; experts found it difficult to distinguish real from synthetic images.
DR-GAN: Conditional Generative Adversarial Network for Fine-Grained Lesion Synthesis on Diabetic Retinopathy Images | Yi Zhou, et al. | 1 January 2022 | DR-GAN | Retinal specialists, FID, SWD, ACC | Diabetic Retinopathy | Images generated for data augmentation; improved model performance using these data.
An innovative medical image synthesis based on dual GAN deep neural networks for improved segmentation quality | Beji, Ahmed, et al. | 30 May 2022 | DCGAN | SSIM, MSE, PSNR, SIFT, Oriented FAST and Rotated BRIEF (ORB) | Retinal Diseases | Improved the values of the metrics obtained.
A Dual-Discriminator Fourier Acquisitive GAN for Generating Retinal Optical Coherence Tomography Images | Tajmirriahi, Mahnoosh, et al. | 11 July 2022 | VAE | Euclidean Distance, FID, MS-SSIM | Diabetic Macular Edema | Generated more realistic, higher-resolution images than other GAN architectures; increased the dataset; improved classifier efficiency by using synthetic images in training.
LAC-GAN: Lesion attention conditional GAN for Ultra-widefield image synthesis | Lei, et al. | 11 November 2022 | DCGAN | Retinal specialists, AACC | Retinal Diseases | Generates images with reasonable detail; using these images improved model performance; added diversity, helping to improve generalization.
Generative adversarial network-based deep learning approach in classification of retinal conditions with optical coherence tomography images | Sun, et al. | 22 November 2022 | StyleGAN2-ADA | ROC/AUC, Sensitivity, Specificity, Precision, ACC, F1, MCC | Retinal Diseases | Accuracy improved when models were trained with synthetic data to balance the dataset.
Synthesizing multi-frame high-resolution fluorescein angiography images from retinal fundus images using generative adversarial networks | Li, et al. | 21 February 2023 | CycleGAN | SSIM, PSNR, NCC | Retinal Diseases | Generates low-quality images, which are then synthesized into high-quality, high-resolution images.
Fundus Image-Label Pairs Synthesis and Retinopathy Screening via GANs With Class-Imbalanced Semi-Supervised Learning | Xie, et al. | 27 March 2023 | CISSL-GAN | FID, Precision, Recall, Similarity | Retinopathy | Simultaneously improved class-conditional fundus image generation and classification performance in a typical scenario of insufficient and unbalanced labels.
SynthEye: Investigating the Impact of Synthetic Data on Artificial Intelligence-assisted Gene Diagnosis of Inherited Retinal Disease | Yoga Advaith Veturi, et al. | 1 June 2023 | StyleGAN2 | Euclidean Distance, ROC/AUC, BRISQUE, LPIPS | Inherited Retinal Disease | Generates realistic images, some misjudged by clinical experts; classifier performance with the generated data was similar to using only real data.
Synthetic artificial intelligence using generative adversarial network for retinal imaging in detection of age-related macular degeneration | Wang, et al. | 22 June 2023 | StyleGAN2 | SSIM, ROC/AUC, k score, ACC, Sensitivity, Specificity | Age-related Macular Degeneration | Generates images with robust AMD lesions despite few initial training images; the generated data could easily confuse experts trying to distinguish them from real data.
Generating OCT B-Scan DME images using optimized Generative Adversarial Networks (GANs) | Tripathi, et al. | 2 August 2023 | StyleGAN2 | FID, MSE | Diabetic Macular Edema | Generated realistic images, very similar to real ones.
ROP-GAN: an image synthesis method for retinopathy of prematurity based on generative adversarial network | Hou, et al. | 6 October 2023 | ROP-GAN | FID, IS, Classification task with deep learning models | Retinopathy of Prematurity | Generates synthetic images for classes with little data; generates fundus images at several stages; improved classifier accuracy by adding the generated data to training.
Multi-Layer Preprocessing and U-Net with Residual Attention Block for Retinal Blood Vessel Segmentation | Alsayat, et al. | 30 October 2023 | LDM | PSNR, SSIM, FID, IS | Retinal Blood Vessel | Generates high-quality, diverse data to add to the training dataset.
Automated detection of crystalline retinopathy via fundus photography using multistage generative adversarial networks | Choi, et al. | 1 December 2023 | CycleGAN | ROC/AUC, Sensitivity, Specificity | Crystalline Retinopathy | Generates realistic images of the pathology; model accuracy improved with these synthetic data in training.
Robust Deep Learning for Eye Fundus Images: Bridging Real and Synthetic Data for Enhancing Generalization | Oliveira, et al. | 1 January 2024 | StyleGAN2-ADA | FID, SSIM, PSNR, Retinal specialists | Age-related Macular Degeneration | Generates high-resolution synthetic images similar to real ones from few images; specialists found them difficult to distinguish from real images; training models on generated plus real images improved accuracy.
The Role of Fundus Imaging and GAN in Diabetic Retinopathy Classification using VGG19 | Kabilan, et al. | 1 January 2024 | DCGAN | SSIM, FID | Diabetic Retinopathy | Generates images to overcome data scarcity.
Development of a generative deep learning model to improve epiretinal membrane detection in fundus photography | Choi, et al. | 1 January 2024 | StyleGAN2 | ROC/AUC, Sensitivity, Specificity, NPV, PPV | Epiretinal Membrane | Generates more realistic, better-quality images than other methods; training with these data improves disease-detection performance.
Transfer Learning and Interpretable Analysis-Based Quality Assessment of Synthetic Optical Coherence Tomography Images by CGAN Model for Retinal Diseases | Han, et al. | 13 January 2024 | CGAN | ACC, Precision, F1, Grad-CAM, Occlusion sensitivity, LIME | Retinal Diseases | Generates images very similar to real ones, comparable to images of real retinal diseases according to the metrics obtained.
Digital ray: enhancing cataractous fundus images using style transfer generative adversarial networks to improve retinopathy detection | Lixue Liu, et al. | 5 June 2024 | CycleGAN | FID, ROC/AUC, KID | Retinopathy | Generated images with realistic features and better quality; improved classifier accuracy with synthetic data in training.
Revolutionizing diabetic retinopathy diagnosis through advanced deep learning techniques: Harnessing the power of GAN model with transfer learning and the DiaGAN-CNN model | Shoaib, et al. | 29 August 2024 | DiaGAN | ROC/AUC | Diabetic Retinopathy | Generated realistic images, similar but not identical to the originals; improved training results in metrics such as accuracy and precision.
VSG-GAN: A high-fidelity image synthesis method with semantic manipulation in retinal fundus image | Liu, et al. | 3 September 2024 | VSG-GAN | KID, FID, IS, SSIM | Diabetic Retinopathy | Generates high-fidelity images; improved classifier accuracy when training with real and generated images; expands the dataset through efficient data augmentation.
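Several studies in Table A3 report PSNR as a fidelity metric for synthetic or reconstructed images. For reference, PSNR is a simple function of the mean squared error between a generated image and its target; the following is a minimal NumPy sketch for 8-bit images (illustrative only, not code from any reviewed study):

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, max_val: float = 255.0) -> float:
    # Peak signal-to-noise ratio in decibels; higher means the test image is
    # closer to the reference. Identical images give infinite PSNR.
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))

# Example: a uniform offset of 10 grey levels gives MSE = 100, i.e. ~28.1 dB.
img = np.full((64, 64), 120, dtype=np.uint8)
shifted = img + 10
print(f"{psnr(img, shifted):.1f} dB")  # 28.1 dB
```

Because PSNR is purely pixel-wise, studies typically pair it with SSIM or FID, which are more sensitive to structural and distributional similarity.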

References

  1. Ben-Yosef, T. Inherited Retinal Diseases. Int. J. Mol. Sci. 2022, 23, 13467. [Google Scholar] [CrossRef] [PubMed]
  2. Hanany, M.; Rivolta, C.; Sharon, D. Worldwide carrier frequency and genetic prevalence of autosomal recessive inherited retinal diseases. Proc. Natl. Acad. Sci. USA 2020, 117, 2710–2716. [Google Scholar] [CrossRef] [PubMed]
  3. Bessant, D. Molecular genetics and prospects for therapy of the inherited retinal dystrophies. Curr. Opin. Genet. Dev. 2001, 11, 307–316. [Google Scholar] [CrossRef] [PubMed]
  4. Miere, A.; Le Meur, T.; Bitton, K.; Pallone, C.; Semoun, O.; Capuano, V.; Capuano, V.; Colantuono, D.; Taibouni, K.; Chenoune, Y.; et al. Deep Learning-Based Classification of Inherited Retinal Diseases Using Fundus Autofluorescence. J. Clin. Med. 2020, 9, 3303. [Google Scholar] [CrossRef]
  5. Sahel, J.-A.; Marazova, K.; Audo, I. Clinical Characteristics and Current Therapies for Inherited Retinal Degenerations. Cold Spring Harb. Perspect. Med. 2015, 5, a017111. [Google Scholar] [CrossRef]
  6. Schmidt-Erfurth, U.; Sadeghipour, A.; Gerendas, B.S.; Waldstein, S.M.; Bogunović, H. Artificial intelligence in retina. Prog. Retin. Eye Res. 2018, 67, 1–29. [Google Scholar] [CrossRef]
  7. Jiang, F.; Jiang, Y.; Zhi, H.; Dong, Y.; Li, H.; Ma, S.; Wang, Y.; Dong, Q.; Shen, H.; Wang, Y. Artificial intelligence in healthcare: Past, present and future. Stroke Vasc. Neurol. 2017, 2, 230–243. [Google Scholar] [CrossRef]
  8. Guo, C.; Yu, M.; Li, J. Prediction of different eye diseases based on fundus photography via deep transfer learning. J. Clin. Med. 2021, 10, 5481. [Google Scholar] [CrossRef]
  9. Liu, T.Y.A.; Ling, C.; Hahn, L.; Jones, C.K.; Boon, C.J.; Singh, M.S. Prediction of visual impairment in retinitis pigmentosa using deep learning and multimodal fundus images. Br. J. Ophthalmol. 2023, 107, 1484–1489. [Google Scholar] [CrossRef]
  10. Xie, Y.; Wan, Q.; Xie, H.; Xu, Y.; Wang, T.; Wang, S.; Lei, B. Fundus Image-Label Pairs Synthesis and Retinopathy Screening via GANs With Class-Imbalanced Semi-Supervised Learning. IEEE Trans. Med. Imaging 2023, 42, 2714–2725. [Google Scholar] [CrossRef]
  11. Perez, L.; Wang, J. The Effectiveness of Data Augmentation in Image Classification using Deep Learning. arXiv 2017, arXiv:1712.04621. [Google Scholar]
  12. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. arXiv 2014, arXiv:1406.2661. [Google Scholar] [CrossRef]
  13. Kupas, D.; Harangi, B. Solving the problem of imbalanced dataset with synthetic image generation for cell classification using deep learning. In Proceedings of the 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Online, 1–5 November 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 2981–2984. [Google Scholar] [CrossRef]
  14. Veturi, Y.A.; Woof, W.; Lazebnik, T.; Moghul, I.; Woodward-Court, P.; Wagner, S.K.; de Guimarães, T.A.C.; Varela, M.D.; Liefers, B.; Patel, P.J.; et al. SynthEye: Investigating the Impact of Synthetic Data on Artificial Intelligence-assisted Gene Diagnosis of Inherited Retinal Disease. Ophthalmol. Sci. 2023, 3, 100258. [Google Scholar] [CrossRef]
  15. Chen, Y.; Long, J.; Guo, J. RF-GANs: A Method to Synthesize Retinal Fundus Images Based on Generative Adversarial Network. Comput. Intell. Neurosci. 2021, 2021, 3812865. [Google Scholar] [CrossRef]
  16. Hou, N.; Shi, J.; Ding, X.; Nie, C.; Wang, C.; Wan, J. ROP-GAN: An image synthesis method for retinopathy of prematurity based on generative adversarial network. Phys. Med. Biol. 2023, 68, 205016. [Google Scholar] [CrossRef]
  17. Ruthotto, L.; Haber, E. An introduction to deep generative modeling. GAMM-Mitteilungen 2021, 44, e202100008. [Google Scholar] [CrossRef]
  18. Menti, E.; Bonaldi, L.; Ballerini, L.; Ruggeri, A.; Trucco, E. Automatic generation of synthetic retinal fundus images: Vascular network. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2016; Volume 9968 LNCS, pp. 167–176. [Google Scholar] [CrossRef]
  19. Costa, P.; Galdran, A.; Meyer, M.I.; Abràmoff, M.D.; Niemeijer, M.; Mendonça, A.M.; Campilho, A. Towards Adversarial Retinal Image Synthesis. arXiv 2017, arXiv:1701.08974. [Google Scholar]
  20. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef]
  21. PROSPERO. International Prospective Register of Systematic Reviews. Centre for Reviews and Dissemination, University of York. Available online: https://www.crd.york.ac.uk/prospero/ (accessed on 2 March 2025).
  22. Arsalan, M.; Baek, N.R.; Owais, M.; Mahmood, T.; Park, K.R. Deep learning-based detection of pigment signs for analysis and diagnosis of retinitis pigmentosa. Sensors 2020, 20, 3454. [Google Scholar] [CrossRef]
  23. Masumoto, H.; Tabuchi, H.; Nakakura, S.; Ohsugi, H.; Enno, H.; Ishitobi, N.; Ohsugi, E.; Mitamura, Y. Accuracy of a deep convolutional neural network in detection of retinitis pigmentosa on ultrawide-field images. PeerJ 2019, 7, e6900. [Google Scholar] [CrossRef]
  24. Mou, L.; Zhao, Y.; Fu, H.; Liu, Y.; Cheng, J.; Zheng, Y.; Su, P.; Yang, J.; Chen, L.; Frangi, A.F.; et al. CS2-Net: Deep learning segmentation of curvilinear structures in medical imaging. Med. Image Anal. 2021, 67, 101874. [Google Scholar] [CrossRef]
  25. JBI. JBI Manual for Evidence Synthesis; JBI: Adelaide, Australia, 2024. [Google Scholar] [CrossRef]
  26. Yoo, T.K.; Choi, J.Y.; Kim, H.K. Feasibility study to improve deep learning in OCT diagnosis of rare retinal diseases with few-shot classification. Med. Biol. Eng. Comput. 2021, 59, 401–415. [Google Scholar] [CrossRef] [PubMed]
  27. Oliveira, G.C.; Rosa, G.H.; Pedronette, D.C.G.; Papa, J.P.; Kumar, H.; Passos, L.A.; Kumar, D. Robust deep learning for eye fundus images: Bridging real and synthetic data for enhancing generalization. Biomed. Signal Process. Control 2024, 94, 106263. [Google Scholar] [CrossRef]
  28. Wang, Z.; Lim, G.; Ng, W.Y.; Tan, T.E.; Lim, J.; Lim, S.H.; Foo, V.; Lim, J.; Sinisterra, L.G.; Zheng, F.; et al. Synthetic artificial intelligence using generative adversarial network for retinal imaging in detection of age-related macular degeneration. Front. Med. 2023, 10, 1184892. [Google Scholar] [CrossRef]
  29. Kim, M.; Kim, Y.N.; Jang, M.; Hwang, J.; Kim, H.K.; Yoon, S.C.; Kim, Y.J.; Kim, N. Synthesizing realistic high-resolution retina image by style-based generative adversarial network and its utilization. Sci. Rep. 2022, 12, 17307. [Google Scholar] [CrossRef]
  30. Burlina, P.M.; Joshi, N.; Pacheco, K.D.; Liu, T.Y.A.; Bressler, N.M. Assessment of Deep Generative Models for High-Resolution Synthetic Retinal Image Generation of Age-Related Macular Degeneration. JAMA Ophthalmol. 2019, 137, 258. [Google Scholar] [CrossRef]
  31. He, X.; Fang, L.; Rabbani, H.; Chen, X.; Liu, Z. Retinal optical coherence tomography image classification with label smoothing generative adversarial network. Neurocomputing 2020, 405, 37–47. [Google Scholar] [CrossRef]
  32. Wu, M.; Cai, X.; Chen, Q.; Ji, Z.; Niu, S.; Leng, T.; Rubin, D.L.; Park, H. Geographic atrophy segmentation in SD-OCT images using synthesized fundus autofluorescence imaging. Comput. Methods Programs Biomed. 2019, 182, 105101. [Google Scholar] [CrossRef]
  33. Choi, E.Y.; Han, S.H.; Ryu, I.H.; Kim, J.K.; Lee, I.S.; Han, E.; Kim, H.; Choi, J.Y.; Yoo, T.K. Automated detection of crystalline retinopathy via fundus photography using multistage generative adversarial networks. Biocybern. Biomed. Eng. 2023, 43, 725–735. [Google Scholar] [CrossRef]
  34. Kamran, S.A.; Hossain, K.F.; Tavakkoli, A.; Zuckerbrod, S.L.; Sanders, K.M.; Baker, S.A. RV-GAN: Segmenting Retinal Vascular Structure in Fundus Photographs using a Novel Multi-scale Generative Adversarial Network. arXiv 2021, arXiv:2101.00535. [Google Scholar] [CrossRef]
  35. Burlina, P.; Joshi, N.; Paul, W.; Pacheco, K.D.; Bressler, N.M. Addressing artificial intelligence bias in retinal diagnostics. Transl. Vis. Sci. Technol. 2021, 10, 1–13. [Google Scholar] [CrossRef] [PubMed]
  36. Magister, L.C.; Arandjelovic, O. Generative Image Inpainting for Retinal Images using Generative Adversarial Networks. In Proceedings of the 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Online, 1–5 November 2021; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2021; pp. 2835–2838. [Google Scholar] [CrossRef]
  37. Kabilan, C.; Madhesh Kumar, S.; Latha Selvi, G. The Role of Fundus Imaging and GAN in Diabetic Retinopathy Classification using VGG19. In Proceedings of the 3rd International Conference on Advances in Computing, Communication and Applied Informatics, ACCAI 2024, Chennai, India, 9–10 May 2024; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2024. [Google Scholar] [CrossRef]
  38. Shoaib, M.R.; Emara, H.M.; Mubarak, A.S.; Omer, O.A.; Abd El-Samie, F.E.; Esmaiel, H. Revolutionizing diabetic retinopathy diagnosis through advanced deep learning techniques: Harnessing the power of GAN model with transfer learning and the DiaGAN-CNN model. Biomed. Signal Process. Control 2025, 99, 106790. [Google Scholar] [CrossRef]
  39. Liu, J.; Xu, S.; He, P.; Wu, S.; Luo, X.; Deng, Y.; Huang, H. VSG-GAN: A High Fidelity Image Synthesis Method with Semantic Manipulation in Retinal Fundus Image. Biophys. J. 2024, 123, 2815–2829. [Google Scholar] [CrossRef]
  40. Zhou, Y.; Wang, B.; He, X.; Cui, S.; Shao, L. DR-GAN: Conditional Generative Adversarial Network for Fine-Grained Lesion Synthesis on Diabetic Retinopathy Images. IEEE J. Biomed. Health Inform. 2022, 26, 56–66. [Google Scholar] [CrossRef]
  41. Tajmirriahi, M.; Kafieh, R.; Amini, Z.; Lakshminarayanan, V. A Dual-Discriminator Fourier Acquisitive GAN for Generating Retinal Optical Coherence Tomography Images. IEEE Trans. Instrum. Meas. 2022, 71, 1–8. [Google Scholar] [CrossRef]
  42. Tripathi, A.; Kumar, P.; Mayya, V.; Tulsani, A. Generating OCT B-Scan DME images using optimized Generative Adversarial Networks (GANs). Heliyon 2023, 9, e18773. [Google Scholar] [CrossRef]
  43. Choi, J.Y.; Ryu, I.H.; Kim, J.K.; Lee, I.S.; Yoo, T.K. Development of a generative deep learning model to improve epiretinal membrane detection in fundus photography. BMC Med. Inform. Decis. Mak. 2024, 24, 25. [Google Scholar] [CrossRef]
  44. Guo, J.; Pang, Z.; Yang, F.; Shen, J.; Zhang, J. Study on the Method of Fundus Image Generation Based on Improved GAN. Math. Probl. Eng. 2020, 2020, 6309596. [Google Scholar] [CrossRef]
  45. Alsayat, A.; Elmezain, M.; Alanazi, S.; Alruily, M.; Mostafa, A.M.; Said, W. Multi-Layer Preprocessing and U-Net with Residual Attention Block for Retinal Blood Vessel Segmentation. Diagnostics 2023, 13, 3364. [Google Scholar] [CrossRef]
  46. HaoQi, G.; Ogawara, K. CGAN-based Synthetic Medical Image Augmentation between Retinal Fundus Images and Vessel Segmented Images. In Proceedings of the 2020 5th International Conference on Control and Robotics Engineering (ICCRE), Osaka, Japan, 24–26 April 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 218–223. [Google Scholar] [CrossRef]
  47. Liu, L.; Hong, J.; Wu, Y.; Liu, S.; Wang, K.; Li, M.; Zhao, L.; Liu, Z.; Li, L.; Cui, T.; et al. Digital ray: Enhancing cataractous fundus images using style transfer generative adversarial networks to improve retinopathy detection. Br. J. Ophthalmol. 2024, 108, 1423–1429. [Google Scholar] [CrossRef]
  48. Chen, J.S.; Coyner, A.S.; Chan, R.V.P.; Hartnett, M.E.; Moshfeghi, D.M.; Owen, L.A.; Kalpathy-Cramer, J.; Chiang, M.F.; Campbell, J.P. Deepfakes in Ophthalmology: Applications and Realism of Synthetic Retinal Images from Generative Adversarial Networks. Ophthalmol. Sci. 2021, 1, 100079. [Google Scholar] [CrossRef] [PubMed]
  49. Beji, A.; Blaiech, A.G.; Said, M.; Abdallah, A.B.; Bedoui, M.H. An innovative medical image synthesis based on dual GAN deep neural networks for improved segmentation quality. Appl. Intell. 2023, 53, 3381–3397. [Google Scholar] [CrossRef]
  50. Sun, L.C.; Pao, S.I.; Huang, K.H.; Wei, C.Y.; Lin, K.F.; Chen, P.N. Generative adversarial network-based deep learning approach in classification of retinal conditions with optical coherence tomography images. Graefe’s Arch. Clin. Exp. Ophthalmol. 2023, 261, 1399–1412. [Google Scholar] [CrossRef] [PubMed]
  51. Lei, H.; Tian, Z.; Xie, H.; Zhao, B.; Zeng, X.; Cao, J.; Liu, W.; Wang, J.; Zhang, G.; Wang, S.; et al. LAC-GAN: Lesion attention conditional GAN for Ultra-widefield image synthesis. Neural Netw. 2023, 158, 89–98. [Google Scholar] [CrossRef]
  52. Li, P.; He, Y.; Wang, P.; Wang, J.; Shi, G.; Chen, Y. Synthesizing multi-frame high-resolution fluorescein angiography images from retinal fundus images using generative adversarial networks. BioMedical Eng. Online 2023, 22, 16. [Google Scholar] [CrossRef]
  53. Han, K.; Yu, Y.; Lu, T. Transfer Learning and Interpretable Analysis-Based Quality Assessment of Synthetic Optical Coherence Tomography Images by CGAN Model for Retinal Diseases. Processes 2024, 12, 182. [Google Scholar] [CrossRef]
  54. Jayasumana, S.; Ramalingam, S.; Veit, A.; Glasner, D.; Chakrabarti, A.; Kumar, S. Rethinking FID: Towards a Better Evaluation Metric for Image Generation. arXiv 2023, arXiv:2401.09603. [Google Scholar]
  55. Nilsson, J.; Akenine-Möller, T. Understanding SSIM. arXiv 2020, arXiv:2006.13846. [Google Scholar]
Figure 1. PRISMA 2020 flow diagram for study selection.
Table 1. Search databases and reasons for their selection.

Database | Reason
PubMed | Focused on biomedicine and health, matching the context of inherited retinal diseases.
IEEE Xplore | A database specializing in technology and engineering, covering the area of artificial intelligence.
Web of Science and Scopus | Multidisciplinary databases with scientific impact, able to encompass studies covering both areas.
Table 2. Inclusion criteria.

Identifier | Inclusion Criteria
IC1 | Journal and conference articles
IC2 | Articles dealing with retinal diseases
IC3 | Articles using generative models for data augmentation
IC4 | Articles in the English language
IC5 | Articles published after 2019
Table 3. Exclusion criteria.

Identifier | Exclusion Criteria
EC1 | Articles that only use traditional data augmentation techniques
EC2 | Articles that use generative models, but not for data augmentation
EC3 | Articles performing data augmentation, but not on ophthalmic diseases
EC4 | Articles that could not be obtained
EC5 | Articles that do not cover the topic
Table 4. Year of publications included in this study.

Year | Publications | Percentage (%)
2019 | 2 | 6.25
2020 | 3 | 9.38
2021 | 6 | 18.75
2022 | 6 | 18.75
2023 | 8 | 25.00
2024 | 7 | 21.88
Total | 32 | 100
Table 5. Architectures used in the articles included in this study.

Architecture | Acronym | Frequency | Percentage (%)
Deep Convolutional Generative Adversarial Network | DCGAN | 5 | 13.89
Style-Based Generator Architecture for Generative Adversarial Networks 2 | StyleGAN2 | 4 | 11.11
Cycle-Consistent Generative Adversarial Network | CycleGAN | 4 | 11.11
Conditional Generative Adversarial Network | CGAN | 3 | 8.33
Wasserstein Generative Adversarial Network | WGAN | 3 | 8.33
Style-Based Generator Architecture for Generative Adversarial Networks | StyleGAN | 2 | 5.56
Style-Based Generative Adversarial Network 2 with Adaptive Discriminator Augmentation | StyleGAN-ADA | 2 | 5.56
Auxiliary Classifier Generative Adversarial Network | ACGAN | 1 | 2.78
Class-Imbalanced Semi-Supervised Learning Generative Adversarial Network | CISSL-GAN | 1 | 2.78
Dimension Augmenter Generative Adversarial Network | DiAGAN | 1 | 2.78
Diabetic Retinopathy Generative Adversarial Network | DR-GAN | 1 | 2.78
Latent Diffusion Models | LDM | 1 | 2.78
High-Definition Image-to-Image Translation with Conditional Generative Adversarial Networks | Pix2Pix HD | 1 | 2.78
Progressive Growing of Generative Adversarial Networks | ProGAN | 1 | 2.78
Residual Attention Conditional Generative Adversarial Network | RA-CGAN | 1 | 2.78
Retinopathy of Prematurity Generative Adversarial Network | ROP-GAN | 1 | 2.78
RV-Generative Adversarial Network | RV-GAN | 1 | 2.78
Variational Autoencoder | VAE | 1 | 2.78
Vessel and Style Guided Generative Adversarial Network | VSG-GAN | 1 | 2.78
Wasserstein Generative Adversarial Network with Gradient Penalty | WGAN-GP | 1 | 2.78
Total | | 36 | 100
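For orientation, nearly all of the architectures in this table are variants of the original GAN formulation, in which a generator $G$ and a discriminator $D$ are trained against each other on the minimax objective

$$
\min_G \max_D \; V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
$$

The listed variants modify this recipe rather than replace it: WGAN and WGAN-GP substitute a Wasserstein-distance critic for the log-loss discriminator, CGAN and ACGAN condition both networks on class labels, and CycleGAN adds a cycle-consistency loss for unpaired image-to-image translation. The LDM entry is the exception, being a diffusion model rather than an adversarial one.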
Table 6. Metrics used to evaluate the studies' results.

Metric | Acronym | Frequency | Percentage (%)
Fréchet Inception Distance | FID | 12 | 11.21
Receiver Operating Characteristic/Area Under the Curve | ROC/AUC | 11 | 10.28
Accuracy | ACC | 10 | 9.35
Structural Similarity Index Measure | SSIM | 10 | 9.35
Retinal Specialists | - | 8 | 7.48
Sensitivity, Specificity | - | 8 | 7.48
Peak Signal-to-Noise Ratio | PSNR | 5 | 4.67
F1-Score | F1 | 4 | 3.74
Inception Score | IS | 4 | 3.74
Precision | - | 4 | 3.74
Euclidean Distance | - | 2 | 1.87
Kernel Inception Distance | KID | 2 | 1.87
Mean Squared Error | MSE | 2 | 1.87
Signal-to-Noise Ratio | SNR | 2 | 1.87
Sliced Wasserstein Distance | SWD | 2 | 1.87
Precision and Recall Curve | PR | 2 | 1.87
Blind/Referenceless Image Spatial Quality Evaluator | BRISQUE | 1 | 0.93
Classification Task with Deep Learning Models | - | 1 | 0.93
Cohen's Kappa Score | K Score | 1 | 0.93
Gradient-weighted Class Activation Mapping | Grad-CAM | 1 | 0.93
Learned Perceptual Image Patch Similarity | LPIPS | 1 | 0.93
Local Interpretable Model-Agnostic Explanations | LIME | 1 | 0.93
Mean Intersection over Union | Mean-IoU | 1 | 0.93
Multiscale Structural Similarity Index Measure | MS-SSIM | 1 | 0.93
Normalized Cross-Correlation | NCC | 1 | 0.93
Occlusion Sensitivity | - | 1 | 0.93
Recall | - | 1 | 0.93
Scale-Invariant Feature Transform | SIFT | 1 | 0.93
Sharpness Difference | SD | 1 | 0.93
Similarity | - | 1 | 0.93
Dice Coefficient | - | 1 | 0.93
Negative Predictive Value | NPV | 1 | 0.93
Positive Predictive Value | PPV | 1 | 0.93
Oriented FAST and Rotated BRIEF | ORB | 1 | 0.93
Matthews Correlation Coefficient | MCC | 1 | 0.93
Evaluation of the Coherence of the Inpainted Image | - | 1 | 0.93
Total | | 107 | 100
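To make the most frequently reported metrics concrete, the sketch below computes FID from precomputed feature statistics (the mean and covariance of Inception embeddings for the real and generated image sets) and PSNR between two images. This is a minimal NumPy illustration under those assumptions, not code from any of the reviewed studies; in practice the statistics for FID are extracted with a pretrained Inception-v3 network.

```python
import numpy as np

def fid(mu_r, sigma_r, mu_g, sigma_g):
    """Frechet Inception Distance between Gaussians fitted to features of
    real (r) and generated (g) images:
    ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2 (S_r S_g)^(1/2))."""
    diff = mu_r - mu_g
    # Tr((S_r S_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of S_r @ S_g, which are real and non-negative for
    # positive semi-definite covariances.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    tr_covmean = np.sum(np.sqrt(np.maximum(eigvals.real, 0.0)))
    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_g)
                 - 2.0 * tr_covmean)

def psnr(img_a, img_b, max_val=1.0):
    """Peak Signal-to-Noise Ratio in dB for images scaled to [0, max_val]."""
    mse = np.mean((img_a - img_b) ** 2)
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Identical real and synthetic statistics yield an FID of 0; lower FID and higher PSNR indicate synthetic images closer to the real distribution and reference image, respectively.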
Table 7. Context identified in the studies.

Study Context | Frequency | Percentage (%)
Diabetic Retinopathy | 9 | 28.13
Age-Related Macular Degeneration | 6 | 18.75
Retinal Diseases | 5 | 15.63
Inherited Retinal Disease | 2 | 6.25
Retinopathy | 2 | 6.25
Retinopathy of Prematurity | 2 | 6.25
Retinal Blood Vessel | 2 | 6.25
Crystalline Retinopathy | 1 | 3.13
Degenerative Retinal Diseases | 1 | 3.13
Epiretinal Membrane | 1 | 3.13
Hard Exudate | 1 | 3.13
Total | 32 | 100

Machado, J.; Marta, A.; Mestre, P.; Beirão, J.M.; Cunha, A. Data Augmentation with Generative Methods for Inherited Retinal Diseases: A Systematic Review. Appl. Sci. 2025, 15, 3084. https://doi.org/10.3390/app15063084
