Abstract
Niemann-Pick type C1 (NPC1) disease is a rare neurodegenerative disease, and metabolomics datasets from NPC1 patients are often small and severely imbalanced. To improve predictive capability and identify new biomarkers in an NPC1 urinary dataset, data augmentation (DA) techniques based on computational intelligence are employed to create additional synthetic samples. This paper presents DA techniques based on noise addition, oversampling, and conditional generative adversarial networks, and evaluates their predictive capacity on a set of Nuclear Magnetic Resonance (NMR) profiles of urine samples. The prediction results show increases in sensitivity (30%) and in F\(_{1}\) score (20%). In addition, multivariate data analysis and variable importance in projection scores were applied. These analyses show that the DA methods are able to replicate the information carried by the metabolites, and indicate that selected metabolites (such as 3-aminoisobutyrate, 3-hydroxyvaleric acid, quinolinate and trimethylamine) may be valuable biomarkers for the diagnosis of NPC1 disease.
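As a rough illustration of two of the DA families named in the abstract (noise addition and oversampling by interpolation), the following is a minimal sketch and not the authors' implementation. The function names, the noise scale, the neighbour count k and the toy data are all assumptions made for this example; the conditional GAN approach is not covered. The last helper computes sensitivity and F\(_{1}\) score, the two metrics reported above, from binary labels.

```python
import numpy as np


def augment_with_noise(X, n_new, scale=0.05, rng=None):
    """Hypothetical helper: draw n_new rows of X at random and add Gaussian
    noise whose per-feature standard deviation is scale * std(feature)."""
    rng = np.random.default_rng(rng)
    idx = rng.integers(0, X.shape[0], size=n_new)
    sigma = scale * X.std(axis=0)
    noise = rng.normal(0.0, 1.0, size=(n_new, X.shape[1])) * sigma
    return X[idx] + noise


def augment_by_interpolation(X, n_new, k=3, rng=None):
    """Hypothetical SMOTE-style helper: pick a sample, pick one of its k
    nearest neighbours (same class assumed), and interpolate between them.
    Requires at least two rows in X."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    k = min(k, n - 1)
    # Pairwise Euclidean distances; a sample is never its own neighbour.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    pick = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X[base] + gap * (X[pick] - X[base])


def sensitivity_and_f1(y_true, y_pred):
    """Sensitivity (recall of the positive class) and F1 for binary labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return float(sens), float(f1)


if __name__ == "__main__":
    # Toy minority-class matrix: 6 samples x 4 spectral features (made up).
    minority = np.random.default_rng(0).random((6, 4))
    noisy = augment_with_noise(minority, n_new=10, rng=1)
    interpolated = augment_by_interpolation(minority, n_new=10, rng=1)
    print(noisy.shape, interpolated.shape)
```

In this sketch the synthetic samples are generated from the minority class only and would be appended to the training split, never to the test split, so that the reported metrics are computed on real samples.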
Acknowledgements
The authors acknowledge the support from MICINN (Spain) through grants TIN2017-88728-C2-1-R and PID2020-116898RB-I00, from Universidad de Málaga and Junta de Andalucía through grant UMA20-FEDERJA-045, and from IBIMA (all including FEDER funds).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Moreno-Barea, F.J., Franco, L., Elizondo, D., Grootveld, M. (2022). Data Augmentation Techniques to Improve Metabolomic Analysis in Niemann-Pick Type C Disease. In: Groen, D., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2022. ICCS 2022. Lecture Notes in Computer Science, vol 13352. Springer, Cham. https://doi.org/10.1007/978-3-031-08757-8_8
DOI: https://doi.org/10.1007/978-3-031-08757-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08756-1
Online ISBN: 978-3-031-08757-8
eBook Packages: Computer Science (R0)