Data Augmentation Techniques to Improve Metabolomic Analysis in Niemann-Pick Type C Disease

  • Conference paper
Computational Science – ICCS 2022 (ICCS 2022)

Abstract

Niemann-Pick type C1 (NPC1) disease is a rare neurodegenerative disorder, and metabolomics datasets from NPC1 patients are often small and severely imbalanced. To improve predictive capability and identify new biomarkers in an NPC1 urinary dataset, data augmentation (DA) techniques based on computational intelligence are used to create additional synthetic samples. This paper presents DA techniques based on noise addition, on oversampling, and on conditional generative adversarial networks, and evaluates their predictive capacity on a set of nuclear magnetic resonance (NMR) profiles of urine samples. The prediction results show increases in sensitivity (30%) and in \(F_{1}\) score (20%). In addition, multivariate data analysis and variable importance in projection (VIP) scores were applied. These analyses show that the DA methods can replicate the metabolite information, and they indicate that selected metabolites (such as 3-aminoisobutyrate, 3-hydroxyvaleric acid, quinolinate and trimethylamine) may be valuable biomarkers for the diagnosis of NPC1 disease.
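
To make the noise-based augmentation idea and the two reported metrics concrete, the following is a minimal Python sketch. It is not the authors' implementation: the function names (`augment_with_noise`, `sensitivity_and_f1`), the `noise_scale` parameter, and the choice of Gaussian noise scaled by the per-feature standard deviation of the minority class are all assumptions made for illustration. The abstract does not specify the noise scheme actually used.

```python
import numpy as np


def augment_with_noise(X, y, minority_label, n_new, noise_scale=0.05, seed=0):
    """Append n_new synthetic minority samples built by adding Gaussian noise.

    Assumption: noise is zero-mean Gaussian with a standard deviation equal to
    noise_scale times the per-feature standard deviation of the minority class.
    """
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    feature_std = X_min.std(axis=0)
    # Pick minority samples (with replacement) to use as templates.
    idx = rng.integers(0, len(X_min), size=n_new)
    noise = rng.normal(0.0, 1.0, size=(n_new, X.shape[1])) * noise_scale * feature_std
    X_new = X_min[idx] + noise
    y_new = np.full(n_new, minority_label, dtype=y.dtype)
    return np.vstack([X, X_new]), np.concatenate([y, y_new])


def sensitivity_and_f1(y_true, y_pred, positive=1):
    """Return (sensitivity, F1) for the class labelled `positive`."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == positive) & (y_pred == positive)))
    fn = int(np.sum((y_true == positive) & (y_pred != positive)))
    fp = int(np.sum((y_true != positive) & (y_pred == positive)))
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return sensitivity, f1


# Hypothetical toy data: 8 majority samples (label 0), 2 minority samples (label 1).
X_toy = np.random.default_rng(1).normal(size=(10, 4))
y_toy = np.array([0] * 8 + [1] * 2)
X_aug, y_aug = augment_with_noise(X_toy, y_toy, minority_label=1, n_new=6)
```

In this sketch only the minority class is augmented, so the toy dataset ends up balanced. In practice, synthetic samples should be generated from the training split only, so that evaluation metrics such as sensitivity and \(F_{1}\) are computed on untouched real samples.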

Acknowledgements

The authors acknowledge support from MICINN (Spain) through grants TIN2017-88728-C2-1-R and PID2020-116898RB-I00, from Universidad de Málaga and Junta de Andalucía through grant UMA20-FEDERJA-045, and from IBIMA (all including FEDER funds).

Author information

Corresponding author

Correspondence to Francisco J. Moreno-Barea.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Moreno-Barea, F.J., Franco, L., Elizondo, D., Grootveld, M. (2022). Data Augmentation Techniques to Improve Metabolomic Analysis in Niemann-Pick Type C Disease. In: Groen, D., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2022. ICCS 2022. Lecture Notes in Computer Science, vol 13352. Springer, Cham. https://doi.org/10.1007/978-3-031-08757-8_8

  • DOI: https://doi.org/10.1007/978-3-031-08757-8_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08756-1

  • Online ISBN: 978-3-031-08757-8

  • eBook Packages: Computer Science, Computer Science (R0)
