Generating Fake Data Using GANs for Anonymizing Healthcare Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 12108)

Abstract

EDITH is a project that aims to orchestrate an ecosystem for handling reliable and safe data in the field of health, proposing the creation of digital twins for personalised healthcare. This paper presents a first approach to using Generative Adversarial Networks (GANs) to generate fake data, with the objective of anonymizing users' information in the health sector. The aim is to create valuable data that can be used in both education and research while avoiding the risk of leaking sensitive data. While GANs are mainly applied to images and video frames, we propose encoding raw data as an image so that it can be processed by a GAN and then decoded back to the original data domain. The performance of this prototype has been demonstrated. Moreover, a novel research pathway has been opened, so further developments are expected.
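The abstract does not specify how raw records are mapped to images, so the following is only a minimal sketch of one possible encode/decode round trip, not the authors' method. It assumes numeric tabular records, per-column min-max scaling to pixel values 1..255, and the value 0 reserved for padding cells; the function names `fit_ranges`, `encode_row`, and `decode_image` are hypothetical.

```python
import math


def fit_ranges(rows):
    """Return the (min, max) of each column over a list of numeric rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def encode_row(row, ranges):
    """Encode one record as a square grid of pixel values in 0..255.

    Assumption: features are scaled to 1..255 and 0 marks padding cells.
    """
    pixels = []
    for value, (lo, hi) in zip(row, ranges):
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        pixels.append(1 + round(254 * (value - lo) / span))
    side = math.ceil(math.sqrt(len(pixels)))
    pixels += [0] * (side * side - len(pixels))  # pad up to a full square
    return [pixels[i * side:(i + 1) * side] for i in range(side)]


def decode_image(image, ranges):
    """Invert encode_row, up to the quantization error of the 8-bit scale."""
    flat = [p for line in image for p in line]
    # zip stops at len(ranges), so trailing padding cells are ignored.
    return [lo + (p - 1) * (hi - lo) / 254 for p, (lo, hi) in zip(flat, ranges)]


# Illustrative records (made-up values, not taken from the paper's datasets).
rows = [[63.0, 145.0, 1.0], [45.0, 120.0, 0.0], [70.0, 180.0, 1.0]]
ranges = fit_ranges(rows)
image = encode_row(rows[0], ranges)
restored = decode_image(image, ranges)
```

In a pipeline of the kind the abstract describes, a GAN would presumably be trained on such images and its generated images passed through the decoder to obtain synthetic records; the training step itself is not shown here.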

This research has been partially supported by the EDITH Research Project (PGC2018-102145-B-C22 (AEI/FEDER, UE)), funded by the Spanish Ministry of Science, Innovation and Universities.

Notes

  1. https://sci2s.ugr.es/keel/dataset.php?cod=67.

  2. https://physionet.org/physiobank/database/ptbdb/.

  3. For this case it is not necessary to isolate the 0 value to use it as a NULL element.

Corresponding author

Correspondence to Cecilio Angulo.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Piacentino, E., Angulo, C. (2020). Generating Fake Data Using GANs for Anonymizing Healthcare Data. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2020. Lecture Notes in Computer Science, vol 12108. Springer, Cham. https://doi.org/10.1007/978-3-030-45385-5_36

  • DOI: https://doi.org/10.1007/978-3-030-45385-5_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-45384-8

  • Online ISBN: 978-3-030-45385-5

  • eBook Packages: Computer Science, Computer Science (R0)
