
Representation Learning for Diagnostic Data

  • Conference paper
Computer Information Systems and Industrial Management (CISIM 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12133)


Abstract

Representation learning algorithms have recently led to significant progress in knowledge extraction from network structures. In this paper, a representation learning framework for the medical diagnosis domain is proposed. It is based on a heterogeneous network model of diagnostic data combined with an algorithm for learning latent node representations. Furthermore, a modification of the metapath2vec algorithm is proposed for representation learning on heterogeneous networks. The proposed algorithm is compared with other representation learning approaches in two practical case studies: symptom/disease classification and disease prediction. Learning representations of domain data in the form of a heterogeneous network yields a significant performance boost on these tasks. It is also shown that, in certain situations, the modified algorithm improves the quality of the learned embeddings compared to reference methods.
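
To make the idea of metapath-guided representation learning more concrete, the following is a minimal sketch of the walk-generation step used by metapath2vec-style methods. It is not the modified algorithm proposed in the paper, whose details are not given in the abstract. The toy network, the node names, and the helpers build_adjacency and metapath_walk are hypothetical and chosen only for illustration; the only assumption taken from the abstract is that the network contains typed nodes such as symptoms and diseases.

```python
import random

# Hypothetical toy diagnostic network: every node has a type.
NODE_TYPE = {
    "flu": "disease",
    "cold": "disease",
    "fever": "symptom",
    "cough": "symptom",
    "sneezing": "symptom",
}

# Hypothetical undirected disease-symptom edges.
EDGES = [
    ("flu", "fever"),
    ("flu", "cough"),
    ("cold", "cough"),
    ("cold", "sneezing"),
]


def build_adjacency(edges):
    """Return a dict mapping each node to the list of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def metapath_walk(adj, node_type, start, metapath, length, rng):
    """Generate one random walk of at most `length` nodes.

    The metapath is treated as a cyclic pattern of node types, e.g.
    ["disease", "symptom"] means disease -> symptom -> disease -> ...
    At each step only neighbours of the required type are eligible.
    The walk stops early if no eligible neighbour exists.
    """
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first metapath type")
    walk = [start]
    while len(walk) < length:
        wanted = metapath[len(walk) % len(metapath)]
        candidates = [n for n in adj.get(walk[-1], []) if node_type[n] == wanted]
        if not candidates:
            break
        walk.append(rng.choice(candidates))
    return walk


ADJ = build_adjacency(EDGES)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
walks = [
    metapath_walk(ADJ, NODE_TYPE, start, ["disease", "symptom"], 7, rng)
    for start in ("flu", "cold")
    for _ in range(3)
]
for walk in walks:
    print(" -> ".join(walk))
```

In a full pipeline, walks like these would typically be treated as sentences and passed to a skip-gram model to obtain node embeddings, which could then serve as features for tasks such as the classification and prediction case studies mentioned above.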



Corresponding author

Correspondence to Karol Antczak.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Antczak, K. (2020). Representation Learning for Diagnostic Data. In: Saeed, K., Dvorský, J. (eds) Computer Information Systems and Industrial Management. CISIM 2020. Lecture Notes in Computer Science, vol 12133. Springer, Cham. https://doi.org/10.1007/978-3-030-47679-3_17

  • DOI: https://doi.org/10.1007/978-3-030-47679-3_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-47678-6

  • Online ISBN: 978-3-030-47679-3

  • eBook Packages: Computer Science, Computer Science (R0)
