RENET: A Deep Learning Approach for Extracting Gene-Disease Associations from Literature

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2019)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 11467)

Abstract

Over one million new biomedical articles are published every year. Efficient and accurate text-mining tools are urgently needed to automatically extract knowledge from these articles in support of research and genetic testing. The extraction of gene-disease associations in particular has been the most widely studied task. However, existing tools for extracting gene-disease associations have limited capacity because they consider each sentence separately. Our experiments show that the best existing tools, such as BeFree and DTMiner, achieve at most 48% precision and 78% recall. In this study, we designed and implemented a deep learning approach, named RENET, that considers the correlation between the sentences in an article when extracting gene-disease associations. RENET significantly improves precision and recall, to 85.2% and 81.8%, respectively. The source code of RENET is available at https://bitbucket.org/alexwuhkucs/gda-extraction/src/master/.

Acknowledgments

This work was supported by Hong Kong ITF Grant ITS/331/17FP and General Research Fund No. 27204518.

Author information

Corresponding author

Correspondence to Tak-Wah Lam.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Wu, Y., Luo, R., Leung, H.C.M., Ting, H.-F., Lam, T.-W. (2019). RENET: A Deep Learning Approach for Extracting Gene-Disease Associations from Literature. In: Cowen, L. (ed.) Research in Computational Molecular Biology. RECOMB 2019. Lecture Notes in Computer Science, vol 11467. Springer, Cham. https://doi.org/10.1007/978-3-030-17083-7_17

  • DOI: https://doi.org/10.1007/978-3-030-17083-7_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-17082-0

  • Online ISBN: 978-3-030-17083-7

  • eBook Packages: Computer Science, Computer Science (R0)
