ABSTRACT
Electronic Health Record (EHR) narratives are a rich source of high-resolution information of value to secondary research use. However, because EHRs consist mostly of natural-language free text that is highly ambiguous, many natural language processing algorithms have been devised to extract meaningful structured information about clinical entities from them. The performance of these algorithms, however, varies widely depending on the training dataset and on how effectively background knowledge is used to steer the learning process.
In this paper, we study the impact of initializing the training of a neural network natural language processing algorithm with pre-defined clinical word embeddings to improve feature extraction and relationship classification between entities. We add our embedding framework to a bi-directional long short-term memory (Bi-LSTM) neural network, and further study the effect of using attention weights in neural networks for sequence labelling tasks to extract knowledge of Adverse Drug Reactions (ADRs). We incorporate unsupervised word embeddings trained with Word2Vec and GloVe on widely available medical resources such as the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC) II corpora and the Unified Medical Language System (UMLS), and we also embed a pharmaco-lexicon from available EHRs. Evaluated on two datasets, our architecture outperforms a baseline Bi-LSTM as well as Bi-LSTM networks using linear-chain and skip-chain conditional random fields (CRFs).
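As a rough illustration of the two ideas named in the abstract, the sketch below shows (1) initializing an embedding lookup table from pre-trained word vectors, with a random fallback for out-of-vocabulary words, and (2) computing softmax attention weights over a sequence of vectors. It is a minimal, standard-library sketch under stated assumptions, not the authors' implementation: the function names, the toy vectors, the 3-dimensional embedding size, and the use of dot-product scoring are all hypothetical, and the attention is applied directly to the embeddings as a stand-in for the Bi-LSTM hidden states a real model would produce.

```python
import math
import random


def build_embedding_table(vocab, pretrained, dim, seed=0):
    """Return one vector per vocabulary word.

    Words present in `pretrained` copy their pre-trained vector; all other
    words get small random values (an assumed fallback, not from the paper).
    """
    rng = random.Random(seed)
    table = []
    for word in vocab:
        vec = pretrained.get(word)
        if vec is None:
            vec = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        table.append(list(vec))
    return table


def attention_weights(states, query):
    """Softmax over dot-product scores between each state and the query."""
    scores = [sum(s * q for s, q in zip(state, query)) for state in states]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(states, query):
    """Return the attention weights and the weighted sum of the states."""
    weights = attention_weights(states, query)
    dim = len(states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, states))
        for i in range(dim)
    ]
    return weights, context


# Toy data: hypothetical 3-dimensional "pre-trained" vectors for two words.
pretrained = {"rash": [0.9, 0.1, 0.0], "aspirin": [0.0, 0.8, 0.2]}
vocab = ["patient", "developed", "rash", "after", "aspirin"]

table = build_embedding_table(vocab, pretrained, dim=3)
# Stand-in for Bi-LSTM outputs: attend over the embeddings themselves.
weights, context = attend(table, query=pretrained["rash"])
```

In a full model the table would seed a trainable embedding layer and the query would typically be a learned parameter rather than a fixed word vector; those choices are left open here.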
Index Terms
- Improving RNN with Attention and Embedding for Adverse Drug Reactions