Abstract
The success of neural networks is typically attributed to their ability to closely mimic relationships between features and labels observed in the training dataset. This, however, is only part of the answer: in addition to being fit to data, neural networks have been shown to be useful priors on the conditional distribution of labels given features and can be used as such even in the absence of trustworthy training labels. This feature of neural networks can be harnessed to train high quality models on low quality training data in tasks for which large high-quality ground truth datasets don’t exist. One of these problems is assertion classification in biomedical texts: discriminating between positive, negative and speculative statements about certain pathologies a patient may have. We present an assertion classification methodology based on recurrent neural networks, attention mechanism and two flavours of transfer learning (language modelling and heuristic annotation) that achieves state of the art results on MIMIC-CXR radiology reports.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsNotes
- 1.
It’s C, B, C and probably C.
- 2.
Uzuner et. al. [1] refer to this type as alter-assertion.
- 3.
pun intended.
References
Uzuner, Ö., Zhang, X., Sibanda, T.: Machine learning and rule-based approaches to assertion classification. J. Am. Med. Inform. Assoc. 16(1), 109–115 (2009)
Goff, D.J., Loehfelm, T.W.: Automated radiology report summarization using an open-source natural language processing pipeline. J. Digit. Imaging 31(2), 185–192 (2018)
Bodenreider, O.: The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Res. 32(suppl-1), D267–D270 (2004)
Chute, C.G., et al.: Mayo clinical text analysis and knowledge extraction system (cTAKES): architecture, component evaluation and applications. J. Am. Med. Inform. Assoc. 17(5), 507–513 (2010). https://doi.org/10.1136/jamia.2009.001560
Soldaini, L., Goharian, N.: Quickumls: a fast, unsupervised approach for medical concept extraction. In: MedIR Workshop, sigir (2016)
Aronson, A.R.: Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. In: Proceedings of the AMIA Symposium, p. 17. American Medical Informatics Association (2001)
Uzuner, Ö., South, B.R., Shen, S., DuVall, S.L.: 2010 i2b2/va challenge on concepts, assertions, and relations in clinical text. J. Am. Med. Inform. Assoc. 18(5), 552–556 (2011)
Miranda, E., Aryuni, M., Irwansyah, E.: A survey of medical image classification techniques. In: 2016 International Conference on Information Management and Technology (ICIMTech), pp. 56–61, November 2016.https://doi.org/10.1109/ICIMTech.2016.7930302
Lai, M.: Deep learning for medical image segmentation. arXiv preprint arXiv:1505.02000 (2015)
Litjens, G., et al.: A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88 (2017)
Johnson, A.E., et al.: MIMIC-CXR: a large publicly available database of labeled chest radiographs. arXiv preprint arXiv:1901.07042 (2019)
Irvin, J., et al.: CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. arXiv preprint arXiv:1901.07031 (2019)
Rubin, J., Sanghavi, D., Zhao, C., Lee, K., Qadir, A., Xu-Wilson, M.: Large scale automated reading of frontal and lateral chest x-rays using dual convolutional neural networks. arXiv preprint arXiv:1804.07839 (2018)
Chapman, W.W., Bridewell, W., Hanbury, P., Cooper, G.F., Buchanan, B.G.: A simple algorithm for identifying negated findings and diseases in discharge summaries. J. Biomed. Inform. 34(5), 301–310 (2001)
Mehrabi, S., et al.: DEEPEN: a negation detection system for clinical text incorporating dependency relation into NegEx. J. Biomed. Inform. 54, 213–219 (2015)
Enger, M., Velldal, E., Øvrelid, L.: An open-source tool for negation detection: a maximum-margin approach. In: Proceedings of the Workshop Computational Semantics Beyond Events and Roles, pp. 64–69 (2017)
Peng, Y., Wang, X., Lu, L., Bagheri, M., Summers, R.M., Lu, Z.: NegBio: a high-performance tool for negation and uncertainty detection in radiology reports. CoRR abs/1712.05898 (2017). http://arxiv.org/abs/1712.05898
Shelmanov, A., Smirnov, I., Vishneva, E.: Information extraction from clinical texts in Russian. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue”, vol. 14, pp. 537–549 (2015)
Afzal, Z., Pons, E., Kang, N., Sturkenboom, M.C., Schuemie, M.J., Kors, J.A.: ContextD: an algorithm to identify contextual properties of medical terms in a Dutch clinical corpus. BMC Bioinform. 15(1), 373 (2014)
Sleator, D.D., Temperley, D.: Parsing English with a link grammar. arXiv preprint cmp-lg/9508004 (1995)
McCray, A.T., Srinivasan, S., Browne, A.C.: Lexical methods for managing variation in biomedical terminologies. In: Proceedings of the Annual Symposium on Computer Application in Medical Care, p. 235. American Medical Informatics Association (1994)
Chen, D., Manning, C.: A fast and accurate dependency parser using neural networks. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 740–750 (2014)
Wu, S., et al.: Negation’s not solved: generalizability versus optimizability in clinical natural language processing. PLoS One 9(11), e112774 (2014)
Apostolova, E., Tomuro, N., Demner-Fushman, D.: Automatic extraction of lexico-syntactic patterns for detection of negation and speculation scopes. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers-Volume 2, pp. 283–287. Association for Computational Linguistics (2011)
Zou, B., Zhou, G., Zhu, Q.: Tree kernel-based negation and speculation scope detection with structured syntactic parse features. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 968–976 (2013)
Torralba, A., Efros, A.A.: Unbiased look at dataset bias (2011)
de Bruijn, B., Cherry, C., Kiritchenko, S., Martin, J., Zhu, X.: NRC at i2b2: one challenge, three practical tasks, nine statistical systems, hundreds of clinical records, millions of useful features
Clark, C., et al.: Determining assertion status for medical problems in clinical records
Demner-Fushman, D., Apostolova, E., Islamaj Dogan, R., et al.: NLM’s system description for the fourth i2b2/va challenge. In: Proceedings of the 2010 i2b2/VA Workshop on Challenges in Natural Language Processing for Clinical Data, Boston, MA, USA: i2b2 (2010)
Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
Zhou, Z.H.: A brief introduction to weakly supervised learning. Natl. Sci. Rev. 5(1), 44–53 (2017)
Olivier Chapelle, B.S., Zien, A.: Semi-Supervised Learning. Adaptive Computation and Machine Learning Series. MIT Press, Cambridge (2010)
Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013)
Settles, B.: Active learning. Synth. Lect. Artif. Intell. Mach. Learn. 6(1), 1–114 (2012)
Hanneke, S., et al.: Theory of disagreement-based active learning. Found. Trends® Mach. Learn. 7(2–3), 131–309 (2014)
Zhang, C., Chaudhuri, K.: Beyond disagreement-based agnostic active learning. In: Advances in Neural Information Processing Systems, pp. 442–450 (2014)
Jiang, L., Zhou, Z., Leung, T., Li, L.J., Fei-Fei, L.: MentorNet: learning data-driven curriculum for very deep neural networks on corrupted labels. arXiv preprint arXiv:1712.05055 (2017)
Kumar, M.P., Packer, B., Koller, D.: Self-paced learning for latent variable models. In: Advances in Neural Information Processing Systems, pp. 1189–1197 (2010)
Jiang, L., Meng, D., Zhao, Q., Shan, S., Hauptmann, A.G.: Self-paced curriculum learning. In: Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)
Bojanowski, P., Grave, E., Joulin, A., Mikolov, T.: Enriching word vectors with subword information. CoRR abs/1607.04606 (2016). http://arxiv.org/abs/1607.04606
Peters, M.E., et al.: Deep contextualized word representations. arXiv preprint arXiv:1802.05365 (2018)
Chelba, C., et al.: One billion word benchmark for measuring progress in statistical language modeling. arXiv preprint arXiv:1312.3005 (2013)
Vaswani, A., et al.: Attention is all you need. CoRR abs/1706.03762 (2017). http://arxiv.org/abs/1706.03762
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. CoRR abs/1412.6980 (2014). http://arxiv.org/abs/1412.6980
Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)
Varma, S., Simon, R.: Bias in error estimation when using cross-validation for model selection. BMC Bioinform. 7(1), 91 (2006)
Sigurd, B., Eeg-Olofsson, M., Van De Weijer, J.: Word length, sentence length and frequency - Zipf revisited. Studia Linguistica 58(1), 37–52 (2004). https://doi.org/10.1111/j.0039-3193.2004.00109.x
Ulyanov, D., Vedaldi, A., Lempitsky, V.: Deep image prior. In: Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (2018)
Acknowledgments
The authors would like to acknowledge Artem Shelmanov and Ilya Sochenkov for sharing their expertise in natural language processing, mentorship and support.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Liventsev, V., Fedulova, I., Dylov, D. (2019). Deep Text Prior: Weakly Supervised Learning for Assertion Classification. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions. ICANN 2019. Lecture Notes in Computer Science(), vol 11731. Springer, Cham. https://doi.org/10.1007/978-3-030-30493-5_26
Download citation
DOI: https://doi.org/10.1007/978-3-030-30493-5_26
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30492-8
Online ISBN: 978-3-030-30493-5
eBook Packages: Computer ScienceComputer Science (R0)