On the Contribution of Specific Entity Detection in Comparative Constructions to Automatic Spin Detection in Biomedical Scientific Publications

Koroleva, Anna; Paroubek, Patrick

doi:10.1007/978-3-030-66527-2_22

Conference paper
First Online: 31 December 2020

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12598)

Abstract

In this article, we address the problem of providing automated aid for the detection of misrepresentation (“spin”) of research results in scientific publications from the biomedical domain. Our goal is to automatically identify inadequate claims in medical articles, i.e. claims that present the beneficial effect of the experimental treatment as greater than is actually proven by the research results. To this end, we propose a Natural Language Processing (NLP) approach. We first review related work and give an NLP analysis of the problem; we then present our first results, obtained on articles that report the results of Randomized Controlled Trials (RCTs), i.e. clinical trials comparing two or more interventions by randomly assigning them to patients. Our first experiments concern the identification of entities specific to RCTs (outcomes and patient groups), performed with basic methods (local grammars) on a corpus extracted from the PubMed open archive. We explore the possibility of extracting outcomes from comparative constructions, which are commonly used to report the results of clinical trials. Our second set of experiments consists of extracting outcomes from a manually annotated corpus using deep learning methods.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 676207.

At the time of the reported work, Anna Koroleva was a PhD student at LIMSI-CNRS in Orsay, France and at the Academic Medical Center, University of Amsterdam in Amsterdam, the Netherlands.

Notes

1.
The term finds its origins in the term “spin doctors”, communication agents of public personalities who are particularly deft at improving the image of their clients.
2.
Cochrane is an independent international network of researchers, health professionals and patients whose aim is to improve decision making in health care (http://www.cochrane.org).
3.
A systematic review is a type of scientific article that aims at an exhaustive summary of the literature on a particular problem, with statistical evaluation of the results.
4.
UMLS (Unified Medical Language System) is a compendium of several medical controlled vocabularies, https://www.nlm.nih.gov/research/um.
5.
https://metamap.nlm.nih.gov/.
6.
https://www.ncbi.nlm.nih.gov/pmc/.
7.
https://www.ncbi.nlm.nih.gov/pmc/.
8.
https://github.com/google-research/bert.

Author information

Authors and Affiliations

Institute of Applied Simulation, School of Life Sciences and Facility Management, Zurich University of Applied Sciences (ZHAW), 8820, Waedenswil, Switzerland
Anna Koroleva
Swiss Institute of Bioinformatics (SIB), 1015, Lausanne, Switzerland
Anna Koroleva
LIMSI, CNRS, Université Paris-Saclay, 91405, Orsay, France
Patrick Paroubek

Authors

Anna Koroleva
Patrick Paroubek

Corresponding author

Correspondence to Anna Koroleva.

Editor information

Editors and Affiliations

Adam Mickiewicz University, Poznań, Poland
Zygmunt Vetulani
Laboratoire d’Informatique pour la Méca, Orsay, France
Patrick Paroubek
Adam Mickiewicz University, Poznań, Poland
Marek Kubis

About this paper

Cite this paper

Koroleva, A., Paroubek, P. (2020). On the Contribution of Specific Entity Detection in Comparative Constructions to Automatic Spin Detection in Biomedical Scientific Publications. In: Vetulani, Z., Paroubek, P., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2017. Lecture Notes in Computer Science, vol 12598. Springer, Cham. https://doi.org/10.1007/978-3-030-66527-2_22

DOI: https://doi.org/10.1007/978-3-030-66527-2_22
Published: 31 December 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66526-5
Online ISBN: 978-3-030-66527-2
eBook Packages: Computer Science, Computer Science (R0)
