Identifying Finding Sentences in Conclusion Subsections of Biomedical Abstracts

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11420)

Abstract

Segmenting scientific abstracts and full text by rhetorical function is an essential task in text classification. Small rhetorical segments can be useful for fine-grained literature search, summarization, and comparison. Current efforts have focused on segmenting documents into general sections such as introduction, method, and conclusion, and much less on the roles of individual sentences within those segments. For example, not all sentences in a conclusion section describe research findings. In this work, we developed rule-based and machine learning methods and compared their performance in identifying the finding sentences in conclusion subsections of biomedical abstracts. To develop and evaluate the methods, 1100 conclusion subsections from studies with observational and randomized clinical trial designs, covering five common health topics, were sampled from PubMed. The rule-based method and the bag-of-words-based machine learning method both achieved high accuracy. The better performance of the simple rule-based approach shows that although advanced machine learning approaches can capture the main patterns, human experts may still outperform them on such a specialized task.
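
The two kinds of method compared above can be illustrated with a minimal sketch. The abstract does not list the actual rules or features, so everything below is an assumption for illustration only: the cue patterns, the function names (`is_finding_rule_based`, `bag_of_words`, `accuracy`), and the four example sentences are hypothetical and are not taken from the paper or its corpus.

```python
import re
from collections import Counter

# Hypothetical cue patterns (assumption): the paper's real rules are not
# given in the abstract. These only show what a rule-based method can look like.
FINDING_CUES = [
    r"\b(was|were) associated with\b",
    r"\b(reduced|increased|improved|decreased)\b",
    r"\bour (results|findings|data) (suggest|show|indicate)\b",
]
NON_FINDING_CUES = [
    r"\b(further|future) (research|studies|trials)\b",
    r"\b(is|are) (needed|warranted|required)\b",
    r"\bshould be\b",
]


def is_finding_rule_based(sentence: str) -> bool:
    """Label a sentence as a finding if a finding cue matches and no
    non-finding cue (recommendation, call for research) matches."""
    text = sentence.lower()
    if any(re.search(pattern, text) for pattern in NON_FINDING_CUES):
        return False
    return any(re.search(pattern, text) for pattern in FINDING_CUES)


def bag_of_words(sentence: str) -> Counter:
    """Turn a sentence into unordered word counts, the kind of feature
    representation a bag-of-words classifier would be trained on."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def accuracy(predictions, gold) -> float:
    """Fraction of predictions that agree with the gold labels."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Made-up sentences with made-up labels (True = finding sentence).
EXAMPLES = [
    ("Statin use was associated with a lower risk of stroke.", True),
    ("Further research is needed to confirm these results.", False),
    ("The intervention reduced hospital readmissions.", True),
    ("Clinicians should be aware of this risk.", False),
]

predictions = [is_finding_rule_based(text) for text, _ in EXAMPLES]
gold = [label for _, label in EXAMPLES]
score = accuracy(predictions, gold)
```

The toy score says nothing about real performance, since the rules were written with these four sentences in mind. In a machine learning setup, the counts from `bag_of_words` would be fed to a trained classifier instead of hand-written patterns.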

Acknowledgement

We would like to thank Shiqi Qu, who contributed to the inter-coder agreement checking and corpus construction.

Author information

Corresponding author

Correspondence to Bei Yu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, Y., Yu, B. (2019). Identifying Finding Sentences in Conclusion Subsections of Biomedical Abstracts. In: Taylor, N., Christian-Lamb, C., Martin, M., Nardi, B. (eds) Information in Contemporary Society. iConference 2019. Lecture Notes in Computer Science, vol 11420. Springer, Cham. https://doi.org/10.1007/978-3-030-15742-5_64

  • DOI: https://doi.org/10.1007/978-3-030-15742-5_64

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-15741-8

  • Online ISBN: 978-3-030-15742-5

  • eBook Packages: Computer Science, Computer Science (R0)
