Section Identification to Improve Information Extraction from Chinese Medical Literature

  • Conference paper
Smart Health (ICSH 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10983)

Abstract

The Chinese medical literature contains a large amount of knowledge. Reducing the effort medical scholars need to extract this knowledge requires literature analysis that identifies the key information in each paper. We argue that identifying the sections of a paper helps filter out noise and increases the accuracy of extracting experimental findings. In this research in progress, we treat paper section identification as a sentence classification task and apply Conditional Random Fields (CRFs) to it. Our model combines lexical and structural features to facilitate section identification. Experiments on a human-curated asthma dataset show that our approach achieves a 10%–20% performance improvement over Support Vector Machines (SVMs), and that both bag-of-words features and domain lexicons benefit the task.
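
To make the idea of combining lexical and structural features concrete, the sketch below shows one way a paper could be turned into a sequence of per-sentence feature dictionaries, the kind of input a linear-chain CRF toolkit typically consumes. It is an illustration only, not the authors' implementation: the lexicon entries, the feature names (`bow=`, `lex=`, `rel_pos`, `is_first`, `is_last`, `length`), the helper functions, and the toy sentences are all assumptions, and the text is assumed to be word-segmented already.

```python
from typing import Dict, List

# Hypothetical domain lexicon mapping cue terms to a category.
# The lexicons used in the paper are not described in the abstract.
DOMAIN_LEXICON = {
    "哮喘": "disease",    # asthma
    "患者": "subject",    # patient
    "治疗": "treatment",  # treatment
    "显著": "finding",    # significant
}


def sentence_features(doc: List[List[str]], i: int) -> Dict[str, float]:
    """Build a feature dict for the i-th sentence of a pre-segmented paper."""
    tokens = doc[i]
    feats: Dict[str, float] = {"bias": 1.0}

    # Lexical features: bag-of-words indicators, one per distinct token.
    for tok in set(tokens):
        feats[f"bow={tok}"] = 1.0

    # Lexical features: counts of domain-lexicon categories in the sentence.
    for tok in tokens:
        category = DOMAIN_LEXICON.get(tok)
        if category is not None:
            key = f"lex={category}"
            feats[key] = feats.get(key, 0.0) + 1.0

    # Structural features: where the sentence sits within the paper.
    feats["rel_pos"] = i / max(len(doc) - 1, 1)
    feats["is_first"] = 1.0 if i == 0 else 0.0
    feats["is_last"] = 1.0 if i == len(doc) - 1 else 0.0
    feats["length"] = float(len(tokens))
    return feats


def paper_to_sequence(doc: List[List[str]]) -> List[Dict[str, float]]:
    """Convert a whole paper into one feature sequence, one dict per sentence."""
    return [sentence_features(doc, i) for i in range(len(doc))]


# Toy three-sentence "paper" (made up for illustration).
paper = [
    ["哮喘", "是", "常见", "疾病"],
    ["选取", "患者", "60", "例"],
    ["治疗", "后", "症状", "显著", "改善"],
]
sequence = paper_to_sequence(paper)
```

Each paper becomes one sequence, so a sequence model can learn that section labels tend to follow one another in order, whereas a per-sentence classifier such as an SVM scores every sentence independently. Lists of feature dictionaries in this shape are, as far as we know, accepted by CRF toolkits such as sklearn-crfsuite, paired with one section label per sentence.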

Acknowledgements

This research is partially supported by the Digital Innovation Lab at City University of Hong Kong, Guangdong Science and Technology Project 2014A020221090, and the City University of Hong Kong Shenzhen Research Institute.

Author information

Corresponding author

Correspondence to Sijia Zhou.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhou, S., Li, X. (2018). Section Identification to Improve Information Extraction from Chinese Medical Literature. In: Chen, H., Fang, Q., Zeng, D., Wu, J. (eds) Smart Health. ICSH 2018. Lecture Notes in Computer Science, vol 10983. Springer, Cham. https://doi.org/10.1007/978-3-030-03649-2_34

  • DOI: https://doi.org/10.1007/978-3-030-03649-2_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03648-5

  • Online ISBN: 978-3-030-03649-2

  • eBook Packages: Computer Science (R0)
