An Efficient Statistical Approach for Automatic Organic Chemistry Summarization

  • Conference paper
Advances in Natural Language Processing (GoTAL 2008)

Abstract

In this paper, we propose an efficient strategy for summarizing scientific documents in Organic Chemistry that concentrates on numerical treatments. We present its implementation, named yachs (Yet Another Chemistry Summarizer), which combines a specific document pre-processing step with a sentence scoring method that relies on the statistical properties of documents. We show that yachs achieves the best results among several summarizers on a corpus of Organic Chemistry articles.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Boudin, F., Torres-Moreno, JM., Velázquez-Morales, P. (2008). An Efficient Statistical Approach for Automatic Organic Chemistry Summarization. In: Nordström, B., Ranta, A. (eds) Advances in Natural Language Processing. GoTAL 2008. Lecture Notes in Computer Science, vol 5221. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85287-2_9

  • DOI: https://doi.org/10.1007/978-3-540-85287-2_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85286-5

  • Online ISBN: 978-3-540-85287-2

  • eBook Packages: Computer Science, Computer Science (R0)
