Multi-document Summarization Using Informative Words and Its Evaluation with a QA System

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2002)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2276))

Abstract

To reduce both the text size and the information loss during summarization, a multi-document summarization system using informative words is proposed. This paper describes the procedure for extracting informative words from multiple documents and generating summaries. First, a small-scale experiment with 12 events and 60 questions was conducted. The results were evaluated by human assessors and by a question answering (QA) system, respectively; using a QA system helps avoid the drawbacks of human assessment. Both evaluations show good performance for informative words, which encouraged a large-scale evaluation. A further experiment was then conducted with a total of 140 questions over 17,877 documents, among which 3,146 events were identified. The experimental results also show that the models using informative words outperform a pure heuristic, voting-only strategy when relative precision rate is used as the metric.

The MUs that were reported by more than reporters were selected.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

June-Jei, K., Hung-Chia, W., Chuan-Jie, L., Hsin-Hsi, C. (2002). Multi-document Summarization Using Informative Words and Its Evaluation with a QA System. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2002. Lecture Notes in Computer Science, vol 2276. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45715-1_41

  • DOI: https://doi.org/10.1007/3-540-45715-1_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43219-7

  • Online ISBN: 978-3-540-45715-2

  • eBook Packages: Springer Book Archive
