Chinese Text Summarization Using a Trainable Summarizer and Latent Semantic Analysis

  • Conference paper
Digital Libraries: People, Knowledge, and Technology (ICADL 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2555)

Abstract

In this paper, two novel approaches are proposed for extracting important sentences from a document to create its summary. The first is a corpus-based approach using feature analysis. It introduces three ideas: 1) employing ranked position to emphasize the significance of sentence position, 2) reshaping word units to measure keyword importance more accurately, and 3) training a score function with a genetic algorithm to obtain a suitable combination of feature weights. The second approach combines latent semantic analysis with text relationship maps to interpret the conceptual structure of a document. Both approaches are applied to Chinese text summarization. They were evaluated on a corpus of 100 political articles from New Taiwan Weekly; at a compression ratio of 30%, they achieved average recalls of 52.0% and 45.6%, respectively.
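As a rough illustration of the first approach, the sketch below scores each sentence as a weighted sum of feature values, keeps the top-scoring sentences up to a given compression ratio, and measures recall against a reference extract. It is a minimal sketch under stated assumptions, not the authors' implementation: the two features (position, keyword), their values, the equal weights, and the function names are all hypothetical, the genetic-algorithm training of the weights is not shown, and both the reading of compression ratio as a fraction of sentences and the recall definition are assumed rather than taken from the paper.

```python
def select_sentences(features, weights, compression_ratio=0.3):
    """Return indices of the top-scoring sentences, in document order.

    features: one tuple of feature values per sentence (hypothetical features).
    weights:  one weight per feature (in the abstract these are tuned by a
              genetic algorithm; here they are simply given).
    """
    # Score each sentence as a weighted sum of its feature values.
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in features]
    # Assumption: compression ratio = fraction of sentences kept.
    k = max(1, round(len(features) * compression_ratio))
    ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def recall(selected, reference):
    """Fraction of reference-summary sentences present in the extract."""
    reference = set(reference)
    return len(reference & set(selected)) / len(reference)


# Toy document of 10 sentences: (position score, keyword score), made up.
features = [
    (1.0, 0.2), (0.9, 0.8), (0.8, 0.1), (0.7, 0.3), (0.6, 0.9),
    (0.5, 0.2), (0.4, 0.4), (0.3, 0.1), (0.2, 0.7), (0.1, 0.5),
]
weights = (0.5, 0.5)   # hypothetical, untrained weights
reference = [1, 4, 8]  # hypothetical human-chosen summary sentences

selected = select_sentences(features, weights, compression_ratio=0.3)
print(selected, recall(selected, reference))
```

With these toy values the extract contains sentences 0, 1, and 4, two of which appear in the reference extract. A trained summarizer would replace the fixed weights with ones found by searching for the combination that maximizes average recall on a training corpus.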



Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yeh, JY., Ke, HR., Yang, WP. (2002). Chinese Text Summarization Using a Trainable Summarizer and Latent Semantic Analysis. In: Lim, E.P., et al. Digital Libraries: People, Knowledge, and Technology. ICADL 2002. Lecture Notes in Computer Science, vol 2555. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36227-4_8

  • DOI: https://doi.org/10.1007/3-540-36227-4_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00261-1

  • Online ISBN: 978-3-540-36227-2

  • eBook Packages: Springer Book Archive
