Topic Decomposition and Summarization

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6118)

Abstract

In this paper, we study topic decomposition and summarization for a temporally sequenced text corpus on a specific topic. The task is to discover the different aspects of the topic (i.e., sub-topics) and the incidents related to each sub-topic, and to generate summaries for them. We present a solution with the following steps: (1) deriving sub-topics by applying Non-negative Matrix Factorization (NMF) to the terms-by-sentences matrix of the text corpus; (2) detecting the incidents of each sub-topic and generating summaries for both the sub-topic and its incidents by examining the composition of its encoding vector generated by NMF; (3) ranking each sentence based on the encoding matrix and selecting the top-ranked sentences of each sub-topic as the summary of the text corpus. Experimental results show that the proposed topic decomposition method can effectively detect various aspects of the original documents. In addition, the topic summarization method achieves better results than some well-studied methods.
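
As a rough illustration of steps (1) and (3), the sketch below builds a terms-by-sentences matrix, factors it with NMF, and picks the highest-weighted sentence for each sub-topic. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify term weighting, the NMF solver, or the ranking formula, so raw term counts, the standard multiplicative update rules for the Frobenius loss, and a simple sort on each row of the encoding matrix are used here. The function names, the sample sentences, and the choice of `k = 2` are all hypothetical. Incident detection (step 2) is not covered.

```python
import re
import numpy as np


def terms_by_sentences(sentences):
    """Build a raw term-count matrix A with one row per term, one column per sentence."""
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for t in toks:
            A[index[t], j] += 1.0
    return A, vocab


def nmf(A, k, iters=200, seed=0, eps=1e-9):
    """Approximate A by W @ H with non-negative W (terms x k) and H (k x sentences).

    Uses multiplicative updates for the Frobenius loss (an assumption;
    the abstract does not say which solver the paper uses).
    """
    rng = np.random.default_rng(seed)
    W = rng.random((A.shape[0], k)) + eps
    H = rng.random((k, A.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


def top_sentences(H, per_topic=1):
    """For each sub-topic (row of H), return the indices of its highest-weighted sentences."""
    return [[int(i) for i in np.argsort(-row)[:per_topic]] for row in H]


# Hypothetical toy corpus: two loose sub-topics (rescue work, economic impact).
sentences = [
    "Rescue teams reached the flooded villages on Monday.",
    "More rescue teams and boats arrived in the villages.",
    "The flood caused heavy economic losses for local farmers.",
    "Economic losses for farmers are expected to rise.",
]
k = 2  # assumed number of sub-topics
A, vocab = terms_by_sentences(sentences)
W, H = nmf(A, k)
summary = [sentences[i] for topic in top_sentences(H) for i in topic]
```

Each column of `W` can be read as a sub-topic expressed over terms, and each row of `H` gives that sub-topic's weight in every sentence, which is what the ranking step sorts on.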

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, W., Wang, C., Chen, C., Zhang, L., Bu, J. (2010). Topic Decomposition and Summarization. In: Zaki, M.J., Yu, J.X., Ravindran, B., Pudi, V. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2010. Lecture Notes in Computer Science(), vol 6118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13657-3_47

  • DOI: https://doi.org/10.1007/978-3-642-13657-3_47

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13656-6

  • Online ISBN: 978-3-642-13657-3

  • eBook Packages: Computer Science, Computer Science (R0)
