Query-Focused Summarization by Combining Topic Model and Affinity Propagation

Chen, Dewei; Tang, Jie; Yao, Limin; Li, Juanzi; Zhou, Lizhu

doi:10.1007/978-3-642-00672-2_17


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5446)


Abstract

The goal of query-focused summarization is to extract a summary for a given query from a document collection. Although much work has been done on this problem, challenging issues remain: (1) the length of the summary must be predefined, for example by the number of word tokens or the number of sentences; (2) a query usually asks for information from several perspectives (topics), but existing methods cannot capture the topical aspects with respect to the query. In this paper, we propose a novel approach that combines a statistical topic model with affinity propagation. Specifically, the topic model, called qLDA, simultaneously models the documents and the query. Moreover, affinity propagation automatically discovers key sentences from the document collection without a predefined summary length. Experimental results on the DUC05 and DUC06 data sets show that our approach is effective and that it outperforms the baseline methods.
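
To illustrate how affinity propagation can pick key sentences without a preset summary length, here is a minimal Python sketch of the message-passing clustering of Frey and Dueck. It is not the method of this paper: the abstract does not give the qLDA equations or the sentence similarity it uses, so the sketch assumes a plain bag-of-words cosine similarity, a median-similarity preference, and hypothetical example sentences. The function names (`cosine`, `similarity_matrix`, `affinity_propagation`) are illustrative only.

```python
import math
import statistics
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def similarity_matrix(sentences):
    """Pairwise similarities; the diagonal holds the shared 'preference'."""
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    S = [[cosine(bags[i], bags[j]) for j in range(n)] for i in range(n)]
    # Assumption: use the median off-diagonal similarity as the preference.
    # Lower preferences tend to yield fewer exemplars.
    pref = statistics.median(S[i][j] for i in range(n) for j in range(n) if i != j)
    for k in range(n):
        S[k][k] = pref
    return S


def affinity_propagation(S, damping=0.5, iters=200):
    """Return indices of exemplars for similarity matrix S (needs n >= 2)."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iters):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=lambda k: vals[k])
            first = vals[best]
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                new = S[i][k] - (second if k == best else first)
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k) for i' not in {i,k})
        # a(k,k) = sum of positive r(i',k) for i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # A point is an exemplar when its self-responsibility plus
    # self-availability is positive.
    return [k for k in range(n) if R[k][k] + A[k][k] > 0]


# Hypothetical sentences, for illustration only.
sentences = [
    "topic models describe documents as mixtures of topics",
    "a topic model treats each document as a mixture of topics",
    "affinity propagation passes messages between data points",
    "message passing between points finds exemplars automatically",
    "summaries are evaluated with n-gram overlap scores",
    "n-gram overlap is a common score for evaluating summaries",
]
exemplars = affinity_propagation(similarity_matrix(sentences))
summary = [sentences[k] for k in exemplars]
```

The number of sentences in `summary` is not passed in anywhere; it falls out of the similarities and the preference value. Exactly tied similarities can make the messages oscillate, so practical implementations usually add a small amount of noise to `S` and check for convergence rather than running a fixed number of iterations.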

The work is supported by NSFC (60703059), Chinese National Key Foundation Research and Development Plan (2007CB310803), and Chinese Young Faculty Funding (20070003093).



Author information

Authors and Affiliations

Department of Computer Science and Technology, Tsinghua University, China
Dewei Chen, Jie Tang, Juanzi Li & Lizhu Zhou
Department of Computer Science, University of Massachusetts Amherst, USA
Limin Yao


Editor information

Editors and Affiliations

Department of Computer Science, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong, China
Qing Li
Department of Computer Science & Technology, Tsinghua University, Beijing, China
Ling Feng
School of Computing Science, Simon Fraser University, 8888 University Drive, V5A 1S6, Burnaby BC, Canada
Jian Pei
Department of Computer Science, University of Vermont, VT 05405, Burlington, USA
Sean X. Wang
School of Information Technology and Electrical Engineering, The University of Queensland, QLD 4072, Brisbane, Australia
Xiaofang Zhou
Jiangsu Provincial Key Lab of Computer Information Processing Technology, School of Computer Science & Technology, Soochow University, 1 Shizi Street, Suzhou, 215006, Jiangsu, China
Qiao-Ming Zhu


Cite this paper

Chen, D., Tang, J., Yao, L., Li, J., Zhou, L. (2009). Query-Focused Summarization by Combining Topic Model and Affinity Propagation. In: Li, Q., Feng, L., Pei, J., Wang, S.X., Zhou, X., Zhu, QM. (eds) Advances in Data and Web Management. APWeb/WAIM 2009. Lecture Notes in Computer Science, vol 5446. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00672-2_17


DOI: https://doi.org/10.1007/978-3-642-00672-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00671-5
Online ISBN: 978-3-642-00672-2
eBook Packages: Computer Science, Computer Science (R0)
