Automatic Multi-document Summarization Based on New Sentence Similarity Measures

Yin, Wenpeng; Pei, Yulong; Huang, Lian’en

doi:10.1007/978-3-642-32695-0_81

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7458)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence


Abstract

Computing sentence similarity has become a crucial step in graph-based multi-document summarization algorithms, which have been studied intensively over the past decade. Previous algorithms generally considered sentence-level structural information and semantic similarity separately and therefore could not capture similarity information comprehensively. In this paper, we present a general framework that shows how to combine these two factors to derive a corpus-oriented and more discriminative sentence similarity measure. Experimental results on the DUC2004 dataset demonstrate that our approaches improve multi-document summarization performance considerably.
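
The general idea in the abstract, a single sentence similarity that mixes a structural signal with a semantic one and then feeds a graph-based ranker, can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify their measures, so everything below is an assumption made for illustration. The structural signal is a longest-common-subsequence ratio, the "semantic" signal is a plain bag-of-words cosine used as a crude stand-in, the mixing weight `alpha` is hypothetical, and the ranker is a simple PageRank-style iteration over the weighted sentence graph.

```python
from collections import Counter
from math import sqrt


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def structural_sim(a, b):
    """Word-order-sensitive similarity: LCS length over the longer sentence length."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def lexical_sim(a, b):
    """Bag-of-words cosine; an assumed stand-in for a real semantic measure."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def combined_sim(a, b, alpha=0.5):
    """Linear mix of the two signals; alpha is a hypothetical tuning weight."""
    return alpha * structural_sim(a, b) + (1 - alpha) * lexical_sim(a, b)


def rank_sentences(sentences, alpha=0.5, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration over the similarity graph."""
    tokens = [s.lower().split() for s in sentences]
    n = len(tokens)
    if n == 0:
        return []
    # Edge weights: combined similarity between distinct sentences, no self-loops.
    weights = [
        [combined_sim(tokens[i], tokens[j], alpha) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                scores[j] * weights[j][i] / out_sums[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            for i in range(n)
        ]
    return scores
```

Calling `rank_sentences` on a list of sentence strings returns one score per sentence; a summarizer built on this sketch would pick the highest-scoring sentences, typically with an additional redundancy filter that is not shown here.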



Author information

Authors and Affiliations

Shenzhen Key Lab for Cloud Computing Technology and Applications, Peking University Shenzhen Graduate School, Shenzhen, Guangdong, 518055, P.R. China
Wenpeng Yin, Yulong Pei & Lian’en Huang


Editor information

Editors and Affiliations

Faculty of Environment, Society and Design, Department of Applied Computing, Lincoln University, P.O. Box 84, 7647, Christchurch, New Zealand
Patricia Anthony
School of Information Science and Technology, University of Tokyo, 7-3-1, Hongo, 113-8656, Bunkyo-ku, Tokyo, Japan
Mitsuru Ishizuka
MIMOS Berhad, Knowledge Technology, Technology Park Malaysia, 57000, Kuala Lumpur, Malaysia
Dickson Lukose


About this paper

Cite this paper

Yin, W., Pei, Y., Huang, L. (2012). Automatic Multi-document Summarization Based on New Sentence Similarity Measures. In: Anthony, P., Ishizuka, M., Lukose, D. (eds) PRICAI 2012: Trends in Artificial Intelligence. PRICAI 2012. Lecture Notes in Computer Science, vol 7458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32695-0_81


DOI: https://doi.org/10.1007/978-3-642-32695-0_81
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32694-3
Online ISBN: 978-3-642-32695-0
eBook Packages: Computer Science; Computer Science (R0)
