Multiple Documents Summarization Based on Genetic Algorithm

Liu, Derong; Wang, Yongcheng; Liu, Chuanhan; Wang, Zhiqi

doi:10.1007/11881599_40

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4223)

Included in the following conference series:

International Conference on Fuzzy Systems and Knowledge Discovery

Abstract

As the volume of online information grows, automatically extracting the core content from many information sources becomes increasingly important. We propose a model for multi-document summarization that maximizes topic coverage and minimizes content redundancy. Based on a Chinese concept lexicon and corpus, the model analyzes the topic of each document, the relationships among the documents, and the central theme of the collection in order to evaluate sentences. We present different approaches for deciding which sentences to extract, based on sentence weight and relevance to the related documents. A genetic algorithm is designed to improve the quality of the summary. Experimental results indicate that the genetic algorithm is useful and effective for improving the quality of multi-document summarization.
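The abstract does not specify the chromosome encoding, the fitness function, or the genetic operators, so the following is only a minimal sketch of how a genetic algorithm could trade coverage against redundancy when selecting sentences. Everything in it is an assumption made for illustration: the bit-mask encoding, the fitness (sum of sentence weights minus pairwise similarity), the tournament selection, the one-point crossover, and the names `fitness`, `ga_select`, `weights`, and `sim`. It should not be read as the authors' method.

```python
import random


def fitness(mask, weights, sim, max_sentences, redundancy_penalty=1.0):
    """Hypothetical fitness: total sentence weight minus pairwise redundancy."""
    chosen = [i for i, bit in enumerate(mask) if bit]
    if not chosen or len(chosen) > max_sentences:
        return float("-inf")  # empty or over-length summaries are infeasible
    coverage = sum(weights[i] for i in chosen)
    redundancy = sum(
        sim[i][j] for pos, i in enumerate(chosen) for j in chosen[pos + 1:]
    )
    return coverage - redundancy_penalty * redundancy


def ga_select(weights, sim, max_sentences, pop_size=30, generations=60,
              mutation_rate=0.05, seed=0):
    """Return sorted indices of selected sentences (assumes max_sentences >= 1)."""
    rng = random.Random(seed)
    n = len(weights)

    def score(mask):
        return fitness(mask, weights, sim, max_sentences)

    def random_mask():
        picks = set(rng.sample(range(n), min(max_sentences, n)))
        return [1 if i in picks else 0 for i in range(n)]

    def tournament(population):
        a, b = rng.sample(population, 2)
        return a if score(a) >= score(b) else b

    population = [random_mask() for _ in range(pop_size)]
    best = max(population, key=score)
    for _ in range(generations):
        next_pop = [best[:]]  # elitism: the best summary so far always survives
        while len(next_pop) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            cut = rng.randrange(1, n) if n > 1 else 0  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # bit-flip mutation: toggle each sentence with small probability
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        population = next_pop
        best = max(population, key=score)
    return [i for i, bit in enumerate(best) if bit]


if __name__ == "__main__":
    # Toy data: sentences 0 and 1 are near-duplicates (similarity 0.9).
    weights = [0.9, 0.85, 0.8, 0.3]
    sim = [[1.0, 0.9, 0.1, 0.1],
           [0.9, 1.0, 0.1, 0.1],
           [0.1, 0.1, 1.0, 0.1],
           [0.1, 0.1, 0.1, 1.0]]
    print(ga_select(weights, sim, max_sentences=2))
```

In this toy setup the two highest-weight sentences are near-duplicates, so a selection that simply takes the top weights scores worse under this fitness than one that pairs a high-weight sentence with a dissimilar one. In practice the weights and similarities would come from the topic and relevance analysis the abstract describes.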

Author information

Authors and Affiliations

Dept. of Comp. Sci. and Engineering, Shanghai Jiao Tong University
Derong Liu, Yongcheng Wang, Chuanhan Liu & Zhiqi Wang
Merchant Marine College, Shanghai Maritime University
Derong Liu

Editor information

Editors and Affiliations

School of Electrical and Electronic Engineering, Nanyang Technological University, Block S1, Nanyang Avenue, 639798, Singapore
Lipo Wang
Life Science Research Center, School of Electronic Engineering, Xidian University, 710071, Xi’an, Shaanxi, China
Licheng Jiao
School of Electrical and Electronic Engineering, Xidian University, 710071, Xi’an, China
Guanming Shi
School of Information Technology and Electrical Engineering, The University of Queensland, 4072, Brisbane, Queensland, Australia
Xue Li
College of Mathematics and Information Science, Hebei Normal University, 050016, Shijiazhuang, Hebei, P.R. China
Jing Liu

About this paper

Cite this paper

Liu, D., Wang, Y., Liu, C., Wang, Z. (2006). Multiple Documents Summarization Based on Genetic Algorithm. In: Wang, L., Jiao, L., Shi, G., Li, X., Liu, J. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2006. Lecture Notes in Computer Science, vol 4223. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11881599_40

DOI: https://doi.org/10.1007/11881599_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45916-3
Online ISBN: 978-3-540-45917-0
eBook Packages: Computer Science, Computer Science (R0)
