Abstract
As the volume of online information grows, automatically extracting the core content from large numbers of information sources becomes increasingly important. We propose a model for multi-document summarization that maximizes topic coverage and minimizes content redundancy. Drawing on a Chinese concept lexicon and corpus, the model analyzes the topic of each document, the relationships among the documents, and the central theme of the collection in order to evaluate sentences. We present several approaches for deciding which sentences to extract, based on each sentence's weight and its relevance to the related documents. A genetic algorithm is designed to improve the quality of the summary. Experimental results indicate that the genetic algorithm is useful and effective for improving the quality of multi-document summarization.
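The abstract does not specify the chromosome encoding, the fitness function, or the genetic operators, so the following is only a minimal sketch of how a genetic algorithm might trade coverage against redundancy when choosing sentences. Everything in it is an assumption made for illustration: the names `fitness` and `genetic_select`, the precomputed `weights` list and `sim` similarity matrix, the fixed summary length `k`, and the choice of truncation selection, union crossover, and single-swap mutation.

```python
import random


def fitness(selected, weights, sim, redundancy_penalty=1.0):
    """Score a candidate summary: total sentence weight minus pairwise similarity.

    Hypothetical objective; the paper's actual scoring is not given in the abstract.
    """
    coverage = sum(weights[i] for i in selected)
    redundancy = sum(
        sim[i][j] for pos, i in enumerate(selected) for j in selected[pos + 1:]
    )
    return coverage - redundancy_penalty * redundancy


def genetic_select(weights, sim, k, pop_size=30, generations=60,
                   mutation_rate=0.2, seed=0):
    """Search for k sentence indices with high fitness (illustrative GA)."""
    rng = random.Random(seed)
    n = len(weights)

    def score(individual):
        return fitness(individual, weights, sim)

    # Each individual is a sorted tuple of k distinct sentence indices.
    population = [tuple(sorted(rng.sample(range(n), k))) for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the better half unchanged (elitism).
        population.sort(key=score, reverse=True)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            # Crossover: draw k sentences from the union of both parents.
            pool = list(set(parent_a) | set(parent_b))
            child = set(rng.sample(pool, k))
            # Mutation: occasionally swap one chosen sentence for an unused one.
            if rng.random() < mutation_rate:
                unused = [i for i in range(n) if i not in child]
                if unused:
                    child.remove(rng.choice(sorted(child)))
                    child.add(rng.choice(unused))
            children.append(tuple(sorted(child)))
        population = survivors + children
    return max(population, key=score)
```

Calling `genetic_select(weights, sim, k=2)` returns a sorted tuple of sentence indices. In this sketch the sentence weights and similarities are plain inputs; in the paper they would come from the lexicon- and corpus-based topic analysis, which is not reproduced here.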
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, D., Wang, Y., Liu, C., Wang, Z. (2006). Multiple Documents Summarization Based on Genetic Algorithm. In: Wang, L., Jiao, L., Shi, G., Li, X., Liu, J. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2006. Lecture Notes in Computer Science, vol 4223. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11881599_40
DOI: https://doi.org/10.1007/11881599_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45916-3
Online ISBN: 978-3-540-45917-0
eBook Packages: Computer Science, Computer Science (R0)