Multiple Documents Summarization Based on Genetic Algorithm

  • Conference paper
Fuzzy Systems and Knowledge Discovery (FSKD 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4223)

Abstract

With the increasing volume of online information, it is increasingly important to automatically extract the core content from large numbers of information sources. We propose a model for multiple-document summarization that maximizes topic coverage and minimizes content redundancy. Based on a Chinese concept lexicon and corpus, the proposed model analyzes the topic of each document, the relationships among the documents, and the central theme of the collection in order to evaluate sentences. We present different approaches for determining which sentences are appropriate for extraction on the basis of sentence weights and their relevance to the related documents. A genetic algorithm is designed to improve the quality of the summarization. The experimental results indicate that using a genetic algorithm is useful and effective for improving the quality of multiple-document summarization.
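The abstract does not specify the chromosome encoding, the fitness function, or the genetic operators, so the following is only a minimal sketch of how a genetic algorithm could trade topic coverage against redundancy when selecting sentences. Every name and parameter here (`fitness`, `evolve`, `penalty`, `max_len`, the Jaccard overlap measure, tournament selection, one-point crossover) is an assumption made for illustration, not the authors' method; in particular, plain word sets stand in for the concept-lexicon analysis the paper relies on.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Word-overlap similarity between two token sets (assumed redundancy measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fitness(mask, sentences, topics, max_len, penalty=0.5):
    """Topic coverage minus a redundancy penalty (hypothetical scoring).

    Empty or over-length selections score 0.0.
    """
    chosen = [s for s, bit in zip(sentences, mask) if bit]
    if not chosen or len(chosen) > max_len:
        return 0.0
    covered = set().union(*chosen) & topics
    coverage = len(covered) / len(topics)
    pairs = list(combinations(chosen, 2))
    redundancy = sum(jaccard(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return max(coverage - penalty * redundancy, 0.0)


def evolve(sentences, topics, max_len, pop_size=30, generations=60,
           mutation_rate=0.05, seed=0):
    """Return the indices of the sentences in the best selection found."""
    rng = random.Random(seed)
    n = len(sentences)

    def score(mask):
        return fitness(mask, sentences, topics, max_len)

    def pick(population):
        # Tournament selection of size 2.
        a, b = rng.sample(population, 2)
        return a if score(a) >= score(b) else b

    # Each individual is a bit mask: bit i == 1 means sentence i is selected.
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(population, key=score)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best individual forward
        while len(children) < pop_size:
            p1, p2 = pick(population), pick(population)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        population = children
        best = max(population, key=score)
    return [i for i, bit in enumerate(best) if bit]
```

In this sketch each sentence is a set of tokens and `topics` is a set of terms the summary should cover, so `evolve(sentences, topics, max_len=3)` returns the indices of the selected sentences. Two near-duplicate sentences selected together lower the score through the overlap penalty, which is one simple way to express the "minimize redundancy" goal stated above.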

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liu, D., Wang, Y., Liu, C., Wang, Z. (2006). Multiple Documents Summarization Based on Genetic Algorithm. In: Wang, L., Jiao, L., Shi, G., Li, X., Liu, J. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2006. Lecture Notes in Computer Science, vol 4223. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11881599_40

  • DOI: https://doi.org/10.1007/11881599_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45916-3

  • Online ISBN: 978-3-540-45917-0

  • eBook Packages: Computer Science, Computer Science (R0)
