Abstract
Automatic text summarization has become an increasingly important task because a huge amount of textual information on the Internet needs to be processed. The Genetic Algorithm (GA) is an efficient approach to extractive text summarization: it searches for the best summary by optimizing a fitness function over successive generations. This paper proposes a novel extractive summarization method using a GA with two types of individuals, distinguished by their internal chromosomal structure. Each individual has one or two full chromosomes, where a chromosome represents a candidate summary. In this type-based GA, good summaries are better preserved across generations, mutation is more likely to occur and uses more flexible strategies, and prominent summaries are more likely to be found in the solution space. Mutation can occur at two levels: offspring can be obtained by changing their parents’ type or by flipping several genes (i.e., multi-point mutation) in their parents’ chromosomes. The proposed approach was evaluated on the DUC2001, DUC2002 and CNN/DailyMail datasets, where it outperforms all other state-of-the-art extractive methods on all three Rouge scores. In particular, the Rouge-1 and Rouge-L scores improve considerably, by 10% to 20%, while the Rouge-2 score is also the highest.
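The two mutation levels described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the binary sentence-selection encoding, the names `Individual`, `flip_genes`, `change_type` and `mutate`, the parameter `p_type`, and the specific rule used to add or drop a chromosome when the type changes.

```python
import random
from dataclasses import dataclass
from typing import List

# Assumed encoding: one bit per sentence, 1 = sentence is in the summary.
Chromosome = List[int]


@dataclass
class Individual:
    # One or two full chromosomes; each one is a candidate summary.
    chromosomes: List[Chromosome]

    @property
    def kind(self) -> int:
        """Type of the individual: 1 or 2 chromosomes."""
        return len(self.chromosomes)


def flip_genes(chrom: Chromosome, n_points: int, rng: random.Random) -> Chromosome:
    """Gene-level (multi-point) mutation: flip n_points distinct genes."""
    child = list(chrom)
    for i in rng.sample(range(len(child)), n_points):
        child[i] = 1 - child[i]
    return child


def change_type(ind: Individual, rng: random.Random) -> Individual:
    """Type-level mutation (hypothetical rule): switch between 1 and 2 chromosomes."""
    if ind.kind == 2:
        # Assumed: keep one of the two chromosomes at random.
        return Individual([list(rng.choice(ind.chromosomes))])
    # Assumed: the new second chromosome is a one-gene variant of the first.
    first = ind.chromosomes[0]
    return Individual([list(first), flip_genes(first, 1, rng)])


def mutate(ind: Individual, p_type: float, n_points: int, rng: random.Random) -> Individual:
    """Apply type-level mutation with probability p_type, else gene-level mutation."""
    if rng.random() < p_type:
        return change_type(ind, rng)
    return Individual([flip_genes(c, n_points, rng) for c in ind.chromosomes])
```

Calling `mutate(Individual([[1, 0, 0, 1, 0]]), p_type=0.3, n_points=2, rng=random.Random(0))` returns either a two-chromosome offspring or a one-chromosome offspring that differs from its parent in exactly two genes. Selection, crossover and the fitness function are deliberately left out, since the abstract does not specify them.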
Notes
1. The formulas of these features are detailed in the work of Anh et al. [2].
2. The improvement score is the difference between the Rouge score of our proposed method and that of the compared method, divided by the Rouge score of the compared method.
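The improvement score in note 2 can be written out as a formula. This assumes the ratio is taken relative to the compared method's score; the symbols R_ours and R_cmp are introduced here for illustration, and the numbers in the worked line are hypothetical, not results from the paper.

```latex
\[
\text{Improvement} \;=\; \frac{R_{\text{ours}} - R_{\text{cmp}}}{R_{\text{cmp}}}
\]
% Hypothetical example: R_ours = 0.46 and R_cmp = 0.40 give
\[
\frac{0.46 - 0.40}{0.40} = 0.15 = 15\%.
\]
```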
References
Al-Abdallah, R.Z., Al-Taani, A.T.: Arabic single-document text summarization using particle swarm optimization algorithm. Procedia Comput. Sci. 117, 30–37 (2017)
Anh, B.T.M., My, N.T., Trang, N.T.T.: Enhanced genetic algorithm for single document extractive summarization. In: Proceedings of the Tenth International Symposium on Information and Communication Technology, pp. 370–376 (2019)
Ansótegui, C., Sellmann, M., Tierney, K.: A gender-based genetic algorithm for the automatic configuration of algorithms. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 142–157. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04244-7_14
García-Hernández, R.A., Ledeneva, Y.: Single extractive text summarization based on a genetic algorithm. In: Carrasco-Ochoa, J.A., Martínez-Trinidad, J.F., Rodríguez, J.S., di Baja, G.S. (eds.) MCPR 2013. LNCS, vol. 7914, pp. 374–383. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38989-4_38
Hahn, U., Mani, I.: The challenges of automatic summarization. Computer 33(11), 29–36 (2000)
Hermann, K.M., et al.: Teaching machines to read and comprehend. In: Advances in Neural Information Processing Systems, pp. 1693–1701 (2015)
Lin, C.Y.: Rouge: A package for automatic evaluation of summaries. In: Text Summarization Branches Out, pp. 74–81 (2004)
Meena, Y.K., Gopalani, D.: Evolutionary algorithms for extractive automatic text summarization. Procedia Comput. Sci. 48, 244–249 (2015)
Mendoza, M., Bonilla, S., Noguera, C., Cobos, C., León, E.: Extractive single-document summarization based on genetic operators and guided local search. Expert Syst. Appl. 41(9), 4158–4169 (2014)
Mihalcea, R., Tarau, P.: Textrank: bringing order into text. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, pp. 404–411 (2004)
Miller, G.F., Todd, P.M.: The role of mate choice in biocomputation: sexual selection as a process of search, optimization, and diversification. In: Banzhaf, W., Eeckman, F.H. (eds.) Evolution and Biocomputation. LNCS, vol. 899, pp. 169–204. Springer, Heidelberg (1995). https://doi.org/10.1007/3-540-59046-3_10
Nallapati, R., Zhai, F., Zhou, B.: Summarunner: A recurrent neural network based sequence model for extractive summarization of documents. In: Thirty-First AAAI Conference on Artificial Intelligence (2017)
Over, P., Liggett, W.: Introduction to DUC: an intrinsic evaluation of generic news text summarization systems. In: Proceedings of DUC (2002). http://www-nlpir.nist.gov/projects/duc/guidelines/2002.html
Sanchez-Gomez, J.M., Vega-Rodríguez, M.A., Perez, C.J.: A decomposition-based multi-objective optimization approach for extractive multi-document text summarization. Appl. Soft Comput. 91, 106231 (2020)
See, A., Liu, P.J., Manning, C.D.: Get to the point: summarization with pointer-generator networks. arXiv preprint arXiv:1704.04368 (2017)
Simón, J.R., Ledeneva, Y., García-Hernández, R.A.: Calculating the significance of automatic extractive text summarization using a genetic algorithm. J. Intell. Fuzzy Syst. 35(1), 293–304 (2018)
Sizov, R., Simovici, D.A.: Type-based genetic algorithms. In: Kotenko, I., Badica, C., Desnitsky, V., El Baz, D., Ivanovic, M. (eds.) IDC 2019. SCI, vol. 868, pp. 170–176. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-32258-8_19
Suanmali, L., Salim, N., Binwahlan, M.S.: Genetic algorithm based sentence extraction for text summarization. Int. J. Innov. Comput. 1(1) (2011)
Thede, S.M.: An introduction to genetic algorithms. J. Comput. Sci. Coll. 20(1), 115–123 (2004)
Vázquez, E., Arnulfo Garcia-Hernandez, R., Ledeneva, Y.: Sentence features relevance for extractive text summarization using genetic algorithms. J. Intell. Fuzzy Syst. 35(1), 353–365 (2018)
Wong, K.F., Wu, M., Li, W.: Extractive summarization using supervised and semi-supervised learning. In: Proceedings of the 22nd International Conference on Computational Linguistics, vol. 1, pp. 985–992. Association for Computational Linguistics (2008)
Yang, L., Cai, X., Zhang, Y., Shi, P.: Enhancing sentence-level clustering with ranking-based clustering framework for theme-based summarization. Inf. Sci. 260, 37–50 (2014)
Zhang, J., Zhao, Y., Saleh, M., Liu, P.: Pegasus: pre-training with extracted gap-sentences for abstractive summarization. In: International Conference on Machine Learning, pp. 11328–11339. PMLR (2020)
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Anh, B.T.M., Trang, N.T.T., Dinh, T.T. (2022). A Novel Type-Based Genetic Algorithm for Extractive Summarization. In: Fujita, H., Fournier-Viger, P., Ali, M., Wang, Y. (eds) Advances and Trends in Artificial Intelligence. Theory and Practices in Artificial Intelligence. IEA/AIE 2022. Lecture Notes in Computer Science, vol 13343. Springer, Cham. https://doi.org/10.1007/978-3-031-08530-7_20
DOI: https://doi.org/10.1007/978-3-031-08530-7_20
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08529-1
Online ISBN: 978-3-031-08530-7
eBook Packages: Computer Science, Computer Science (R0)