An Improved TextRank-Based Method for Chinese Text Summarization

Zheng, Xin; Zhou, Tiantian; Wang, Yintong; Li, Shuo

doi:10.1007/978-3-031-06788-4_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13339)

Included in the following conference series:

International Conference on Artificial Intelligence and Security

Abstract

Text summary extraction is a natural language processing technique that quickly extracts key information to improve the efficiency of daily transaction processing. The traditional TextRank algorithm is limited by the quality of Chinese word segmentation, which lowers the accuracy of Chinese text summarization and leads to the loss of important information. This paper proposes an improved TextRank-based method for Chinese text summarization. The method uses the 1-gram model from N-gram analysis to obtain the candidate word vector, and combines the 2-gram model with a hidden Markov model to perform part-of-speech tagging and optimize the word segmentation. Finally, the improved TextRank model is applied to the optimized word vector to extract the Chinese text summary. The experimental results show that when the number of summary sentences is no more than 8, the accuracy of Chinese text summary extraction with the improved TextRank-based method is significantly better than that of the traditional TF-IDF and TextRank methods.
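
For orientation, the following is a minimal sketch of the baseline TextRank sentence ranking that the abstract takes as its starting point. It is not the improved method proposed in the paper. It assumes that each sentence has already been segmented into a list of tokens (the step the paper aims to improve), and it uses the token-overlap similarity and the damping factor of 0.85 from the original TextRank formulation. The function names sentence_similarity, textrank, and summarize are hypothetical and chosen only for illustration.

```python
import math


def sentence_similarity(a, b):
    """Token-overlap similarity between two tokenized sentences,
    normalized by the sum of the logs of their lengths."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom == 0:
        # Both sentences contain a single token; avoid dividing by zero.
        return 0.0
    return len(set(a) & set(b)) / denom


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by iterating the weighted TextRank update
    on a fully connected sentence-similarity graph."""
    n = len(sentences)
    weights = [
        [0.0 if i == j else sentence_similarity(sentences[i], sentences[j])
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k):
    """Return the indices of the k highest-scoring sentences,
    restored to document order."""
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)
```

Calling summarize(tokenized_sentences, k) with k no greater than 8 corresponds to the summary lengths for which the abstract reports results. Because the graph weights depend entirely on shared tokens, segmentation errors propagate directly into the ranking, which is the limitation of the traditional algorithm that the abstract describes.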

Acknowledgement

This work was supported in part by the Natural Science Foundation of Jiangsu Province under Grant BK20180142, and the Innovation and Entrepreneurship Training Program for College Students in Jiangsu Province (201811460041X).

Author information

Authors and Affiliations

School of Information Engineering, Nanjing Xiaozhuang University, Nanjing, 211171, China
Xin Zheng, Tiantian Zhou & Yintong Wang
Institute of Artificial Intelligence, De Montfort University, Leicester, LE1 9BH, UK
Shuo Li

Conflicts of Interest

The authors declare that they have no conflicts of interest to report regarding the present study.

Corresponding author

Correspondence to Xin Zheng.

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Xingming Sun
Nanjing University of Information Science and Technology, Nanjing, China
Xiaorui Zhang
Jinan University, Guangzhou, China
Zhihua Xia
Purdue University, West Lafayette, IN, USA
Elisa Bertino

About this paper

Cite this paper

Zheng, X., Zhou, T., Wang, Y., Li, S. (2022). An Improved TextRank-Based Method for Chinese Text Summarization. In: Sun, X., Zhang, X., Xia, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2022. Lecture Notes in Computer Science, vol 13339. Springer, Cham. https://doi.org/10.1007/978-3-031-06788-4_12

DOI: https://doi.org/10.1007/978-3-031-06788-4_12
Published: 04 July 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06787-7
Online ISBN: 978-3-031-06788-4
eBook Packages: Computer Science, Computer Science (R0)