Research on Weighted Complex Network Based Keywords Extraction

Zhou, Zhi; Zou, Xiaojun; Lv, Xueqiang; Hu, Junfeng

doi:10.1007/978-3-642-45185-0_47


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8229)

Included in the following conference series:

Workshop on Chinese Lexical Semantics

Abstract

Based on complex network theory, this paper constructs a weighted lexical network to extract keywords from text automatically. Current related research mainly focuses on measures of each node's contribution to the whole network, whereas this paper emphasizes the construction of the lexical network itself. By introducing linguistic knowledge, we concentrate on the reasonable selection of nodes, a proper description of the relationships between words, the enhancement of node attributes, and so on. Experiments indicate that the lexical network constructed by our approach performs better on accuracy, recall, and F-value: when the top three results are selected, the three indices improve on the classic TF-IDF method by 6.67%, 3.96%, and 4.97% on average, respectively.
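
As a rough illustration of the general idea only (not the authors' method, whose node selection, edge definitions, and node attributes are not given in the abstract), the sketch below builds a weighted word co-occurrence network and ranks words by weighted degree. The function names, the sliding-window co-occurrence rule, and the use of weighted degree as the ranking measure are all assumptions made for this example. The helper at the end computes precision, recall, and F-value for a top-k keyword list against a gold list.

```python
from collections import defaultdict


def build_weighted_network(tokens, window=2):
    """Hypothetical construction: undirected co-occurrence graph where the
    edge weight counts how often two distinct words occur within `window`
    positions of each other."""
    weights = defaultdict(float)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:
                weights[frozenset((word, other))] += 1.0
    return weights


def node_strength(weights):
    """Weighted degree of each node: sum of the weights of its edges."""
    strength = defaultdict(float)
    for edge, weight in weights.items():
        for node in edge:
            strength[node] += weight
    return dict(strength)


def top_keywords(tokens, k=3, window=2):
    """Rank words by weighted degree (ties broken alphabetically)."""
    strength = node_strength(build_weighted_network(tokens, window))
    return sorted(strength, key=lambda w: (-strength[w], w))[:k]


def precision_recall_f(predicted, gold):
    """Precision, recall, and F-value of predicted keywords against gold."""
    hits = len(set(predicted) & set(gold))
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy, already-segmented input; a real pipeline would need word
    # segmentation and filtering before this step.
    tokens = ["network", "keyword", "network", "text", "keyword", "network"]
    print(top_keywords(tokens, k=2, window=1))
```

Weighted degree is only one of several possible node-importance measures; it is used here because it is the simplest to state, not because the paper is known to use it.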

Author information

Authors and Affiliations

Key Laboratory of Computational Linguistics, Ministry of Education, Peking University, Beijing, 100871, P.R. China
Zhi Zhou, Xiaojun Zou & Junfeng Hu
Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing, 100101, P.R. China
Xueqiang Lv

Editor information

Editors and Affiliations

Applied Language Research Institute, Beijing Language and Culture University, No. 15 Xueyuan Road, Haidian District, 100083, Beijing, China
Pengyuan Liu
School of Foreign Languages, Peking University, No. 5, Yiheyuan Road, Haidian District, 100871, Beijing, China
Qi Su

About this paper

Cite this paper

Zhou, Z., Zou, X., Lv, X., Hu, J. (2013). Research on Weighted Complex Network Based Keywords Extraction. In: Liu, P., Su, Q. (eds) Chinese Lexical Semantics. CLSW 2013. Lecture Notes in Computer Science, vol 8229. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45185-0_47

DOI: https://doi.org/10.1007/978-3-642-45185-0_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45184-3
Online ISBN: 978-3-642-45185-0
eBook Packages: Computer Science, Computer Science (R0)
