Abstract
Drawing on complex network theory, this paper constructs a weighted lexical network to extract keywords from text automatically. Related research has mainly focused on measuring each node's contribution to the whole network, whereas this paper emphasizes how the lexical network itself is constructed. By introducing linguistic knowledge, we focus on the reasonable selection of nodes, the proper description of relationships between words, and the enhancement of node attributes, among other aspects. Experiments indicate that the lexical network constructed with our approach improves accuracy, recall, and F-value: when the top three results are selected, the three indices increase by 6.67%, 3.96%, and 4.97% on average, respectively, compared with the classic TF-IDF method.
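The general idea of ranking words in a weighted lexical network, and of scoring the top-k results against reference keywords, can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify how nodes are selected, how edges are weighted, or which node measure is used. The sketch assumes a plain co-occurrence window for edges, co-occurrence counts as weights, and weighted degree (node strength) as the ranking measure, and it treats "accuracy" as precision over the returned keywords. The function names and the `window` and `k` parameters are hypothetical.

```python
from collections import defaultdict


def build_network(tokens, window=2):
    """Build an undirected weighted co-occurrence network.

    Assumption: the weight of an edge is the number of times two distinct
    words occur within `window` positions of each other.
    """
    weights = defaultdict(float)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:
                weights[frozenset((word, other))] += 1.0
    return weights


def rank_by_strength(weights, k=3):
    """Return the k nodes with the largest weighted degree (strength).

    Ties are broken alphabetically so the ranking is deterministic.
    """
    strength = defaultdict(float)
    for edge, weight in weights.items():
        for node in edge:
            strength[node] += weight
    return sorted(strength, key=lambda n: (-strength[n], n))[:k]


def precision_recall_f(predicted, gold):
    """Precision, recall, and F-value of predicted keywords against a gold set."""
    hits = len(set(predicted) & set(gold))
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    total = precision + recall
    f_value = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_value


if __name__ == "__main__":
    # Toy, pre-segmented input; real text would need word segmentation
    # and stop-word filtering first.
    tokens = ["network", "keyword", "network", "text", "network", "keyword"]
    top = rank_by_strength(build_network(tokens, window=1), k=3)
    print(top)
    print(precision_recall_f(top, ["network", "keyword"]))
```

Node strength is used here only because it is the simplest weighted centrality measure; other measures can be substituted in `rank_by_strength` without changing the rest of the sketch.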
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhou, Z., Zou, X., Lv, X., Hu, J. (2013). Research on Weighted Complex Network Based Keywords Extraction. In: Liu, P., Su, Q. (eds) Chinese Lexical Semantics. CLSW 2013. Lecture Notes in Computer Science, vol 8229. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45185-0_47
DOI: https://doi.org/10.1007/978-3-642-45185-0_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45184-3
Online ISBN: 978-3-642-45185-0
eBook Packages: Computer Science, Computer Science (R0)