Keyphrase Extraction from Chinese News Web Pages Based on Semantic Relations

Xie, Fei; Wu, Xindong; Hu, Xue-Gang; Wang, Fei-Yue

doi:10.1007/978-3-540-69304-8_51

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5075)

Abstract

Keyphrases save readers time when browsing news web pages. This paper presents a new method for extracting keyphrases from Chinese news web pages based on semantic relations. Semantic relations between phrases are analyzed, and lexical chains are used to construct a semantic relation graph. Keyphrases are extracted, and a semantic link graph is built on the lexical chains. News web pages with core hints, selected from www.163.com, are used to test the method. The experimental results show that the proposed method substantially outperforms a method based on term frequency, especially when three keyphrases are extracted: precision improves by 26.97 percent and recall by 20.93 percent.
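The abstract does not give the scoring details, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. It assumes that phrases have already been segmented, that a hypothetical `related` predicate stands in for a real semantic-relation test, that each connected component of the relation graph counts as one lexical chain, and that a chain is scored by the total frequency of its members. The toy phrases, the `RELATED` table, and the `gold` list are invented for illustration (and shown in English for readability).

```python
from collections import Counter


def build_relation_graph(phrases, related):
    """Link every pair of distinct phrases accepted by the `related` predicate."""
    vocab = sorted(set(phrases))
    graph = {p: set() for p in vocab}
    for i, a in enumerate(vocab):
        for b in vocab[i + 1:]:
            if related(a, b):
                graph[a].add(b)
                graph[b].add(a)
    return graph


def lexical_chains(graph):
    """Assumption: one lexical chain per connected component of the graph."""
    seen, chains = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, chain = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            chain.append(node)
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        chains.append(sorted(chain))
    return chains


def extract_keyphrases(phrases, related, k=3):
    """Rank phrases by chain strength, then frequency, then graph degree."""
    freq = Counter(phrases)
    graph = build_relation_graph(phrases, related)
    strength = {}
    for chain in lexical_chains(graph):
        total = sum(freq[p] for p in chain)  # assumed chain score
        for p in chain:
            strength[p] = total
    ranked = sorted(freq, key=lambda p: (-strength[p], -freq[p], -len(graph[p]), p))
    return ranked[:k]


def tf_baseline(phrases, k=3):
    """Term-frequency baseline: the k most frequent phrases, ties broken by name."""
    freq = Counter(phrases)
    return sorted(freq, key=lambda p: (-freq[p], p))[:k]


def precision_recall(extracted, gold):
    """Precision and recall of the extracted phrases against a gold list."""
    hits = len(set(extracted) & set(gold))
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data, not taken from the paper.
RELATED = {
    frozenset(("earthquake", "aftershock")),
    frozenset(("earthquake", "rescue")),
    frozenset(("rescue", "relief")),
}


def related(a, b):
    return frozenset((a, b)) in RELATED


phrases = ["earthquake", "earthquake", "aftershock", "rescue", "relief",
           "weather", "weather", "reporter", "reporter"]
gold = ["earthquake", "rescue", "relief"]

chain_result = extract_keyphrases(phrases, related, k=3)
tf_result = tf_baseline(phrases, k=3)
print(chain_result, precision_recall(chain_result, gold))
print(tf_result, precision_recall(tf_result, gold))
```

On this toy input the chain-based ranking prefers phrases that belong to the largest related group, while the frequency baseline picks unrelated but frequent phrases; this illustrates why semantic relations could help, without reproducing the figures reported above.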

Author information

Authors and Affiliations

School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, 230009, China
Fei Xie, Xindong Wu & Xue-Gang Hu
Department of Computer Science, University of Vermont, Burlington, VT 05405, U.S.A.
Xindong Wu
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Fei-Yue Wang
Department of Computer Science and Technology, Hefei Teachers College, Hefei, 230061, China
Fei Xie

Editor information

Editors and Affiliations

The Chinese University of Hong Kong, Hong Kong
Christopher C. Yang
The University of Arizona, USA
Hsinchun Chen
The University of Hong Kong, Hong Kong
Michael Chau
Nanyang Technological University, Singapore
Kuiyu Chang
University of Central Florida, USA
Sheau-Dong Lang
Tatung University, Taiwan
Patrick S. Chen
California University of Pennsylvania, USA
Raymond Hsieh
University of Arizona and Chinese Academy of Sciences, USA
Daniel Zeng
Chinese Academy of Sciences, China
Fei-Yue Wang & Wenji Mao
Carnegie Mellon University, USA
Kathleen Carley & Justin Zhan

About this paper

Cite this paper

Xie, F., Wu, X., Hu, XG., Wang, FY. (2008). Keyphrase Extraction from Chinese News Web Pages Based on Semantic Relations. In: Yang, C.C., et al. Intelligence and Security Informatics. ISI 2008. Lecture Notes in Computer Science, vol 5075. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69304-8_51

DOI: https://doi.org/10.1007/978-3-540-69304-8_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69136-5
Online ISBN: 978-3-540-69304-8
eBook Packages: Computer Science, Computer Science (R0)
