Exploiting Description Knowledge for Keyphrase Extraction

Wang, Fang; Wang, Zhongyuan; Wang, Senzhang; Li, Zhoujun

doi:10.1007/978-3-319-13560-1_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8862)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

Keyphrase extraction is essential for many IR and NLP tasks. Existing methods usually treat the phrases of a document independently, without modeling the semantic correlations among them, or they rely on statistical features drawn from knowledge bases such as WordNet and Wikipedia. However, the mutual semantic information between phrases is also important, and exploiting these correlations may help extract keyphrases more effectively. In general, phrases in the title are more likely to be keyphrases that reflect the document topics, while phrases in the body usually describe those topics. We regard the relation between a title phrase and a body phrase as a description relation. This paper proposes a novel keyphrase extraction approach that exploits large numbers of description relations. To make use of the semantic information these relations provide, we organize the phrases of a document as a description graph and employ various graph-based ranking algorithms to rank the candidates. Experimental results on a real dataset demonstrate the effectiveness of the proposed approach in keyphrase extraction.
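The abstract does not say how the description graph is built or which ranking algorithms are used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each description relation is a weighted directed edge from a body phrase to the title phrase it describes, and it ranks candidates with a plain weighted PageRank. The function names, the edge direction, the weights, and the toy phrases are all hypothetical.

```python
from collections import defaultdict


def build_description_graph(title_phrases, body_phrases, relations):
    """Build a directed graph {describer: {described: weight}}.

    `relations` maps (describer, described) pairs to a weight. Both the
    edge direction and the weights are assumptions made for illustration.
    """
    candidates = set(title_phrases) | set(body_phrases)
    graph = defaultdict(dict)
    for node in candidates:
        graph[node]  # touch the key so isolated candidates stay in the graph
    for (describer, described), weight in relations.items():
        if describer in candidates and described in candidates:
            graph[describer][described] = weight
    return dict(graph)


def pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over the adjacency dict."""
    nodes = list(graph)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            out_edges = graph[u]
            total = sum(out_edges.values())
            if total == 0:
                # Dangling node: spread its score uniformly over all nodes.
                share = damping * score[u] / n
                for v in nodes:
                    new[v] += share
            else:
                for v, w in out_edges.items():
                    new[v] += damping * score[u] * w / total
        score = new
    return score


# Toy document (hypothetical phrases and relation weights).
title = ["keyphrase extraction"]
body = ["graph ranking", "semantic relation", "wordnet"]
relations = {
    ("graph ranking", "keyphrase extraction"): 2.0,
    ("semantic relation", "keyphrase extraction"): 1.0,
    ("wordnet", "semantic relation"): 1.0,
}

graph = build_description_graph(title, body, relations)
scores = pagerank(graph)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph the title phrase collects score from the body phrases that describe it, so it ranks first. A different edge direction, weighting scheme, or ranking algorithm (the abstract mentions trying several) would change the ordering.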

Author information

Authors and Affiliations

State Key Laboratory of Software Development Environment, Beihang University, Beijing, China
Fang Wang, Senzhang Wang & Zhoujun Li
School of Information, Renmin University of China, Beijing, China
Zhongyuan Wang

Editor information

Editors and Affiliations

MIMOS Berhad Technology Park Malaysia, 57000, Bukit Jalil, KL, Malaysia
Duc-Nghia Pham
Kyungpook National University, Sankyuk-Dong, Buk-Gu, 702-701, Daegu, Korea
Seong-Bae Park

About this paper

Cite this paper

Wang, F., Wang, Z., Wang, S., Li, Z. (2014). Exploiting Description Knowledge for Keyphrase Extraction. In: Pham, DN., Park, SB. (eds) PRICAI 2014: Trends in Artificial Intelligence. PRICAI 2014. Lecture Notes in Computer Science, vol 8862. Springer, Cham. https://doi.org/10.1007/978-3-319-13560-1_11

DOI: https://doi.org/10.1007/978-3-319-13560-1_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13559-5
Online ISBN: 978-3-319-13560-1
eBook Packages: Computer Science, Computer Science (R0)
