An unsupervised keyphrase extraction model by incorporating structural and semantic information

Luo, Linkai; Zhang, Longmin; Peng, Hong

doi:10.1007/s13748-019-00200-3

Regular Paper
Published: 26 October 2019

Progress in Artificial Intelligence, Volume 9, pages 77–83 (2020)

Abstract

We propose an unsupervised keyphrase extraction model that incorporates both the structural information and the semantic information of a document. The structural information is a directed graph composed of keyphrase candidates and topics. The weight between two candidates is computed from their relative distance in the document and the positions of the sentences in which they occur. A graph ranking algorithm is then applied to obtain the structural score of each candidate. The semantic score is obtained from the similarity between a candidate and all sentences. The final score of a candidate is the sum of its structural score and its semantic score, and the top N candidates with the highest scores are selected as the recommended keyphrases. Comparison experiments on three widely used datasets show that our model achieves the best results on long documents and competitive results on short documents. This indicates that our model is effective and superior to state-of-the-art unsupervised models.
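
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the edge-weight formula, the ranking algorithm, or the similarity measure, so the sketch assumes an inverse-distance weight boosted for early sentences, a PageRank-style power iteration, and bag-of-words cosine similarity. It also omits the topic nodes of the graph. All function names (`tokenize`, `find_occurrences`, `structural_scores`, `semantic_scores`, `rank_keyphrases`) are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    tokens = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [t for t in tokens if t]


def find_occurrences(sentences, candidates):
    """Map each candidate to a list of (global token offset, sentence index)."""
    occ = {c: [] for c in candidates}
    offset = 0
    for s_idx, sent in enumerate(sentences):
        toks = tokenize(sent)
        for c in candidates:
            c_toks = tokenize(c)
            for i in range(len(toks) - len(c_toks) + 1):
                if toks[i:i + len(c_toks)] == c_toks:
                    occ[c].append((offset + i, s_idx))
        offset += len(toks)
    return occ


def structural_scores(sentences, candidates, damping=0.85, iters=50):
    """PageRank-style scores on a directed, weighted candidate graph."""
    occ = find_occurrences(sentences, candidates)
    # weight[u][v] is the strength of the directed edge u -> v.
    weight = {u: Counter() for u in candidates}
    for u in candidates:
        for v in candidates:
            if u == v:
                continue
            for pos_u, _ in occ[u]:
                for pos_v, sent_v in occ[v]:
                    if pos_v < pos_u:
                        # ASSUMED form: a later candidate points to an earlier
                        # one; closer pairs and earlier sentences weigh more.
                        weight[u][v] += 1.0 / ((pos_u - pos_v) * (1 + sent_v))
    n = len(candidates)
    score = {c: 1.0 / n for c in candidates}
    for _ in range(iters):
        # Nodes without outgoing edges spread their score uniformly.
        dangling = sum(score[u] for u in candidates if not weight[u])
        new = {}
        for v in candidates:
            inflow = sum(score[u] * weight[u][v] / sum(weight[u].values())
                         for u in candidates if weight[u][v] > 0)
            new[v] = (1 - damping) / n + damping * (inflow + dangling / n)
        score = new
    return score


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def semantic_scores(sentences, candidates):
    """ASSUMED measure: mean cosine similarity of a candidate to all sentences."""
    sent_vecs = [Counter(tokenize(s)) for s in sentences]
    return {c: sum(cosine(Counter(tokenize(c)), sv) for sv in sent_vecs)
            / len(sent_vecs) for c in candidates}


def rank_keyphrases(sentences, candidates, top_n=3):
    """Final score = structural score + semantic score; return the top N."""
    structural = structural_scores(sentences, candidates)
    semantic = semantic_scores(sentences, candidates)
    final = {c: structural[c] + semantic[c] for c in candidates}
    return sorted(final.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


if __name__ == "__main__":
    doc = [
        "Keyphrase extraction ranks candidate phrases.",
        "Graph ranking scores each candidate.",
        "Semantic similarity compares a candidate with every sentence.",
    ]
    phrases = ["keyphrase extraction", "graph ranking",
               "semantic similarity", "candidate"]
    for phrase, score in rank_keyphrases(doc, phrases, top_n=2):
        print(f"{phrase}: {score:.3f}")
```

The two scores are summed without rescaling, as the abstract states. Whether the paper normalizes them first is not recoverable from the abstract, so any such step would be a further assumption.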

Acknowledgements

We thank Weiqiang Liu for his help in revising the English of the manuscript during the major revision and of the response during the minor revision.

Author information

Authors and Affiliations

Department of Automation, Xiamen University, Xiamen, People’s Republic of China
Linkai Luo, Longmin Zhang & Hong Peng

Corresponding author

Correspondence to Hong Peng.

About this article

Cite this article

Luo, L., Zhang, L. & Peng, H. An unsupervised keyphrase extraction model by incorporating structural and semantic information. Prog Artif Intell 9, 77–83 (2020). https://doi.org/10.1007/s13748-019-00200-3

Received: 30 March 2019
Accepted: 10 October 2019
Published: 26 October 2019
Issue Date: March 2020
DOI: https://doi.org/10.1007/s13748-019-00200-3
