Abstract
This paper presents an unsupervised graph-based method for extracting keyphrases using semantic information. The proposed method has two stages. In the first stage, we extract Maximal Frequent Sequences (MFS) and use them as the nodes of a graph. The weight of the connection between two nodes is established from common statistical information and semantic relatedness. In the second stage, we rank the MFS with the traditional PageRank algorithm, extended with ConceptNet; this external resource adds an extra weight value between two MFS. The experimental results are competitive with traditional approaches developed in this area. MFSRank outperforms the baseline for the top 5 keyphrases in precision, recall, and F-score.
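The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the weighting formula, so the sketch assumes the edge weight is simply a statistical score plus a semantic bonus, and the function names (`combine_weights`, `weighted_pagerank`), the example phrases, and all numeric values are hypothetical. In practice the semantic bonus would come from a resource such as ConceptNet; here it is a hand-written dictionary.

```python
def combine_weights(statistical, semantic_bonus):
    """Assumed combination rule: statistical weight plus an extra semantic weight."""
    return {edge: w + semantic_bonus.get(edge, 0.0) for edge, w in statistical.items()}


def weighted_pagerank(weights, damping=0.85, iterations=50):
    """Rank nodes of an undirected weighted graph with PageRank.

    weights maps (node_a, node_b) pairs to a positive edge weight.
    """
    nodes = sorted({n for edge in weights for n in edge})
    # Build a symmetric adjacency map from the undirected edges.
    adjacency = {n: {} for n in nodes}
    for (a, b), w in weights.items():
        adjacency[a][b] = adjacency[a].get(b, 0.0) + w
        adjacency[b][a] = adjacency[b].get(a, 0.0) + w
    total_weight = {n: sum(adjacency[n].values()) for n in nodes}

    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbour passes on its score in proportion to the edge weight.
            incoming = sum(
                score[m] * w / total_weight[m] for m, w in adjacency[n].items()
            )
            updated[n] = (1.0 - damping) / len(nodes) + damping * incoming
        score = updated
    return score


# Hypothetical candidate phrases standing in for extracted MFS.
statistical = {
    ("keyphrase extraction", "semantic information"): 2.0,
    ("keyphrase extraction", "graph"): 2.0,
    ("semantic information", "graph"): 1.0,
}
# Hypothetical extra weight for a semantically related pair.
semantic_bonus = {("keyphrase extraction", "semantic information"): 0.5}

scores = weighted_pagerank(combine_weights(statistical, semantic_bonus))
for phrase in sorted(scores, key=scores.get, reverse=True):
    print(f"{phrase}: {scores[phrase]:.3f}")
```

Taking the highest-scoring phrases from this ranking corresponds to selecting the top keyphrases; the additive combination is only one plausible way to inject semantic relatedness into the edge weights.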
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
López, R.E., Barreda, D., Tejada, J., Cuadros, E. (2011). MFSRank: An Unsupervised Method to Extract Keyphrases Using Semantic Information. In: Batyrshin, I., Sidorov, G. (eds) Advances in Artificial Intelligence. MICAI 2011. Lecture Notes in Computer Science, vol 7094. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25324-9_29
DOI: https://doi.org/10.1007/978-3-642-25324-9_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25323-2
Online ISBN: 978-3-642-25324-9
eBook Packages: Computer Science, Computer Science (R0)