Abstract
Semantic relatedness computation is the task of quantifying the degree of relatedness between two concepts. The performance of existing approaches depends heavily on which aspect of relatedness they target. For instance, taxonomy-based approaches compute similarity, which is a special case of semantic relatedness, whereas corpus-based approaches capture the associative relations of words by taking their distributional features into account. Based on the assumption that different aspects of knowledge sources cover different kinds of semantic relations, this paper presents a hybrid model for computing the semantic relatedness of words using new features extracted from various aspects of Wikipedia. The focus of this paper is on finding the optimal feature combination(s) that enhance the performance of the hybrid model. Empirical evaluation on benchmark datasets shows that hybrid features perform better than single features by providing complementary coverage of semantic relations, leading to improved correlation with human judgments.
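As a rough illustration of the evaluation idea in the abstract (searching over feature combinations and scoring each by its correlation with human judgments), the following is a minimal sketch and not the paper's method. It assumes an unweighted mean as a stand-in for the learned hybrid model and Spearman rank correlation as the agreement measure. The feature names and all scores are hypothetical toy values, not data from the paper.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def best_combination(features, human):
    """Try every non-empty feature subset and return (names, rho) for the best.

    Assumption: a subset is combined by an unweighted mean of its scores,
    a simple placeholder for a learned combination.
    """
    best = None
    for size in range(1, len(features) + 1):
        for names in combinations(sorted(features), size):
            columns = [features[name] for name in names]
            combined = [sum(row) / size for row in zip(*columns)]
            rho = spearman(combined, human)
            if best is None or rho > best[1]:
                best = (names, rho)
    return best


# Hypothetical toy data: five word pairs, two made-up features.
human_scores = [1, 2, 3, 4, 5]
toy_features = {
    "feature_a": [0.1, 0.3, 0.2, 0.8, 0.9],
    "feature_b": [0.2, 0.1, 0.5, 0.4, 0.9],
}

if __name__ == "__main__":
    print(best_combination(toy_features, human_scores))
```

In this toy setup each feature misorders a different pair of items, so their mean ranks the pairs more consistently with the human scores than either feature alone. That is the kind of complementary coverage the abstract describes, here shown only on invented numbers.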
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Jabeen, S., Gao, X., Andreae, P. (2014). A Hybrid Model for Learning Semantic Relatedness Using Wikipedia-Based Features. In: Benatallah, B., Bestavros, A., Manolopoulos, Y., Vakali, A., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2014. WISE 2014. Lecture Notes in Computer Science, vol 8786. Springer, Cham. https://doi.org/10.1007/978-3-319-11749-2_39
DOI: https://doi.org/10.1007/978-3-319-11749-2_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11748-5
Online ISBN: 978-3-319-11749-2
eBook Packages: Computer Science (R0)