Abstract
Semantic relatedness computation is a well-known problem with multidisciplinary applications. Existing approaches to computing semantic relatedness ignore the asymmetric associations of words. In the absence of an explicit topical context, these asymmetric associations can be used effectively to represent the relations between words in directional contexts. Motivated by the idea of word associations, this paper presents a new approach to computing semantic relatedness using asymmetric association-based probabilities of words, extracted from the directional contexts of words in the Wikipedia corpus. A performance evaluation of the proposed approach on a variety of publicly available benchmark datasets shows that the asymmetric association-based measures outperform not only the baseline symmetric measures but also most state-of-the-art approaches.
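To make the notion of asymmetric association concrete, the following is a minimal sketch and not the measure defined in the paper. It assumes a toy token list instead of the Wikipedia corpus, and the function names `directional_counts` and `association`, the `window` parameter, and the simple relative-frequency estimates are all hypothetical choices made for illustration. It shows that the probability of seeing "york" after "new" can differ from the probability of seeing "new" before "york".

```python
from collections import Counter


def directional_counts(tokens, window=1):
    """Count ordered pairs (a, b) where b occurs within `window` tokens after a.

    Hypothetical helper for illustration; not taken from the paper.
    """
    pair = Counter()   # how often b follows a
    left = Counter()   # how often a appears as the left-hand word of any pair
    right = Counter()  # how often b appears as the right-hand word of any pair
    for i, a in enumerate(tokens):
        for b in tokens[i + 1 : i + 1 + window]:
            pair[(a, b)] += 1
            left[a] += 1
            right[b] += 1
    return pair, left, right


def association(a, b, pair, left, right):
    """Return (forward, backward) relative-frequency estimates for the pair (a, b).

    forward  ~ P(b follows | a is the left word)
    backward ~ P(a precedes | b is the right word)
    """
    n = pair[(a, b)]
    forward = n / left[a] if left[a] else 0.0
    backward = n / right[b] if right[b] else 0.0
    return forward, backward


# Toy corpus (an assumption; the paper uses Wikipedia).
tokens = ["new", "york", "city", "new", "york", "times", "old", "york"]
pair, left, right = directional_counts(tokens, window=1)
fwd, bwd = association("new", "york", pair, left, right)
print(f"forward={fwd:.2f} backward={bwd:.2f}")  # forward=1.00 backward=0.67
```

In this toy data "new" is always followed by "york", while "york" is preceded by "new" only two times out of three, so the two directions give different scores. A symmetric co-occurrence measure would collapse them into one number.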
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Jabeen, S., Gao, X., Andreae, P. (2014). Probabilistic Associations as a Proxy for Semantic Relatedness. In: Benatallah, B., Bestavros, A., Manolopoulos, Y., Vakali, A., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2014. WISE 2014. Lecture Notes in Computer Science, vol 8786. Springer, Cham. https://doi.org/10.1007/978-3-319-11749-2_38
DOI: https://doi.org/10.1007/978-3-319-11749-2_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11748-5
Online ISBN: 978-3-319-11749-2
eBook Packages: Computer Science, Computer Science (R0)