Abstract
Relational similarity measures the correspondence between the semantic relations that hold between the two words in each of two word pairs. Accurately measuring relational similarity is important for natural language processing tasks such as relational search, noun-modifier classification, and analogy detection. Despite this need, the features that accurately express the relational similarity between two word pairs remain largely unknown. Existing methods rely on linguistic intuitions, such as the functional space proposed by Turney [1], which consists purely of verbs. In contrast, we propose a data-driven approach for discovering feature spaces for relational similarity measurement. Specifically, we use a linear-SVM classifier to select features from training instances in which two word pairs are labelled as analogous or non-analogous. We evaluate the discovered feature space on a relational classification task, in which the goal is to assign a given word pair to one relation from a predefined set. We compare the linear classifier's feature ranking against two alternative criteria: Kullback-Leibler (KL) divergence and pointwise mutual information (PMI). Experimental results show that the proposed classification method accurately discovers discriminative features for measuring relational similarity. Furthermore, experiments show that the proposed method requires only a small number of relational features while maintaining reasonable relational similarity accuracy.
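The weight-based feature selection the abstract describes can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's implementation: the lexical-pattern feature names, the training data, and the plain perceptron standing in for the linear SVM are all illustrative; the core idea shown is only that, after training a linear classifier on analogous vs. non-analogous pairs, features can be ranked by the absolute magnitude of their learned weights.

```python
# Hypothetical lexical-pattern features for word pairs (illustrative names).
names = ["X is a Y", "X and Y", "X of Y"]

# Toy pattern-occurrence vectors for pairs-of-word-pairs,
# labelled +1 (analogous) or -1 (non-analogous). Fabricated data.
X = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]]
y = [1, 1, -1, -1]

def train_linear(X, y, epochs=20, lr=0.1):
    """Train a simple perceptron (stand-in for a linear SVM) and
    return the learned weight vector."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi))
            pred = 1 if score >= 0 else -1
            if pred != yi:  # mistake-driven update
                for j, xj in enumerate(xi):
                    w[j] += lr * yi * xj
    return w

def top_features(w, names, k):
    """Rank features by absolute weight and return the k strongest."""
    ranked = sorted(zip(names, w), key=lambda nw: -abs(nw[1]))
    return [n for n, _ in ranked[:k]]

w = train_linear(X, y)
top = top_features(w, names, 2)
```

On this toy data the pattern "X and Y" fires in both classes, so it ends up with a near-zero weight and is ranked last, while the class-specific patterns receive the largest absolute weights and would be kept as the discovered relational feature space.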
Notes
- 2. The corpus was collected by Charles Clarke at the University of Waterloo.
References
Turney, P.D.: Domain and function: a dual-space model of semantic relations and compositions. J. Artif. Intell. Res. 44, 533–585 (2012)
Bollegala, D., Matsuo, Y., Ishizuka, M.: A relational model of semantic similarity between words using automatically extracted lexical pattern clusters from the web. In: Proceedings of the Empirical Methods in Natural Language Processing, pp. 803–812 (2009)
Turney, P.D.: A uniform approach to analogies, synonyms, antonyms, and associations. In: Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pp. 905–912 (2008)
Nakov, P., Kozareva, Z.: Combining relational and attributional similarity for semantic relation classification. In: Proceedings of the Recent Advances in Natural Language Processing, pp. 323–330 (2011)
Duc, N.T., Bollegala, D., Ishizuka, M.: Using relational similarity between word pairs for latent relational search on the web. In: IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, pp. 196–199 (2010)
Riedel, S., Yao, L., McCallum, A., Marlin, B.M.: Relation extraction with matrix factorization and universal schemas. In: Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 74–84 (2013)
Turney, P.D.: Similarity of semantic relations. Comput. Linguist. 32(3), 379–416 (2006)
Turney, P.D.: Distributional semantics beyond words: supervised learning of analogy and paraphrase. Trans. Assoc. Comput. Linguist. 1, 353–366 (2013)
Landauer, T.K., Foltz, P.W., Laham, D.: An introduction to latent semantic analysis. Discourse Process. 25(2–3), 259–284 (1998)
Turney, P.D.: Measuring semantic similarity by latent relational analysis. In: Proceedings of the International Joint Conference on Artificial Intelligence, pp. 1136–1141 (2005). arXiv preprint arXiv:cs/0508053
Tripathi, G., Naganna, S.: Feature selection and classification approach for sentiment analysis. Mach. Learn. Appl. Int. J. 2(2), 1–16 (2015)
Brank, J., Grobelnik, M., Milic-Frayling, N., Mladenic, D.: Feature selection using support vector machines. WIT Trans. Inf. Commun. Technol. 28 (2002)
Mladenić, D., Brank, J., Grobelnik, M., Milic-Frayling, N.: Feature selection using linear classifier weights: interaction with classification models. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 234–241. ACM (2004)
Ji, Y., Eisenstein, J.: Discriminative improvements to distributional sentence similarity. In: Proceedings of the Empirical Methods in Natural Language Processing, pp. 891–896 (2013)
Xu, Y., Jones, G.J., Li, J., Wang, B., Sun, C.: A study on mutual information-based feature selection for text categorization. J. Comput. Inf. Syst. 3(3), 1007–1012 (2007)
Schneider, K.-M.: Weighted average pointwise mutual information for feature selection in text categorization. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds.) PKDD 2005. LNCS (LNAI), vol. 3721, pp. 252–263. Springer, Heidelberg (2005). https://doi.org/10.1007/11564126_27
Vylomova, E., Rimell, L., Cohn, T., Baldwin, T.: Take and took, gaggle and goose, book and read: evaluating the utility of vector differences for lexical relation learning. In: Proceedings of the Association for Computational Linguistics, pp. 1671–1682 (2016)
Turney, P.D., Neuman, Y., Assaf, D., Cohen, Y.: Literal and metaphorical sense identification through concrete and abstract context. In: Proceedings of the Empirical Methods in Natural Language Processing, pp. 27–31 (2011)
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
Cite this paper
Hakami, H., Mandya, A., Bollegala, D. (2018). Discovering Representative Space for Relational Similarity Measurement. In: Hasida, K., Pa, W. (eds) Computational Linguistics. PACLING 2017. Communications in Computer and Information Science, vol 781. Springer, Singapore. https://doi.org/10.1007/978-981-10-8438-6_7
DOI: https://doi.org/10.1007/978-981-10-8438-6_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8437-9
Online ISBN: 978-981-10-8438-6
eBook Packages: Computer Science (R0)