Discovering Representative Space for Relational Similarity Measurement

  • Conference paper
  • Computational Linguistics (PACLING 2017)

Abstract

Relational similarity measures the correspondence between the semantic relations that hold between the words in two word pairs. Accurately measuring relational similarity is important for various natural language processing tasks such as relational search, noun-modifier classification, and analogy detection. Despite this need, the features that accurately express the relational similarity between two word pairs remain largely unknown. So far, methods have been proposed based on linguistic intuitions, such as the functional space proposed by Turney [1], which consists purely of verbs. In contrast, we propose a data-driven approach for discovering feature spaces for relational similarity measurement. Specifically, we use a linear-SVM classifier to select features from training instances in which two pairs of words are labeled as analogous or non-analogous. We evaluate the discovered feature space on a relational classification task, in which we aim to classify a given word pair into a specific relation from a predefined set of relations. We compare the linear classifier for ranking relational features against alternative feature-selection methods, namely Kullback–Leibler divergence (KL) and pointwise mutual information (PMI). Experimental results show that our proposed classification method accurately discovers discriminative features for measuring relational similarity. Furthermore, experiments show that the proposed method requires a small number of relational features while still maintaining reasonable relational similarity accuracy.
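The feature-selection idea described above can be sketched as follows. This is a minimal illustration, not the authors' code: it trains a linear SVM on synthetic word-pair feature vectors labeled analogous/non-analogous, then ranks features by the absolute value of the learned weights, so that the highest-weighted features form the discovered relational space. The feature names and data are entirely synthetic.

```python
# Sketch of linear-SVM weight-based feature ranking (synthetic data;
# not the authors' implementation or dataset).
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)

n_features = 20
# Hypothetical names for relational features, e.g. lexical patterns.
feature_names = [f"pattern_{i}" for i in range(n_features)]

# Synthetic training instances: each row is a feature vector for a pair
# of word pairs. Features 0-2 are constructed to be informative for the
# analogous / non-analogous label; the rest are noise.
X = rng.rand(500, n_features)
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0.5).astype(int)

clf = LinearSVC(C=1.0, max_iter=10000)
clf.fit(X, y)

# Rank features by absolute weight: a larger |w_j| means feature j
# contributes more to separating analogous from non-analogous pairs.
weights = np.abs(clf.coef_[0])
ranking = np.argsort(weights)[::-1]
top_k = [feature_names[j] for j in ranking[:3]]
print(top_k)
```

On this synthetic data the three informative features dominate the ranking; keeping only the top-ranked features yields the kind of compact relational space the abstract refers to.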


Notes

  1. https://github.com/ivri/DiffVec.
  2. The corpus was collected by Charles Clarke at the University of Waterloo.
  3. http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html.
  4. http://www.nltk.org.

References

  1. Turney, P.D.: Domain and function: a dual-space model of semantic relations and compositions. J. Artif. Intell. Res. 44, 533–585 (2012)

  2. Bollegala, D., Matsuo, Y., Ishizuka, M.: A relational model of semantic similarity between words using automatically extracted lexical pattern clusters from the web. In: Proceedings of the Empirical Methods in Natural Language Processing, pp. 803–812 (2009)

  3. Turney, P.D.: A uniform approach to analogies, synonyms, antonyms, and associations. In: Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), pp. 905–912 (2008)

  4. Nakov, P., Kozareva, Z.: Combining relational and attributional similarity for semantic relation classification. In: Proceedings of the Recent Advances in Natural Language Processing, pp. 323–330 (2011)

  5. Duc, N.T., Bollegala, D., Ishizuka, M.: Using relational similarity between word pairs for latent relational search on the web. In: IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, pp. 196–199 (2010)

  6. Riedel, S., Yao, L., McCallum, A., Marlin, B.M.: Relation extraction with matrix factorization and universal schemas. In: Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 74–84 (2013)

  7. Turney, P.D.: Similarity of semantic relations. Comput. Linguist. 32(3), 379–416 (2006)

  8. Turney, P.D.: Distributional semantics beyond words: supervised learning of analogy and paraphrase. Trans. Assoc. Comput. Linguist. 1, 353–366 (2013)

  9. Landauer, T.K., Foltz, P.W., Laham, D.: An introduction to latent semantic analysis. Discourse Process. 25(2–3), 259–284 (1998)

  10. Turney, P.D.: Measuring semantic similarity by latent relational analysis. In: Proceedings of the International Joint Conference on Artificial Intelligence, pp. 1136–1141 (2005). arXiv preprint arXiv:cs/0508053

  11. Tripathi, G., Naganna, S.: Feature selection and classification approach for sentiment analysis. Mach. Learn. Appl. Int. J. 2(2), 1–16 (2015)

  12. Brank, J., Grobelnik, M., Milic-Frayling, N., Mladenic, D.: Feature selection using support vector machines. WIT Trans. Inf. Commun. Technol. 28 (2002)

  13. Mladenić, D., Brank, J., Grobelnik, M., Milic-Frayling, N.: Feature selection using linear classifier weights: interaction with classification models. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 234–241. ACM (2004)

  14. Ji, Y., Eisenstein, J.: Discriminative improvements to distributional sentence similarity. In: Proceedings of the Empirical Methods in Natural Language Processing, pp. 891–896 (2013)

  15. Xu, Y., Jones, G.J., Li, J., Wang, B., Sun, C.: A study on mutual information-based feature selection for text categorization. J. Comput. Inf. Syst. 3(3), 1007–1012 (2007)

  16. Schneider, K.-M.: Weighted average pointwise mutual information for feature selection in text categorization. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds.) PKDD 2005. LNCS (LNAI), vol. 3721, pp. 252–263. Springer, Heidelberg (2005). https://doi.org/10.1007/11564126_27

  17. Vylomova, E., Rimell, L., Cohn, T., Baldwin, T.: Take and took, gaggle and goose, book and read: evaluating the utility of vector differences for lexical relation learning. In: Proceedings of the Association for Computational Linguistics, pp. 1671–1682 (2016)

  18. Turney, P.D., Neuman, Y., Assaf, D., Cohen, Y.: Literal and metaphorical sense identification through concrete and abstract context. In: Proceedings of the Empirical Methods in Natural Language Processing, pp. 27–31 (2011)


Author information

Correspondence to Huda Hakami.


Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Hakami, H., Mandya, A., Bollegala, D. (2018). Discovering Representative Space for Relational Similarity Measurement. In: Hasida, K., Pa, W. (eds) Computational Linguistics. PACLING 2017. Communications in Computer and Information Science, vol 781. Springer, Singapore. https://doi.org/10.1007/978-981-10-8438-6_7

  • DOI: https://doi.org/10.1007/978-981-10-8438-6_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-8437-9

  • Online ISBN: 978-981-10-8438-6

  • eBook Packages: Computer Science (R0)
