Dissertation Abstract: Learning High Precision Lexical Inferences

  • Dissertation and Habilitation Abstracts
KI - Künstliche Intelligenz

Abstract

The fundamental goal of natural language processing is to build models capable of human-level understanding of natural language. One of the obstacles to building such models is lexical variability, i.e., the ability to express the same meaning in various ways. Existing text representations excel at capturing relatedness (e.g., blue/red), but they do not distinguish the specific semantic relation that holds between a pair of words. This article summarizes a Ph.D. dissertation submitted to Bar-Ilan University in 2019, under the supervision of Professor Ido Dagan of the Computer Science Department. The dissertation explored methods for recognizing and extracting semantic relationships between concepts (cat is a type of animal), the constituents of noun compounds (baby oil is oil for babies), and verbal phrases (‘X died at Y’ means the same as ‘X lived until Y’ in certain contexts). The proposed models outperform highly competitive baselines and improve the state of the art on several benchmarks. The dissertation concludes by discussing two challenges on the way to human-level language understanding: developing more accurate text representations and learning to read between the lines.

Author information

Corresponding author

Correspondence to Vered Shwartz.

Cite this article

Shwartz, V. Dissertation Abstract: Learning High Precision Lexical Inferences. Künstl Intell 35, 377–383 (2021). https://doi.org/10.1007/s13218-021-00709-7

  • DOI: https://doi.org/10.1007/s13218-021-00709-7
