A multi-projection recurrent model for hypernym detection and discovery

  • Research Article
  • Frontiers of Computer Science

Abstract

Hypernym detection and discovery are fundamental tasks in natural language processing. The former determines whether a given pair of terms holds a hypernymy relation, whereas the latter aims to identify all possible hypernyms of a given hyponym term. Existing research on hypernym detection and discovery projects a term into various semantic spaces with a single mapping function. Despite their success, these methods may be inadequate for capturing the complex semantic relevance between hyponym/hypernym pairs in two respects. First, they may fall short in modeling the hierarchical structure of hypernymy relations, which could yield better term representations. Second, the polysemy phenomenon, whereby a hypernym may express distinct senses, is understudied. In this paper, we propose a Multi-Projection Recurrent model (MPR) to simultaneously capture the hierarchical relationships between terms and handle the diverse senses caused by polysemy. Specifically, we build a multi-projection mapping block that learns multiple word senses through multiple projections to address the polysemy phenomenon. In addition, we adopt a hierarchy-aware recurrent block, a recurrent operation followed by a multi-hop aggregation module, to capture the hierarchical structure of hypernymy relations. Experiments on 11 benchmark datasets in various task settings show that MPR outperforms the baselines. Experimental analysis and a case study demonstrate that the multi-projection module and the recurrent structure are effective for hypernym detection and discovery tasks.
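The abstract describes two concrete components: a set of parallel projection heads that map a hyponym embedding into several sense-specific views, and a recurrent block whose repeated transition steps, aggregated over multiple hops, model the hypernymy hierarchy. As a rough, hypothetical illustration only (the class name, dimensions, GRU-based transition, and mean/max aggregation below are our own assumptions, not the authors' released code), one plausible PyTorch reading of these two blocks is:

```python
import torch
import torch.nn as nn

class MultiProjectionRecurrentSketch(nn.Module):
    """Illustrative sketch, not the authors' implementation:
    K projection heads model distinct hypernym senses (polysemy);
    a shared recurrent transition applied for several hops models
    the hypernymy hierarchy, and the hop outputs are aggregated."""

    def __init__(self, dim: int, num_projections: int = 4, num_hops: int = 3):
        super().__init__()
        # One linear projection per assumed word sense.
        self.projections = nn.ModuleList(
            nn.Linear(dim, dim, bias=False) for _ in range(num_projections)
        )
        # Shared transition; each application climbs one hierarchy level.
        self.transition = nn.GRUCell(dim, dim)
        self.num_hops = num_hops

    def forward(self, hypo: torch.Tensor) -> torch.Tensor:
        # hypo: (batch, dim) hyponym embeddings.
        # sense_views: (batch, K, dim), one candidate view per sense.
        sense_views = torch.stack([p(hypo) for p in self.projections], dim=1)
        batch, k, dim = sense_views.shape
        state = sense_views.reshape(batch * k, dim)
        hops = []
        for _ in range(self.num_hops):
            # Recurrent operation over the hierarchy.
            state = self.transition(state, state)
            hops.append(state)
        # Multi-hop aggregation: average the hops, then take the best
        # sense per term (a simple stand-in for the paper's module).
        agg = torch.stack(hops, dim=0).mean(dim=0).reshape(batch, k, dim)
        return agg.max(dim=1).values  # predicted hypernym embedding
```

Candidate hypernyms could then be ranked, for example, by cosine similarity between this predicted vector and each candidate's embedding, matching the retrieval framing of the discovery task.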



Acknowledgements

This work was supported by the National Science and Technology Major Project of China (2022ZD0120202) and the National Natural Science Foundation of China (Grant No. U23B2056). We thank the Beijing Advanced Innovation Center for Big Data and Brain Computing for providing the computing infrastructure.

Author information


Corresponding author

Correspondence to Richong Zhang.

Ethics declarations

Competing interests: The authors declare that they have no competing interests or financial conflicts to disclose.

Additional information

Xuefeng Zhang is currently a PhD student at the School of Computer Science and Engineering, Beihang University, China. He received a BSc degree in Computer Science and Technology from Sichuan University, China in 2019. His research interests include knowledge engineering and natural language processing.

Junfan Chen received a BSc degree in Business English from Beijing Technology and Business University, China in 2015, and a PhD degree from the School of Computer Science and Engineering, Beihang University, China in 2022. He is currently a postdoctoral researcher at the School of Software, Beihang University, China. His research interests include natural language processing and knowledge engineering.

Zheyan Luo received a BSc degree in Computer Science from Shanghai University, China in 2021 and a Master’s degree in Software Engineering from Beihang University, China in 2024. His current research interests include natural language processing, transfer learning, and large language models.

Yuhang Bai received a BSc degree from Jilin University, China in 2020, and a Master’s degree from the School of Computer Science, Beihang University, China in 2023. His research interests include lexical semantics and relation extraction.

Chunming Hu received a PhD degree from Beihang University, China in 2006. He is a professor at the School of Software, Beihang University, China. His research interests include distributed systems, system virtualization, and large-scale data management and processing systems.

Richong Zhang received his BSc and MASc degrees from Jilin University, China in 2001 and 2004, respectively. In 2006, he received his MSc degree from Dalhousie University, Canada. In 2011, he received his PhD from the School of Information Technology and Engineering, University of Ottawa, Canada. He is currently a professor at the School of Computer Science and Engineering, Beihang University, China. His research interests include natural language processing and knowledge engineering.



Cite this article

Zhang, X., Chen, J., Luo, Z. et al. A multi-projection recurrent model for hypernym detection and discovery. Front. Comput. Sci. 19, 194312 (2025). https://doi.org/10.1007/s11704-024-3638-7
