Predicting hypernym–hyponym relations for Chinese taxonomy learning

Wang, Chengyu; Fan, Yan; He, Xiaofeng; Zhou, Aoying

doi:10.1007/s10115-018-1166-1

Regular Paper
Published: 10 February 2018

Volume 58, pages 585–610, (2019)
Knowledge and Information Systems

Chengyu Wang¹,
Yan Fan¹,
Xiaofeng He¹ (ORCID: orcid.org/0000-0002-6911-348X) &
…
Aoying Zhou²

Abstract

Hypernym–hyponym (“is-a”) relations are key components in taxonomies, object hierarchies and knowledge graphs. Robust harvesting of such relations requires analysis of the linguistic characteristics of is-a word pairs in the target language. While there is abundant research on is-a relation extraction in English, it remains a challenge to accurately identify such relations from Chinese knowledge sources due to the flexibility of language expression and the significant differences between the two language families. In this paper, we introduce a weakly supervised framework to extract Chinese is-a relations from user-generated categories. It employs piecewise linear projection models trained on an existing Chinese taxonomy built from Wikipedia and an iterative learning algorithm to update model parameters incrementally. A pattern-based relation selection method is proposed to prevent “semantic drift” in the learning process using bi-criteria optimization. Experimental results on a publicly available test set show that the proposed approach outperforms state-of-the-art methods.
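As a rough illustration of the "piecewise linear projection" idea mentioned in the abstract, the sketch below follows the general recipe of Fu et al. (2014) from the reference list: cluster training pairs by their embedding offset, fit one affine projection per cluster, and accept a candidate pair when the projected hyponym vector lies close to the hypernym vector. It is a minimal sketch under stated assumptions, not the authors' implementation. The names PiecewiseProjection, fit_affine, kmeans and nearest, the threshold delta, and the synthetic vectors are all hypothetical, and the iterative learning and pattern-based relation selection steps are omitted.

```python
import numpy as np


def fit_affine(X, Y):
    """Least-squares affine map W such that [x, 1] @ W approximates y."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W


def nearest(centroids, points):
    """Index of the closest centroid for each row of points."""
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns centroids and final labels."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = nearest(centroids, points)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = points[labels == c].mean(axis=0)
    return centroids, nearest(centroids, points)


class PiecewiseProjection:
    """One affine projection per cluster of (hypernym - hyponym) offsets."""

    def __init__(self, k=2, seed=0):
        self.k, self.seed = k, seed

    def fit(self, X, Y):
        # X holds hyponym vectors, Y the matching hypernym vectors (one pair per row).
        self.centroids, labels = kmeans(Y - X, self.k, seed=self.seed)
        fallback = fit_affine(X, Y)  # used only if a cluster ends up empty
        self.maps = [
            fit_affine(X[labels == c], Y[labels == c]) if np.any(labels == c) else fallback
            for c in range(self.k)
        ]
        return self

    def distance(self, x, y):
        # Pick the cluster whose centroid is nearest to the pair's offset,
        # then measure how far the projected hyponym lands from the hypernym.
        c = nearest(self.centroids, (y - x)[None, :])[0]
        return float(np.linalg.norm(np.append(x, 1.0) @ self.maps[c] - y))

    def is_a(self, x, y, delta=0.5):
        # delta is an assumed threshold; in practice it would be tuned on held-out pairs.
        return self.distance(x, y) < delta


# Synthetic stand-in for word embeddings: two groups of pairs, each generated
# by its own affine map, so a single global projection would fit poorly.
rng = np.random.default_rng(1)
d = 4
X = rng.normal(size=(200, d))
Y = np.empty_like(X)
for start, shift in [(0, 5.0), (100, -5.0)]:
    A = np.eye(d) + 0.1 * rng.normal(size=(d, d))
    b = np.zeros(d)
    b[0] = shift
    Y[start:start + 100] = X[start:start + 100] @ A + b

model = PiecewiseProjection(k=2).fit(X, Y)
print(model.is_a(X[0], Y[0]), model.is_a(X[0], Y[0] + 3.0))
```

With real data, X and Y would be embeddings of hyponym and hypernym terms drawn from an existing taxonomy, and the number of clusters k would be chosen on a development set.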

Notes

Baidu Baike (http://baike.baidu.com/) is one of the largest online encyclopedia websites in China. The example is taken from the online version of Baidu Baike in June 2016.
http://www.ltp-cloud.com/download/
In practice, there can be over two candidate hyponyms in “Such-As” and “Co-Hyponym” patterns. For simplicity, we only list two here, denoted as \(x_i\) and \(x_j\).
http://nlpchina.github.io/ansj_seg/

References

Cai J, Utiyama M, Sumita E, Zhang Y (2014) Dependency-based pre-ordering for chinese-english machine translation. In: Proceedings of the 52nd annual meeting of the association for computational linguistics, pp 155–160
Carlson A, Betteridge J, Kisiel B, Settles B, Hruschka ER Jr, Mitchell TM (2010) Toward an architecture for never-ending language learning. In: Proceedings of the twenty-fourth AAAI conference on artificial intelligence
Carlson A, Betteridge J, Wang RC, Hruschka ER Jr, Mitchell TM (2010) Coupled semi-supervised learning for information extraction. In: Proceedings of the third international conference on web search and web data mining, pp 101–110
de Melo G, Weikum G (2014) Taxonomic data integration from multilingual wikipedia editions. Knowl Inf Syst 39(1):1–39
Diaz F, Mitra B, Craswell N (2016) Query expansion with locally-trained word embeddings. In: Proceedings of the 54th annual meeting of the association for computational linguistics
Dong X, Gabrilovich E, Heitz G, Horn W, Lao N, Murphy K, Strohmann T, Sun S, Zhang W (2014) Knowledge vault: a web-scale approach to probabilistic knowledge fusion. In: 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 601–610
Dong Z, Dong Q, Hao C (2010) Hownet and its computation of meaning. In: Proceedings of the 23rd International Conference on Computational Linguistics, Demonstrations Volume, pp 53–56
Etzioni O, Fader A, Christensen J, Soderland S, Mausam M (2011) Open information extraction: The second generation. In: Proceedings of the 22nd international joint conference on artificial intelligence, pp 3–10
Fu R, Guo J, Qin B, Che W, Wang H, Liu T (2014) Learning semantic hierarchies via word embeddings. In: Proceedings of the 52nd annual meeting of the association for computational linguistics, pp 1199–1209
Fu R, Qin B, Liu T (2013) Exploiting multiple sources for open-domain hypernym discovery. In: Proceedings of the 2013 conference on empirical methods in natural language processing, pp 1224–1234
Hearst MA (1992) Automatic acquisition of hyponyms from large text corpora. In: Proceedings of the 14th international conference on computational linguistics, pp 539–545
Hua W, Wang Z, Wang H, Zheng K, Zhou X (2015) Short text understanding through lexical-semantic analysis. In: 31st IEEE international conference on data engineering, pp 495–506
Khuller S, Moss A, Naor J (1999) The budgeted maximum coverage problem. Inf Process Lett 70(1):39–45
Kotlerman L, Dagan I, Szpektor I, Zhitomirsky-Geffet M (2010) Directional distributional similarity for lexical inference. Nat Lang Eng 16(4):359–389
Lehmann J, Isele R, Jakob M, Jentzsch A, Kontokostas D, Mendes PN, Hellmann S, Morsey M, van Kleef P, Auer S, Bizer C (2015) Dbpedia—a large-scale, multilingual knowledge base extracted from wikipedia. Semant Web 6(2):167–195
Lenci A, Benotto G (2012) Identifying hypernyms in distributional semantic spaces. In: Proceedings of the sixth international workshop on semantic evaluation, pp 543–546
Li H-G, Wu X, Li Z, Wu G (2013) A relation extraction method of chinese named entities based on location and semantic features. Appl Intell 38(1):1–15
Li J, Wang C, He X, Zhang R, Gao M (2015) User generated content oriented chinese taxonomy construction. In: Web technologies and applications—17th Asia-Pacific web conference, pp 623–634
Li PP, Wang H, Zhu KQ, Wang Z, Wu X (2013) Computing term similarity by large probabilistic isa knowledge. In: Proceedings of 22nd ACM international conference on information and knowledge management, pp 1401–1410
Lin T, Mausam, Etzioni O (2012) No noun phrase left behind: Detecting and typing unlinkable entities. In: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning, pp 893–903
Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv:1301.3781
Mikolov T, Yih W, Zweig G (2013) Linguistic regularities in continuous space word representations. In: Human language technologies: conference of the North American chapter of the association of computational linguistics, pp 746–751
Miller GA (1995) Wordnet: a lexical database for english. Commun ACM 38(11):39–41
Pennington J, Socher R, Manning CD (2014) Glove: global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing, pp 1532–1543
Ponzetto SP, Strube M (2007) Deriving a large-scale taxonomy from wikipedia. In: Proceedings of the twenty-second AAAI conference on artificial intelligence, pp 1440–1445
Snow R, Jurafsky D, Ng AY (2004) Learning syntactic patterns for automatic hypernym discovery. In: Advances in neural information processing systems 17, NIPS 2004, pp 1297–1304
Suchanek FM, Kasneci G, Weikum G (2007) Yago: a core of semantic knowledge. In: Proceedings of the 16th international conference on world wide web, pp 697–706
Tomás D, González JLV (2013) Minimally supervised question classification on fine-grained taxonomies. Knowl Inf Syst 36(2):303–334
Wang C, Gao M, He X, Zhang R (2015) Challenges in chinese knowledge graph construction. In: 31st IEEE international conference on data engineering workshops, pp 59–61
Wang C, He X (2016) Chinese hypernym-hyponym extraction from user generated categories. In: Proceedings of the 26th international conference on computational linguistics, pp 1350–1361
Wang Z, Li J, Li S, Li M, Tang J, Zhang K, Zhang K (2014) Cross-lingual knowledge validation based taxonomy derivation from heterogeneous online wikis. In: Proceedings of the twenty-eighth AAAI conference on artificial intelligence, pp 180–186
Wong MK, Abidi SSR, Jonsen ID (2014) A multi-phase correlation search framework for mining non-taxonomic relations from unstructured text. Knowl Inf Syst 38(3):641–667
Wu W, Li H, Wang H, Zhu KQ (2012) Probase: a probabilistic taxonomy for text understanding. In: Proceedings of the ACM SIGMOD international conference on management of data, pp 481–492
Yang MC, Duan N, Zhou M, Rim HC (2014) Joint relational embeddings for knowledge-based question answering. In: Proceedings of the 2014 conference on empirical methods in natural language processing, pp 645–650
Yu Z, Wang H, Lin X, Wang M (2015) Learning term embeddings for hypernymy identification. In: Proceedings of the twenty-fourth international joint conference on artificial intelligence, pp 1390–1397
Zhang J, Liu S, Li M, Zhou M, Zong C (2014) Bilingually-constrained phrase embeddings for machine translation. In: Proceedings of the 52nd annual meeting of the association for computational linguistics, pp 111–121
Zhou G, Zhu Z, He T, Hu XT (2016) Cross-lingual sentiment classification with stacked autoencoders. Knowl Inf Syst 47(1):27–44
Zhou H, Chen L, Shi F, Huang D (2015) Learning bilingual sentiment word embeddings for cross-language sentiment classification. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing of the asian federation of natural language processing, pp 430–440

Acknowledgements

We thank anonymous reviewers for their very useful comments and suggestions. This work is supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000904. Chengyu Wang is partially supported by the ECNU Outstanding Doctoral Dissertation Cultivation Plan of Action under Grant No. YB2016040. This manuscript is an extended version of the paper “Chinese Hypernym-Hyponym Extraction from User Generated Categories” presented at COLING 2016 [30]. The Chinese taxonomy construction technique is based on our previous work, which was presented at APWeb 2015, entitled “User Generated Content Oriented Chinese Taxonomy Construction” [18].

Author information

Authors and Affiliations

School of Computer Science and Software Engineering, East China Normal University, Shanghai, 200062, China
Chengyu Wang, Yan Fan & Xiaofeng He
School of Data Science and Engineering, East China Normal University, Shanghai, 200062, China
Aoying Zhou

Corresponding author

Correspondence to Xiaofeng He.

About this article

Cite this article

Wang, C., Fan, Y., He, X. et al. Predicting hypernym–hyponym relations for Chinese taxonomy learning. Knowl Inf Syst 58, 585–610 (2019). https://doi.org/10.1007/s10115-018-1166-1

Received: 27 December 2016
Revised: 13 November 2017
Accepted: 30 January 2018
Published: 10 February 2018
Issue Date: 05 March 2019
DOI: https://doi.org/10.1007/s10115-018-1166-1
