DOI: 10.1145/3485447.3511943
research-article

QEN: Applicable Taxonomy Completion via Evaluating Full Taxonomic Relations

Published: 25 April 2022

Abstract

A taxonomy is a fundamental type of knowledge graph for a wide range of web applications, such as search and recommendation systems. To keep a taxonomy automatically updated with the latest concepts, the taxonomy completion task finds a suitable hypernym-hyponym pair in the original taxonomy to serve as the parent and child of the new concept. Previous solutions use term embeddings as input and evaluate only the parent-child relations between the new concept and the hypernym-hyponym pair. Such methods ignore the important sibling relations and are not applicable in practice, since term embeddings are not available for the latest concepts. They also suffer from the relational noise of the “pseudo-leaf” node, a null node that acts as a node’s hyponym so that the new concept can be inserted as a leaf. To address these drawbacks, we propose the Quadruple Evaluation Network (QEN), a novel taxonomy completion framework that takes easily accessible term descriptions as input and applies a pretrained language model and code attention for accurate inference while reducing online computation. QEN evaluates both parent-child and sibling relations, which both improves accuracy and reduces the noise introduced by the pseudo-leaf. Extensive experiments on three real-world datasets, covering different domains, sizes, and term description sources, demonstrate the effectiveness and robustness of QEN in overall performance and especially in adding non-leaf nodes, where it largely surpasses previous methods and achieves a new state of the art for the task.
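To make the task setup in the abstract concrete, the following is a minimal sketch of how candidate insertion positions and the quadruples to be evaluated could be enumerated. It is not the authors' implementation: the toy taxonomy, the names `candidate_positions`, `siblings_at`, and `quadruples`, the restriction to direct parent-child edges, and the choice to treat the parent's other children as siblings are all assumptions made for illustration. No scoring model is included.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical toy taxonomy: each term maps to its direct hyponyms (children).
TAXONOMY: Dict[str, List[str]] = {
    "animal": ["mammal", "bird"],
    "mammal": ["dog", "cat"],
    "bird": [],
    "dog": [],
    "cat": [],
}

# None stands in for the "pseudo-leaf": a null child that lets the new
# concept be attached as a leaf under the chosen parent.
PSEUDO_LEAF: Optional[str] = None

Position = Tuple[str, Optional[str]]                        # (parent, child)
Quadruple = Tuple[str, str, Optional[str], Optional[str]]   # (parent, query, child, sibling)


def candidate_positions(taxonomy: Dict[str, List[str]]) -> List[Position]:
    """List (parent, child) slots where a new concept could be inserted.

    Simplifying assumption: only direct edges are used, plus one
    pseudo-leaf slot per node.
    """
    positions: List[Position] = []
    for parent, children in taxonomy.items():
        positions.append((parent, PSEUDO_LEAF))
        for child in children:
            positions.append((parent, child))
    return positions


def siblings_at(taxonomy: Dict[str, List[str]], position: Position) -> List[str]:
    """Children of the parent that would sit beside the new concept.

    The candidate child is excluded because it would move below the new concept.
    """
    parent, child = position
    return [c for c in taxonomy[parent] if c != child]


def quadruples(query: str, taxonomy: Dict[str, List[str]]) -> Iterator[Quadruple]:
    """Yield (parent, query, child, sibling) tuples for a model to score."""
    for position in candidate_positions(taxonomy):
        parent, child = position
        # Use None as the sibling when the parent has no other children.
        for sibling in siblings_at(taxonomy, position) or [None]:
            yield (parent, query, child, sibling)


if __name__ == "__main__":
    for quad in quadruples("wolf", TAXONOMY):
        print(quad)
```

In this sketch, a position such as `("mammal", None)` corresponds to adding the query as a new leaf under "mammal", while `("mammal", "dog")` corresponds to inserting it between the two existing terms. Pairing each position with a sibling gives the model evidence about the query's neighbours even when the child slot is the null pseudo-leaf.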

          Published In

          WWW '22: Proceedings of the ACM Web Conference 2022
          April 2022
          3764 pages
          ISBN:9781450390965
          DOI:10.1145/3485447
          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Author Tags

          1. Self-supervised Learning
          2. Taxonomic Relations
          3. Taxonomy Completion

          Qualifiers

          • Research-article
          • Research
          • Refereed limited

          Funding Sources

          • NSERC

          Conference

          WWW '22
          WWW '22: The ACM Web Conference 2022
          April 25 - 29, 2022
          Virtual Event, Lyon, France

          Acceptance Rates

          Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

          Article Metrics

          • Downloads (last 12 months): 104
          • Downloads (last 6 weeks): 11
          Reflects downloads up to 08 Mar 2025

          Cited By

          • Taxonomy Completion via Implicit Concept Insertion. Proceedings of the ACM Web Conference 2024 (2024), 2159-2169. DOI: 10.1145/3589334.3645584. Online publication date: 13-May-2024
          • A Language Model Based Framework for New Concept Placement in Ontologies. The Semantic Web (2024), 79-99. DOI: 10.1007/978-3-031-60626-7_5. Online publication date: 26-May-2024
          • Ontology Enrichment from Texts: A Biomedical Dataset for Concept Discovery and Placement. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (2023), 5316-5320. DOI: 10.1145/3583780.3615126. Online publication date: 21-Oct-2023
          • A Single Vector Is Not Enough: Taxonomy Expansion via Box Embeddings. Proceedings of the ACM Web Conference 2023 (2023), 2467-2476. DOI: 10.1145/3543507.3583310. Online publication date: 30-Apr-2023
