Exploring Word-Sememe Graph-Centric Chinese Antonym Detection

  • Conference paper
Machine Learning and Knowledge Discovery in Databases: Research Track (ECML PKDD 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14171)

Abstract

Antonym detection is a vital task in NLP systems. Pattern-based methods, a typical solution, recognize semantic relationships between words using predefined patterns but have limited performance. Distributed word embeddings often struggle to distinguish antonyms from synonyms because their representations rely on local co-occurrences, and both kinds of word pairs appear in similar contexts. Given the ambiguity of Chinese and the contradictory nature of antonyms, Chinese antonym detection faces unique challenges. In this paper, we propose a word-sememe graph that integrates relationships between sememes and Chinese words, organized as a 4-partite graph. We design a heuristic sememe relevance computation as a supplementary measure and develop a relation inference scheme that uses related sememes as taxonomic information to leverage relational transitivity. The 4-partite graph can be extended based on this scheme. We introduce the Relation Discriminated Learning based on Sememe Attention (RDLSA) model, which employs three attention strategies on sememes to learn flexible entity representations. Antonym relations are then detected with a link prediction approach over these embeddings. Our method outperforms the baselines in Triple Classification and Chinese Antonym Detection. Experimental results show that linguistic sememes reduce ambiguity and improve antonym detection. A quantitative ablation analysis further confirms our scheme's effectiveness in capturing antonyms.
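The general idea of detecting antonyms by link prediction over sememe-enriched embeddings can be illustrated with a small sketch. This is not the RDLSA model: the abstract does not specify the three attention strategies or the scoring function, so the sketch assumes a single dot-product attention over sememe vectors and swaps in a TransE-style distance as the triple score. All names (`attend`, `transe_score`, `is_antonym`), the toy vectors, and the threshold are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(word_vec, sememe_vecs):
    """Assumed attention: weight each sememe by its dot product with the
    word vector, then add the weighted sememe average to the word vector."""
    weights = softmax([dot(word_vec, s) for s in sememe_vecs])
    pooled = [sum(w * s[i] for w, s in zip(weights, sememe_vecs))
              for i in range(len(word_vec))]
    return [x + p for x, p in zip(word_vec, pooled)]

def transe_score(head, relation, tail):
    """TransE-style L1 distance ||h + r - t||; lower means more plausible."""
    return sum(abs(h + r - t) for h, r, t in zip(head, relation, tail))

def is_antonym(head, tail, relation, threshold):
    """Triple classification: accept (head, antonym, tail) if the score
    falls at or below a threshold (hypothetical, normally tuned on dev data)."""
    return transe_score(head, relation, tail) <= threshold

# Toy 2-d vectors, invented for illustration only.
hot_repr = attend([1.0, 0.0], [[1.0, 0.0]])    # word "hot" with one sememe
cold_repr = attend([-1.0, 0.0], [[-1.0, 0.0]])  # word "cold" with one sememe
warm_repr = attend([0.8, 0.1], [[1.0, 0.0]])    # word "warm" shares hot's sememe
ANTONYM = [-4.0, 0.0]                           # assumed relation embedding

print(is_antonym(hot_repr, cold_repr, ANTONYM, threshold=1.0))
print(is_antonym(hot_repr, warm_repr, ANTONYM, threshold=1.0))
```

In this toy setting the sememe vectors push "hot" and "cold" further apart along the same axis, which is the kind of separation that co-occurrence embeddings alone tend not to provide.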

Notes

  1. Code and data: https://github.com/CGCL-codes/RDLSA

  2. https://github.com/thunlp/OpenHowNet

  3. https://en.wikipedia.org/wiki/Jaro-Winkler_distance

  4. https://github.com/chatopera/Synonyms

  5. https://dict.baidu.com/

  6. https://zh.m.wiktionary.org/wiki/

  7. https://www.zdic.net/

  8. http://www.nlpir.org/wordpress/download/tc-corpus-answer.rar

  9. http://thuctc.thunlp.org/

  10. https://github.com/aceimnorstuvwxz/TouTiao-text-classfication-dataset

  11. https://www.heywhale.com/

  12. https://drive.google.com/drive/folders/1qpLb-52DDEiYDmvpnlYgVfKbRlf428Mq?usp=sharing

Acknowledgment

This research is supported by the National Natural Science Foundation of China under Grant Nos. 61932004 and 62072205.

Author information

Corresponding author

Correspondence to Hai Jin.

Ethics declarations

Ethical Issues

Following the guidelines from General ethical issues in Machine Learning (https://www.w3.org/TR/webmachinelearning-ethics/#general-ethical-issues-in-machine-learning), we briefly describe our ethical considerations.

Our work mainly focuses on semantic mining in the Chinese domain and raises no concerns regarding bias, fairness, security, privacy, environmental impact, or discrimination against any group or collective. Our method incorporates external professional linguistic knowledge and offers good transparency and interpretability.

Our data is sourced from publicly available internet resources, such as GitHub and other data released under open-source licenses, and all sources are cited in the paper. Our data does not involve any personal privacy or inference of personal information. Our work is dedicated to researching the potential semantic relationships between words in Chinese and does not have any police or military applications.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, Z., Yuan, P., Jin, H. (2023). Exploring Word-Sememe Graph-Centric Chinese Antonym Detection. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14171. Springer, Cham. https://doi.org/10.1007/978-3-031-43418-1_35

  • DOI: https://doi.org/10.1007/978-3-031-43418-1_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43417-4

  • Online ISBN: 978-3-031-43418-1

  • eBook Packages: Computer Science, Computer Science (R0)
