
Exploring Chinese word embedding with similar context and reinforcement learning

  • Original Article
  • Neural Computing and Applications

Abstract

Chinese word embedding has attracted considerable attention in the field of natural language processing. Existing methods model the relation between a target word and its neighbouring context words. However, because Chinese text frequently contains neighbouring words that are irrelevant to the target word, these methods are limited in capturing and understanding the semantics of Chinese words. In this study, we designed sc2vec, which explores Chinese word embeddings by proposing a similar context that reduces the influence of irrelevant neighbours and captures the relevant semantics of Chinese words. To enhance the learning architecture, sc2vec is modelled with reinforcement learning, regarding the continuous bag-of-words and skip-gram models as the two actions of an agent over a corpus, so as to generate high-quality Chinese word embeddings. The results on word analogy, word similarity, named entity recognition, and text classification tasks demonstrate that the proposed model outperforms most state-of-the-art approaches.
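As one concrete reading of this two-action formulation, the sketch below shows a REINFORCE-style agent that, at each training step, picks either a CBOW or a skip-gram update over shared embedding tables. Everything beyond the abstract's statement (the policy network, the state summary, the negative-loss reward, and all hyper-parameters) is an illustrative assumption, not the authors' sc2vec implementation.

```python
# Minimal, hypothetical sketch of the two-action idea in the abstract:
# a REINFORCE-style agent chooses, per batch, between a CBOW update and a
# skip-gram update over shared embedding tables. Policy network, state,
# reward (negative training loss), and hyper-parameters are illustrative
# assumptions, not the authors' sc2vec implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM, WINDOW, BATCH = 5000, 100, 2, 32

class Word2Vec(nn.Module):
    """Embedding tables shared by both actions."""
    def __init__(self):
        super().__init__()
        self.in_emb = nn.Embedding(VOCAB, DIM)
        self.out_emb = nn.Embedding(VOCAB, DIM)

    def cbow_loss(self, context, target):
        # CBOW: predict the target word from the averaged context embeddings.
        h = self.in_emb(context).mean(dim=1)                     # (B, DIM)
        return F.cross_entropy(h @ self.out_emb.weight.t(), target)

    def skipgram_loss(self, context, target):
        # Skip-gram: predict each context word from the target word.
        logits = self.in_emb(target) @ self.out_emb.weight.t()   # (B, VOCAB)
        return torch.stack([F.cross_entropy(logits, context[:, j])
                            for j in range(context.size(1))]).mean()

model = Word2Vec()
policy = nn.Linear(DIM, 2)                                       # 2 actions
opt = torch.optim.Adam(list(model.parameters()) + list(policy.parameters()),
                       lr=1e-3)

for step in range(100):                                # toy random "corpus"
    context = torch.randint(VOCAB, (BATCH, 2 * WINDOW))
    target = torch.randint(VOCAB, (BATCH,))

    # State: mean context embedding (an assumed, simple state summary).
    state = model.in_emb(context).mean(dim=(0, 1)).detach()
    probs = torch.softmax(policy(state), dim=-1)
    action = torch.multinomial(probs, 1).item()        # 0: CBOW, 1: skip-gram

    loss = (model.cbow_loss if action == 0 else model.skipgram_loss)(context, target)

    # REINFORCE: reward = -loss.detach(), so the policy term to minimise is
    # -reward * log pi(action) = loss.detach() * log pi(action).
    rl_loss = loss.detach() * torch.log(probs[action])
    opt.zero_grad()
    (loss + rl_loss).backward()
    opt.step()
```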


Data availability

The text data used to support the findings of this study are available at http://www.sogou.com/labs/resource/ca.php.

Code availability

The code is written in Python with PyTorch and is planned for release after the paper is accepted.

Notes

  1. https://en.wikipedia.org/wiki/Chinese_language.

  2. https://en.wikipedia.org/wiki/Xinhua_Zidian.

  3. For any constant baseline \(b\), \(\begin{aligned} E(b\,\nabla \log p_{\theta }(\tau )) &= \sum \limits _{\tau } p_{\theta }(\tau )\,\nabla \log p_{\theta }(\tau )\,b \\ &= \sum \limits _{\tau } p_{\theta }(\tau )\,\frac{\nabla p_{\theta }(\tau )}{p_{\theta }(\tau )}\,b \\ &= \sum \limits _{\tau } \nabla p_{\theta }(\tau )\,b \\ &= \nabla _{\theta }\Big (\sum \limits _{\tau } p_{\theta }(\tau )\Big )\,b = (\nabla _{\theta }\,1)\,b = 0, \end{aligned}\) since \(\sum \nolimits _{\tau } p_{\theta }(\tau )=1\).

  4. By setting \(F=(R\left( \tau ^n\right) -b)\nabla \log p_\theta \left( \tau \right)\), its variance is \(\mathrm {Var}\left( F\right) =E\left[ \left( F-E(F)\right) ^2\right] =E(F^2)-\left( E(F)\right) ^2\). We want to minimise this variance; thus, \(\frac{\partial \mathrm {Var}(F)}{\partial b}=0\). As \(\left( E(F)\right) ^2\) does not depend on \(b\) (by Note 3, \(E(F)=E\left( R(\tau ^n)\nabla \log p_\theta (\tau )\right)\)), we have \(\frac{\partial \mathrm {Var}(F)}{\partial b}=2E\left( F\frac{\partial F}{\partial b}\right) =0\), from which \(b\) is obtained as in Eq. (11). A numerical sanity check of Notes 3 and 4 is given after this list.

  5. http://www.sogou.com/labs/resource/ca.php.

  6. https://github.com/fxsjy/jieba.

  7. https://github.com/BYVoid/OpenCC.

  8. https://github.com/yzwww2019/Sighan-2006-NER-dataset.

  9. https://github.com/yzwww2019/Fudan-corpus.
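The following short numerical check (not from the paper) illustrates Notes 3 and 4 on a toy categorical distribution over three "trajectories": the baseline term has zero expectation, and the baseline solving \(E\left( F\frac{\partial F}{\partial b}\right) =0\), namely \(b = E[g^2 R]/E[g^2]\) with \(g = \nabla _\theta \log p_\theta (\tau )\), reduces the variance relative to \(b=0\). Whether this closed form matches the paper's Eq. (11) exactly cannot be verified from this page, so treat it as an assumption.

```python
# Numerical sanity check (not from the paper) of Notes 3 and 4 on a toy
# categorical distribution over three "trajectories". It confirms that the
# baseline term has zero expectation, and that b = E[g^2 R] / E[g^2], with
# g = d log p(tau) / d theta, reduces Var(F) relative to b = 0.
import torch

theta = torch.tensor([0.5, -0.2, 0.1], requires_grad=True)
probs = torch.softmax(theta, dim=0)        # p_theta(tau) over 3 trajectories
p = probs.detach()
R = torch.tensor([1.0, 3.0, -2.0])         # arbitrary returns R(tau)

# Score of each trajectory w.r.t. the first parameter coordinate.
g = torch.stack([
    torch.autograd.grad(torch.log(probs[i]), theta, retain_graph=True)[0][0]
    for i in range(3)
])

# Note 3: E[b * g] = b * sum_i p_i g_i = 0 for any constant baseline b.
print((p * g).sum().item())                # ~0 (up to float error)

# Note 4: variance of F = (R - b) g under p_theta, for two baselines.
def var_F(b):
    F = (R - b) * g
    mean = (p * F).sum()
    return (p * (F - mean) ** 2).sum().item()

b_opt = (p * g**2 * R).sum() / (p * g**2).sum()
print(var_F(0.0), var_F(b_opt))            # variance with b_opt is smaller
```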


Acknowledgements

This research was supported in part by the National Key R&D Program of China under grants 2017YFC1703905 and 2018YFC1704105, the Natural Science Foundation of Sichuan Province under grant 2022NSFSC0958, the Sichuan Science and Technology Program under grants 2020YFS0372 and 2020YFS0302, and the Fundamental Research Funds for the Central Universities under grant ZYGX2021YGLH012. We would like to thank Editage (www.editage.cn) for English language editing.

Author information


Contributions

YZ and YL performed conceptualization; YZ developed the methodology; DL and SZ performed formal analysis and investigation; YZ wrote the original draft; YL, DL, and SZ contributed to review and editing; YL acquired funding; YZ provided resources; YL, DL, and SZ supervised the study.

Corresponding author

Correspondence to Yongguo Liu.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, Y., Liu, Y., Li, D. et al. Exploring Chinese word embedding with similar context and reinforcement learning. Neural Comput & Applic 34, 22287–22302 (2022). https://doi.org/10.1007/s00521-022-07672-w
