Abstract
Widely used word embedding techniques represent each word with a fixed vector, with no notion of context or dynamics. This paper proposes a deep neural network, CoDyWor, that models the context of words so that the same word receives different vector representations in different contexts. First, each layer of the model captures contextual information about each word of the input sentence from a different angle, such as syntactic or semantic information. Then, a multi-layer attention mechanism assigns a different weight to each layer of the model. Finally, the information from all layers is integrated into a dynamic, context-dependent word vector. Comparing against other models on public datasets, the proposed model improves accuracy on the logical reasoning task by 2.0%, F1 on the named entity recognition task by 0.47%, and F1 on the reading comprehension task by 2.96%. The experimental results demonstrate that this word representation technique improves on existing word representations.
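The core mechanism the abstract describes, learning a weight for each layer's contextual output and summing the layers into one dynamic word vector, can be sketched as follows. This is not the authors' implementation but a minimal ELMo-style scalar mix written in PyTorch under assumed shapes; the names LayerAttentionMix, layer_logits, and gamma are illustrative only.

# Minimal sketch (not the authors' code): combine per-layer contextual
# representations into a single dynamic word vector via learned layer weights.
import torch
import torch.nn as nn

class LayerAttentionMix(nn.Module):
    """Weights L layer outputs (each [batch, seq_len, dim]) and sums them."""
    def __init__(self, num_layers: int):
        super().__init__()
        # One scalar logit per layer, normalized with softmax in forward().
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))
        # Global scale applied to the mixed representation.
        self.gamma = nn.Parameter(torch.ones(1))

    def forward(self, layer_outputs: list) -> torch.Tensor:
        weights = torch.softmax(self.layer_logits, dim=0)         # [L]
        stacked = torch.stack(layer_outputs, dim=0)               # [L, B, T, D]
        mixed = (weights.view(-1, 1, 1, 1) * stacked).sum(dim=0)  # [B, T, D]
        return self.gamma * mixed

# Usage: three hypothetical encoder layers, each capturing different
# (e.g. syntactic vs. semantic) information about the same tokens.
if __name__ == "__main__":
    batch, seq_len, dim = 2, 5, 8
    layers = [torch.randn(batch, seq_len, dim) for _ in range(3)]
    mix = LayerAttentionMix(num_layers=3)
    dynamic_vectors = mix(layers)  # one context-dependent vector per token
    print(dynamic_vectors.shape)   # torch.Size([2, 5, 8])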
Acknowledgements
The work was partially supported by the China Postdoctoral Science Foundation under Grant No. 2019M653400; the Sichuan Science and Technology Program under Grant Nos. 2018GZ0253, 2019YFS0236, 2018GZ0182, 2018GZ0093 and 2018GZDZX0039.
Cite this paper
Yuan, X., Xiong, X., Ju, S., Xie, Z., Wang, J.: A dynamic word representation model based on deep context. In: Tang, J., Kan, M.-Y., Zhao, D., Li, S., Zan, H. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol. 11839. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32236-6_60