
Incorporating Named Entity Information into Neural Machine Translation

  • Conference paper
  • In: Natural Language Processing and Chinese Computing (NLPCC 2020)

Abstract

Most neural machine translation (NMT) models take subword-level sequences as input to address the problem of representing out-of-vocabulary words (OOVs). However, using subword units as input may omit information carried by larger text granularities, such as named entities, leading to a loss of important semantic information. In this paper, we propose a simple but effective method to incorporate named entity (NE) tag information into the Transformer translation system. The encoder of our proposed model takes both the subwords and the NE tags of source sentences as input. Furthermore, we introduce a novel entity-aligned attention mechanism to make full use of the chunk information carried by NE tags. The proposed approach can be easily integrated into the existing Transformer framework. Experimental results on two public translation tasks demonstrate that our method achieves significant improvements over the basic Transformer model and also outperforms existing competitive systems.
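The abstract describes an encoder that consumes subword tokens and their NE tags in parallel. As a rough illustration only (the paper's exact fusion and its entity-aligned attention are not reproduced here), the sketch below sums a subword embedding with an NE-tag embedding before a standard Transformer encoder; all layer sizes, the tag inventory, and the additive fusion are assumptions:

    # Minimal PyTorch sketch, not the authors' code: fuse subword and NE-tag
    # embeddings by addition and feed the result to a stock Transformer encoder.
    # Positional encodings are omitted for brevity.
    import torch
    import torch.nn as nn

    class NEAwareEncoder(nn.Module):
        def __init__(self, vocab_size, num_ne_tags, d_model=512, nhead=8, layers=6):
            super().__init__()
            self.subword_emb = nn.Embedding(vocab_size, d_model)
            self.ne_tag_emb = nn.Embedding(num_ne_tags, d_model)  # e.g. BIO-style tags
            block = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.encoder = nn.TransformerEncoder(block, layers)

        def forward(self, subword_ids, ne_tag_ids):
            # One NE tag per subword position; summing injects the chunk-level
            # signal into every token representation.
            x = self.subword_emb(subword_ids) + self.ne_tag_emb(ne_tag_ids)
            return self.encoder(x)

    # Toy usage: a batch of 2 sentences, 5 subwords each.
    enc = NEAwareEncoder(vocab_size=32000, num_ne_tags=9)
    subwords = torch.randint(0, 32000, (2, 5))
    ne_tags = torch.randint(0, 9, (2, 5))
    print(enc(subwords, ne_tags).shape)  # torch.Size([2, 5, 512])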


Notes

  1. http://www.statmt.org/wmt19/translation-task.html
  2. https://spacy.io/usage/linguistic-features#named-entities
  3. https://github.com/fxsjy/jieba
  4. https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/multi-bleu.perl
  5. https://github.com/Kyubyong/transformer
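Note 2 points to spaCy's named-entity recognizer, presumably the tool used to produce source-side NE tags (note 3's jieba would cover Chinese word segmentation). A minimal, hedged example of extracting per-token BIO-style tags with spaCy; this is standard spaCy usage, not a preprocessing pipeline confirmed by the paper:

    # Hypothetical preprocessing step: per-token NE tags from spaCy.
    # Requires: pip install spacy && python -m spacy download en_core_web_sm
    import spacy

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")
    for token in doc:
        # ent_iob_ is "B", "I", or "O"; ent_type_ is the label (ORG, GPE, ...)
        print(token.text, token.ent_iob_, token.ent_type_ or "-")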


Acknowledgments

This research was funded by the National Key Research and Development Program of China (Nos. 2016QY03D0604 and 2018YFC0830803) and the National Natural Science Foundation of China (Grant No. 61772337).

Author information

Correspondence to Kui Meng or Gongshen Liu.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhou, L., Lu, W., Zhou, J., Meng, K., Liu, G. (2020). Incorporating Named Entity Information into Neural Machine Translation. In: Zhu, X., Zhang, M., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2020. Lecture Notes in Computer Science, vol 12430. Springer, Cham. https://doi.org/10.1007/978-3-030-60450-9_31

  • DOI: https://doi.org/10.1007/978-3-030-60450-9_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-60449-3

  • Online ISBN: 978-3-030-60450-9

  • eBook Packages: Computer Science, Computer Science (R0)
