Nucleus Beam Search for Machine Translation Decoding

  • Conference paper
  • First Online:
Advanced Intelligent Computing Technology and Applications (ICIC 2023)

Abstract

Beam search is the most widely used decoding algorithm for machine translation. Its success, however, may be attributed to its inadvertent implementation of the Uniform Information Density (UID) hypothesis, which holds that humans prefer sentences whose information is distributed evenly across the linguistic signal, subject to grammatical constraints. This paper presents Nucleus Beam Search, a novel machine translation decoding algorithm aimed at achieving the UID objective. By combining nucleus filtering with beam search, our approach effectively expands the search space without violating the UID hypothesis, enabling the generation of longer and more comprehensive translations. Experimental results show that Nucleus Beam Search outperforms traditional decoding algorithms in BLEU, METEOR, ROUGE-L, and CIDEr scores. Nevertheless, our findings also suggest that information density is not the sole determinant of translation quality; beam width plays a significant role as well.
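The abstract describes combining nucleus (top-p) filtering with beam search. The sketch below is one plausible reading of that idea, not the authors' implementation: at each step the candidate vocabulary is first restricted to the nucleus (the smallest set of tokens whose cumulative probability reaches p), and ordinary beam expansion then proceeds over that restricted set. All function names and the toy per-step distributions are assumptions for illustration.

```python
import math

def nucleus_filter(probs, p=0.9):
    """Return the indices of the smallest set of tokens whose
    cumulative probability reaches p (the 'nucleus')."""
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    return set(kept)

def nucleus_beam_search(step_probs, beam_width=2, p=0.9):
    """Beam search over a fixed sequence of per-step distributions,
    expanding each hypothesis only with tokens inside the nucleus.
    Returns beams sorted by descending log-probability."""
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for probs in step_probs:
        allowed = nucleus_filter(probs, p)
        candidates = []
        for seq, score in beams:
            for tok in allowed:
                candidates.append((seq + [tok], score + math.log(probs[tok])))
        candidates.sort(key=lambda c: -c[1])
        beams = candidates[:beam_width]  # prune to the beam width
    return beams

# Toy example: two decoding steps over a 3-token vocabulary.
steps = [
    [0.6, 0.3, 0.1],    # step-1 distribution; nucleus at p=0.9 is {0, 1}
    [0.5, 0.45, 0.05],  # step-2 distribution; nucleus at p=0.9 is {0, 1}
]
best_seq, best_score = nucleus_beam_search(steps, beam_width=2, p=0.9)[0]
```

In a real decoder, `step_probs` would come from the model's next-token distribution conditioned on each hypothesis rather than from a fixed list; the abstract's point is that the nucleus cutoff keeps low-probability (high-surprisal) tokens out of the beam, in line with the UID objective, while the beam still explores multiple hypotheses.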



Acknowledgements

This work is supported by Sichuan Science and Technology Program (2022ZHCG0007), and the Natural Science Foundation of Sichuan Province (2022NSFSC0503).

Author information

Corresponding author

Correspondence to Zheng Chen.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chen, Z., Tao, R., Wang, Y. (2023). Nucleus Beam Search for Machine Translation Decoding. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science(), vol 14089. Springer, Singapore. https://doi.org/10.1007/978-981-99-4752-2_49

  • DOI: https://doi.org/10.1007/978-981-99-4752-2_49

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4751-5

  • Online ISBN: 978-981-99-4752-2

  • eBook Packages: Computer Science (R0)
