
A Reinforcement Learning Approach for Abductive Natural Language Generation

  • Conference paper
Neural Information Processing (ICONIP 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 13110)


Abstract

Teaching deep learning models commonsense knowledge is a crucial yet challenging step towards building human-level artificial intelligence. Abductive Commonsense Reasoning (\(\mathcal {ART}\)) is a benchmark that probes a model's ability to infer the most plausible explanation for a given context, which requires the model to use commonsense knowledge about the world. \(\mathcal {ART}\) consists of two datasets, \(\alpha \)NLG and \(\alpha \)NLI, which challenge models in generative and discriminative settings, respectively. Although both datasets test the same underlying ability, existing work tackles them independently. In this work, we address \(\alpha \)NLG in a teacher-student setting, drawing on a model fully trained on \(\alpha \)NLI that has already acquired adequate commonsense knowledge. We realize this intuition by representing the desired optimal generation model as an Energy-Based Model and training it with a reinforcement learning algorithm. Experimental results show that our model achieves state-of-the-art results on both automatic and human evaluation metrics, demonstrating its effectiveness and feasibility (code available at https://github.com/Huanghongru/commonsense-generation).
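The abstract only outlines the teacher-student idea at a high level. The sketch below is a minimal, illustrative toy (not the authors' code): a generator samples explanations and is updated with a REINFORCE-style policy gradient whose reward comes from a discriminator standing in for a model pre-trained on \(\alpha \)NLI. The classes `ToyGenerator` and `ToyTeacher`, all hyperparameters, and the reward-minus-baseline objective are assumptions for illustration; the paper's actual Energy-Based-Model formulation and training procedure are described in the full text.

```python
# Minimal sketch (assumptions throughout): RL training of a generator with a
# teacher discriminator as the reward model, in the spirit of the abstract.
import torch
import torch.nn as nn

VOCAB, HIDDEN, MAX_LEN = 100, 32, 8  # toy sizes, not the paper's settings

class ToyGenerator(nn.Module):
    """Placeholder for a pretrained seq2seq generator (hypothetical)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, VOCAB)

    def sample(self, batch_size):
        # Sample a hypothesis token by token, accumulating log-probabilities
        # so a policy-gradient loss can be computed afterwards.
        tokens = torch.zeros(batch_size, 1, dtype=torch.long)
        hidden, log_probs = None, []
        for _ in range(MAX_LEN):
            emb = self.embed(tokens[:, -1:])           # (B, 1, H)
            out, hidden = self.rnn(emb, hidden)        # (B, 1, H)
            dist = torch.distributions.Categorical(logits=self.out(out[:, -1]))
            tok = dist.sample()                        # (B,)
            log_probs.append(dist.log_prob(tok))
            tokens = torch.cat([tokens, tok.unsqueeze(1)], dim=1)
        return tokens[:, 1:], torch.stack(log_probs, dim=1).sum(dim=1)

class ToyTeacher(nn.Module):
    """Placeholder for an aNLI discriminator scoring hypothesis plausibility."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.score = nn.Linear(HIDDEN, 1)

    @torch.no_grad()
    def reward(self, hypotheses):
        pooled = self.embed(hypotheses).mean(dim=1)    # crude mean pooling
        return torch.sigmoid(self.score(pooled)).squeeze(-1)

generator, teacher = ToyGenerator(), ToyTeacher()
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-3)

for step in range(5):
    hyps, log_p = generator.sample(batch_size=4)
    reward = teacher.reward(hyps)                  # teacher's plausibility score
    baseline = reward.mean()                       # simple variance-reduction baseline
    loss = -((reward - baseline) * log_p).mean()   # REINFORCE policy gradient
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"step {step}: mean reward {reward.mean().item():.3f}")
```

In practice the toy modules would be replaced by pretrained models, and the simple baseline by whatever variance-reduction and energy-based machinery the paper actually uses; the sketch only shows how a discriminative \(\alpha \)NLI model can supply the reward signal for \(\alpha \)NLG generation.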


Notes

  1. We use \(a(\boldsymbol{x})\) for short in the rest of the paper.

  2. We use \(\phi (\boldsymbol{x})\) for short in the rest of the paper.

  3. https://leaderboard.allenai.org/anli/submissions/public.

Author information

Corresponding author

Correspondence to Hongru Huang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Huang, H. (2021). A Reinforcement Learning Approach for Abductive Natural Language Generation. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol 13110. Springer, Cham. https://doi.org/10.1007/978-3-030-92238-2_6

  • DOI: https://doi.org/10.1007/978-3-030-92238-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92237-5

  • Online ISBN: 978-3-030-92238-2

  • eBook Packages: Computer Science, Computer Science (R0)
