DOI: 10.1145/3486622.3494000
Short paper

Recycling Numeracy Data Augmentation with Symbolic Verification for Math Word Problem Solving

Published: 13 April 2022

Abstract

Most studies of automatic math word problem solving rely on a dataset to train a model that either transforms a question directly into the corresponding answer or translates the question into a sequence of operations forming a program from which the answer is derived. The program, serving as an intermediate symbolic form between the question and the answer, provides more information for the model to learn arithmetic reasoning. However, manually composing programs for numerous questions is labor-intensive, so only one medium-sized dataset, MathQA, is available. This work proposes a novel recycling numeracy data augmentation (RNDA) approach that automatically generates high-quality training instances in the MathQA style. Experimental results show that a model trained on the augmented data achieves state-of-the-art performance. We will release the dataset as a resource for the research community.
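The abstract describes MathQA-style operation programs and symbolic verification of automatically generated instances. The sketch below illustrates the general idea under stated assumptions: the operation set, the instance format, and the `verify` helper are hypothetical and not the authors' actual pipeline. An augmented (question, program, answer) triple is kept only if symbolically executing its program reproduces the annotated answer.

```python
# Hypothetical sketch: execute a MathQA-style linear operation program
# and keep an augmented instance only if the derived answer matches.
# Operation names and the instance format are illustrative assumptions.

OPS = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}

def execute(program, numbers):
    """Run a linear program; each argument references a number from the
    question (n0, n1, ...) or an earlier step's result (#0, #1, ...)."""
    results = []
    for op, args in program:
        vals = []
        for a in args:
            if a.startswith("n"):            # number from the question text
                vals.append(numbers[int(a[1:])])
            elif a.startswith("#"):          # result of an earlier operation
                vals.append(results[int(a[1:])])
            else:                            # literal constant
                vals.append(float(a))
        results.append(OPS[op](*vals))
    return results[-1]

def verify(instance, tol=1e-6):
    """Accept an augmented instance only if its program derives its answer."""
    derived = execute(instance["program"], instance["numbers"])
    return abs(derived - instance["answer"]) < tol

# e.g. "A shirt costs 12 dollars. What do 3 shirts cost after a
# 5-dollar coupon?" -> multiply(n0, n1); subtract(#0, n2)
inst = {
    "numbers": [12.0, 3.0, 5.0],
    "program": [("multiply", ["n0", "n1"]), ("subtract", ["#0", "n2"])],
    "answer": 31.0,
}
print(verify(inst))  # True: 12 * 3 - 5 == 31
```

Filtering on exact symbolic execution of this kind is what lets an augmentation pipeline discard malformed programs automatically instead of requiring manual annotation.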



Published In

WI-IAT '21: IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
December 2021
698 pages


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. data augmentation
  2. math word problem solving
  3. recycling

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

WI-IAT '21: IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
December 14-17, 2021
Melbourne, VIC, Australia
