Abstract
In this study, we propose template-based multitask generation (TM-generation), a novel model that improves problem-solving accuracy on the automatic mathematical word problem-solving task. In this task, a machine learning model must deduce the answer to a given problem by acquiring the numeric information the problem implies. To build a robust model that can sufficiently exploit this numeric information across diverse mathematical word problems, two challenges must be addressed: (1) filling in the missing world knowledge required to solve the given problem, and (2) understanding the implied relationships between numbers and variables. TM-generation addresses challenge (1) by building on the state-of-the-art language model ELECTRA, and challenge (2) with a proposed operator identification layer that models the relationships between numbers and variables. Our experiments on the MAWPS and Math23k datasets show state-of-the-art performance: 85.2% answer accuracy on MAWPS and 85.3% on Math23k.
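To make the template-based pipeline concrete, the following is a minimal sketch of the two mechanical steps shared by template-based solvers such as TM-generation: masking the numbers in the problem text with placeholder tokens, then instantiating a selected equation template with the extracted values. All function names and the hard-coded template are illustrative assumptions, not the authors' implementation; in the actual model, a trained classifier (here omitted) would select the template.

```python
import re

# Hypothetical sketch of a template-based word-problem pipeline
# (names and the fixed template are illustrative, not the authors' code).

def extract_numbers(problem: str):
    """Replace each number with a placeholder token N0, N1, ... and
    return the masked text together with the extracted values."""
    values = []

    def repl(match):
        values.append(float(match.group()))
        return f"N{len(values) - 1}"

    masked = re.sub(r"\d+(?:\.\d+)?", repl, problem)
    return masked, values

def solve_with_template(template: str, values):
    """A template is an arithmetic expression over the placeholders,
    e.g. 'N0 - N1'. Substitute values and evaluate."""
    expr = template
    for i, v in enumerate(values):
        expr = expr.replace(f"N{i}", str(v))
    return eval(expr)  # acceptable here: templates are controlled arithmetic

masked, values = extract_numbers("Tom had 5 apples and ate 2. How many are left?")
answer = solve_with_template("N0 - N1", values)  # -> 3.0
```

In the full model, the masked text is what the language model encodes, so the solver generalizes over the numeric values themselves and only has to learn which template (and which operator between which slots) the problem describes.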
References
Clark K, Luong MT, Le QV, Manning CD (2020) ELECTRA: pre-training text encoders as discriminators rather than generators. In: Proceedings of the International Conference on Learning Representations (ICLR 2020). https://openreview.net/pdf?id=r1xMH1BtvB
Bobrow DG (1964) Natural language input for a computer problem solving system. Technical report. USA
Charniak E (1969) Computer solution of calculus word problems. In: Proceedings of the 1st International Joint Conference on Artificial Intelligence. pp 303–316
Wang L, Wang Y, Cai D, Zhang D, Liu X (2018) Translating a math word problem to an expression tree. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Brussels, Belgium, pp 1064–1069. https://doi.org/10.18653/v1/D18-1132
Zhang D, Wang L, Zhang L, Dai BT, Shen HT (2019) The gap of semantic parsing: a survey on automatic math word problem solvers. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2019.2914054
Liu Q, Guan W, Li S, Kawahara D (2019) Tree-structured decoding for solving math word problems. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp 2370–2379
Wang L, Zhang D, Zhang J, Xu X, Gao L, Dai BT, Shen HT (2019) Template-based math word problem solvers with recursive neural networks. Proc AAAI Conf Artif Intell 33:7144–7151
Kushman N, Artzi Y, Zettlemoyer L, Barzilay R (2014) Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. 1: 271–281
Lee D, Gweon G (2020) Solving arithmetic word problems with a template based multi-task deep neural network (T-MTDNN). In: 2020 IEEE International Conference on Big Data and Smart Computing (BigComp), IEEE, pp 271–274
Koncel-Kedziorski R, Roy S, Amini A, Kushman N, Hajishirzi H (2016) MAWPS: a math word problem repository. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, San Diego, California. pp 1152–1157, https://doi.org/10.18653/v1/N16-1136
Wang Y, Liu X, Shi S (2017) Deep neural solver for math word problems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. pp 845–854
Huang D, Shi S, Lin CY, Yin J, Ma WY (2016) How well do computers solve math word problems? Large-scale dataset construction and evaluation. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 1: 887–896
Roy S, Roth D (2015) Solving general arithmetic word problems. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. pp 1743–1752
Zhou L, Dai S, Chen L (2015) Learn to solve algebra word problems using quadratic programming. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. pp 817–822
Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692
Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, Minneapolis, Minnesota. 1: 4171–4186. https://doi.org/10.18653/v1/N19-1423
Ki K, Lee D, Gweon G (2020) KoTAB: Korean template-based arithmetic solver with BERT. In: 2020 IEEE International Conference on Big Data and Smart Computing (BigComp), IEEE, pp 279–282
Wallace E, Wang Y, Li S, Singh S, Gardner M (2019) Do NLP models know numbers? Probing numeracy in embeddings. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, Hong Kong, China. pp 5307–5315. https://doi.org/10.18653/v1/D19-1534
Griffith K, Kalita J (2019) Solving arithmetic word problems automatically using transformer and unambiguous representations. In: 2019 International Conference on Computational Science and Computational Intelligence (CSCI), IEEE. pp 526–532
Hosseini MJ, Hajishirzi H, Etzioni O, Kushman N (2014) Learning to solve arithmetic word problems with verb categorization. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). pp 523–533
Roy S, Roth D (2018) Mapping to declarative knowledge for word problem solving. Trans Assoc Comput Linguist 6:159–172
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008
Zhang J, Lee RKW, Lim EP, Qin W, Wang L, Shao J, Sun Q (2020) Teacher-student networks with multiple decoders for solving math word problem. In: Bessiere C (ed) Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-20, International Joint Conferences on Artificial Intelligence Organization, pp 4011–4017. https://doi.org/10.24963/ijcai.2020/555
Meurer A, Smith CP, Paprocki M, Čertík O, Kirpichev SB, Rocklin M, Kumar A, Ivanov S, Moore JK, Singh S, Rathnayake T, Vig S, Granger BE, Muller RP, Bonazzi F, Gupta H, Vats S, Johansson F, Pedregosa F, Curry MJ, Terrel AR, Roučka V, Saboo A, Fernando I, Kulal S, Cimrman R, Scopatz A (2017) Sympy: symbolic computing in python. PeerJ Comput Sci 3:e103. https://doi.org/10.7717/peerj-cs.103
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Kopf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) PyTorch: an imperative style, high-performance deep learning library. In: Wallach H, Larochelle H, Beygelzimer A, d'Alché-Buc F, Fox E, Garnett R (eds) Advances in neural information processing systems. Curran Associates Inc., pp 8024–8035
Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: Proceedings of the International Conference on Learning Representations (ICLR 2015)
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2818–2826
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1C1C1010162).
Cite this article
Lee, D., Ki, K.S., Kim, B. et al. TM-generation model: a template-based method for automatically solving mathematical word problems. J Supercomput 77, 14583–14599 (2021). https://doi.org/10.1007/s11227-021-03855-9