A deep reinforcement learning agent for geometry online tutoring

Xiao, Ziyang; Zhang, Dongxiang

doi:10.1007/s10115-022-01804-3


Regular Paper
Published: 19 December 2022

Volume 65, pages 1611–1625, (2023)

Knowledge and Information Systems

Ziyang Xiao¹ &
Dongxiang Zhang¹


Abstract

In this paper, we apply deep reinforcement learning (DRL) to geometry reasoning and develop Dragon to facilitate online tutoring. Its success depends on a flexible data model that captures diverse concepts and heterogeneous relations, and on an effective DRL agent that generates near-optimal, human-readable solutions. We use proximal policy optimization (PPO) as the backbone DRL architecture, customized with an effective state representation and integrated with several optimization techniques, including an attention mechanism, action masking, data augmentation and curriculum learning. In our experimental study, we construct the largest-scale dataset of geometry problems to date, together with a knowledge base of 46 theorems. We implement various heuristic algorithms and DRL models as baselines for performance comparison. The results show that our agent achieves near-optimal solutions and outperforms multiple competitive baselines. To benefit the community, we open-source the dataset and implementation at https://github.com/AIEdu-xzy/geometry-solver.
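
The abstract names PPO and an action mask as components of the agent but gives no further detail in this preview. As a rough illustration of what these two terms usually mean, and not the authors' implementation, the sketch below shows a masked softmax over candidate actions and the per-sample PPO clipped surrogate. The function names, the example logits and the clipping value of 0.2 are assumptions made for this example; reading each action as "apply one theorem" is likewise only an assumed interpretation.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over logits, giving zero probability to masked-out actions."""
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    valid = [x for x, m in zip(logits, mask) if m]
    if not valid:
        raise ValueError("at least one action must be valid")
    peak = max(valid)  # subtract the largest valid logit for numerical stability
    exps = [math.exp(x - peak) if m else 0.0 for x, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def ppo_clipped_term(ratio, advantage, clip_eps=0.2):
    """Per-sample PPO surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped * advantage)


# Hypothetical state with four candidate actions, two of them inapplicable.
probs = masked_softmax([2.0, 1.0, 0.5, 3.0], [True, False, True, False])
```

In this sketch the mask keeps the policy from ever sampling an action that cannot be applied in the current state, and the clipping limits how far a single update can move the policy away from the one that collected the data.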



Acknowledgements

The work is supported by the National Key Research and Development Project of China (2022YFF0902000).

Funding

No funding was received for conducting this study.

Author information

Authors and Affiliations

College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang, China
Ziyang Xiao & Dongxiang Zhang

Authors

Ziyang Xiao
Dongxiang Zhang

Corresponding author

Correspondence to Dongxiang Zhang.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Fig. 7.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Xiao, Z., Zhang, D. A deep reinforcement learning agent for geometry online tutoring. Knowl Inf Syst 65, 1611–1625 (2023). https://doi.org/10.1007/s10115-022-01804-3


Received: 17 March 2022
Revised: 27 November 2022
Accepted: 03 December 2022
Published: 19 December 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s10115-022-01804-3

