Abstract
In this paper, we present an approach for adapting the sequencing of learning activities that relies on Q-learning, a reinforcement learning algorithm. Q-learning learns a sequencing policy that selects learning activities so as to improve students' knowledge states.
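As an illustration of the kind of policy learning described above, the following is a minimal sketch of the standard tabular Q-learning update with epsilon-greedy activity selection. It is not the paper's implementation: the state labels ("low", "medium"), the activity names, and the values of `ALPHA`, `GAMMA`, and `epsilon` are hypothetical placeholders chosen only for the example.

```python
import random
from collections import defaultdict

ALPHA = 0.1  # learning rate (assumed value, not from the paper)
GAMMA = 0.9  # discount factor (assumed value, not from the paper)


def q_update(q, state, action, reward, next_state, actions):
    """Apply one standard Q-learning update in place and return the new Q-value."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (td_target - q[(state, action)])
    return q[(state, action)]


def choose_activity(q, state, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical knowledge-state labels and learning activities.
activities = ["lesson", "exercise", "quiz"]
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0

# One observed transition: in state "low", "exercise" earned reward 1.0
# and moved the student to state "medium".
q_update(q, "low", "exercise", 1.0, "medium", activities)
best = choose_activity(q, "low", activities, epsilon=0.0)
```

With `epsilon=0.0` the selection is purely greedy, so after this single update the sketch picks the only activity that has received a positive value.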
We use the student knowledge state inferred by Bayesian Knowledge Tracing (BKT) at every testing activity to calculate the Q-learning reward: the more a Q-learning decision improves the student's knowledge state, the greater the reward. In addition, we propose a 3-step method that aims to ensure that the use of Q-learning is compliant with the education domain. It consists of first training Q-learning on simulated students to address the “cold start” problem of Q-learning.
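The reward idea above can be sketched with the standard BKT update. The abstract does not give the exact reward formula, so defining the reward as the change in the estimated mastery probability is an assumption made here for illustration; the parameter values (`p_slip`, `p_guess`, `p_learn`) and function names are likewise hypothetical.

```python
def bkt_update(p_known, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """Return the BKT mastery estimate after observing one answer.

    Standard BKT: condition on the observed answer with Bayes' rule,
    then apply the learning transition. Parameter values are illustrative.
    """
    if correct:
        num = p_known * (1.0 - p_slip)
        den = num + (1.0 - p_known) * p_guess
    else:
        num = p_known * p_slip
        den = num + (1.0 - p_known) * (1.0 - p_guess)
    posterior = num / den
    return posterior + (1.0 - posterior) * p_learn


def knowledge_gain_reward(p_before, p_after):
    """Assumed reward: the improvement in estimated mastery probability."""
    return p_after - p_before


p_before = 0.3
p_after_correct = bkt_update(p_before, correct=True)
p_after_wrong = bkt_update(p_before, correct=False)
reward_correct = knowledge_gain_reward(p_before, p_after_correct)
reward_wrong = knowledge_gain_reward(p_before, p_after_wrong)
```

Under these assumed parameters, a correct answer raises the mastery estimate and yields a positive reward, while an incorrect answer lowers it and yields a negative one.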
We present empirical results showing that the sequencing policy resulting from the 3-step method provides the intelligent tutoring system (ITS) with an efficient strategy for improving students' knowledge states.
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yessad, A. (2023). Combining Learner Model and Reinforcement Learning for Adaptive Sequencing of Learning Activities. In: Temperini, M., et al. Methodologies and Intelligent Systems for Technology Enhanced Learning, 12th International Conference. MIS4TEL 2022. Lecture Notes in Networks and Systems, vol 580. Springer, Cham. https://doi.org/10.1007/978-3-031-20617-7_13
DOI: https://doi.org/10.1007/978-3-031-20617-7_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20616-0
Online ISBN: 978-3-031-20617-7
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)