DOI:10.1145/3583780.3615060
research-article

Simulating Student Interactions with Two-stage Imitation Learning for Intelligent Educational Systems

Published: 21 October 2023

Abstract

The fundamental task of intelligent educational systems is to offer students adaptive learning services, such as exercise recommendation and computerized adaptive testing. In practice, however, optimizing the models behind these services is hampered by the difficulty of collecting high-quality interaction data. A student simulator is therefore of great value, since it can generate valid interactions that help optimize those models. Existing approaches have achieved success but generally suffer from exposure bias and overlook students' long-term intentions. To tackle these problems, we propose a novel Direct-Adversarial Imitation Student Simulator (DAISim) that formulates student simulation as a Markov Decision Process (MDP). This formulation unifies the simulator's workflow across training and generation, which alleviates both exposure bias and the single-step optimization problem. To capture the intentions underlying complex student interactions, we first propose a direct imitation strategy that mimics the interactions with a simple reward function. We then propose an adversarial imitation strategy that learns a rational distribution, with the reward given by a parameterized discriminator. Furthermore, we optimize the discriminator in adversarial imitation in a pairwise manner, and our theoretical analysis shows that the pairwise discriminator improves generation quality. Extensive experiments on real-world datasets demonstrate that DAISim simulates high-quality student interactions whose distribution is close to the real distribution, and that it can benefit several downstream services.
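
The abstract does not spell out the reward function or the discriminator objective, so the following is only a minimal sketch of what a two-stage setup of this kind might look like, not DAISim's actual formulation. It assumes a match-based reward for the direct stage (1 when the simulated response equals the logged one) and a logistic pairwise loss, -log sigmoid(D(real) - D(fake)), for the discriminator. The names direct_reward, pairwise_discriminator_loss, and softplus are hypothetical and introduced here for illustration.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def direct_reward(simulated_response: int, logged_response: int) -> float:
    # ASSUMPTION: the "simple reward" is modeled as an exact-match indicator.
    # The abstract does not define it; this is an illustrative stand-in.
    return 1.0 if simulated_response == logged_response else 0.0


def pairwise_discriminator_loss(real_scores, fake_scores) -> float:
    # ASSUMPTION: "pairwise" is modeled as comparing each real interaction's
    # score against a paired simulated interaction's score with a logistic
    # loss: -log sigmoid(real - fake) == softplus(fake - real).
    if len(real_scores) != len(fake_scores) or not real_scores:
        raise ValueError("need equally sized, non-empty score lists")
    losses = [softplus(f - r) for r, f in zip(real_scores, fake_scores)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Direct stage: the simulated answer (1 = correct) matches the log.
    print(direct_reward(1, 1))
    # Adversarial stage: the loss shrinks as real interactions are scored
    # higher than their simulated counterparts.
    print(pairwise_discriminator_loss([0.0, 0.0], [0.0, 0.0]))
    print(pairwise_discriminator_loss([2.0, 3.0], [0.0, -1.0]))
```

In this sketch the discriminator only needs to rank a real interaction above its simulated counterpart rather than assign each an absolute label, which is one common reading of a pairwise objective.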

Cited By

  • (2024) DRL-SRS: A Deep Reinforcement Learning Approach for Optimizing Spaced Repetition Scheduling. Applied Sciences 14:13 (5591). DOI:10.3390/app14135591. Online publication date: 27-Jun-2024
  • (2024) Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing. Proceedings of the ACM Web Conference 2024 (1395-1406). DOI:10.1145/3589334.3645440. Online publication date: 13-May-2024
  • (2024) A Review of Data Mining in Personalized Education: Current Trends and Future Prospects. Frontiers of Digital Education 1:1 (26-50). DOI:10.1007/s44366-024-0019-6. Online publication date: 2-Jul-2024

      Published In

      CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
      October 2023
      5508 pages
      ISBN:9798400701245
      DOI:10.1145/3583780

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. adaptive learning
      2. imitation learning
      3. student simulator

      Qualifiers

      • Research-article

      Funding Sources

      • the National Key Research and Development Program of China
      • the National Natural Science Foundation of China
      • the University Synergy Innovation Program of Anhui Province

      Conference

      CIKM '23

      Acceptance Rates

      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

      Article Metrics

      • Downloads (Last 12 months): 108
      • Downloads (Last 6 weeks): 16
      Reflects downloads up to 17 Jan 2025
