Research article · DOI: 10.1145/3340631.3394848

Improving Student-System Interaction Through Data-driven Explanations of Hierarchical Reinforcement Learning Induced Pedagogical Policies

Published: 13 July 2020

Abstract

Motivated by recent advances in reinforcement learning and grounded in the well-established Self-Determination Theory (SDT), we explored the impact of hierarchical reinforcement learning (HRL)-induced pedagogical policies, and of data-driven explanations of those policies, on student experience in an Intelligent Tutoring System (ITS). We explored their impacts first independently and then jointly. Overall, our results showed that 1) the HRL-induced policies significantly improved students' learning performance, and 2) explaining the tutor's decisions to students through data-driven explanations improved student-system interaction in terms of students' engagement and autonomy.

Supplementary Material

• MP4 File (3340631.3394848.mp4): Supplemental Video
• VTT File (3340631.3394848.vtt)



    Published In

    UMAP '20: Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization
    July 2020
    426 pages
    ISBN:9781450368612
    DOI:10.1145/3340631

Publisher: Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. data-driven explanations
    2. hierarchical reinforcement learning
    3. intelligent tutoring system
    4. pedagogical policy


    Funding Sources

    • National Science Foundation


    Acceptance Rates

    Overall Acceptance Rate 162 of 633 submissions, 26%



Cited By

• (2024) Online Learning Strategy Induction through Partially Observable Markov Decision Process-Based Cognitive Experience Model. Electronics, Vol. 13, 19 (3858). DOI: 10.3390/electronics13193858. Online publication date: 29-Sep-2024.
• (2023) HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare. In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 1504-1513. DOI: 10.5555/3545946.3598804. Online publication date: 30-May-2023.
• (2023) Challenging social media threats using collective well-being-aware recommendation algorithms and an educational virtual companion. Frontiers in Artificial Intelligence, Vol. 5. DOI: 10.3389/frai.2022.654930. Online publication date: 9-Jan-2023.
• (2023) Improving Knowledge Learning Through Modelling Students' Practice-Based Cognitive Processes. Cognitive Computation, Vol. 16, 1 (348-365). DOI: 10.1007/s12559-023-10201-z. Online publication date: 29-Sep-2023.
• (2022) Student-Tutor Mixed-Initiative Decision-Making Supported by Deep Reinforcement Learning. In Artificial Intelligence in Education, 440-452. DOI: 10.1007/978-3-031-11644-5_36. Online publication date: 27-Jul-2022.
• (2021) Toward personalized XAI. Artificial Intelligence, Vol. 298, C. DOI: 10.1016/j.artint.2021.103503. Online publication date: 1-Sep-2021.
• (2021) Leveraging Granularity: Hierarchical Reinforcement Learning for Pedagogical Policy Induction. International Journal of Artificial Intelligence in Education, Vol. 32, 2 (454-500). DOI: 10.1007/s40593-021-00269-9. Online publication date: 16-Aug-2021.
• (2021) Explainable Recommendations in a Personalized Programming Practice System. In Artificial Intelligence in Education, 64-76. DOI: 10.1007/978-3-030-78292-4_6. Online publication date: 14-Jun-2021.
• (2021) Evaluating Critical Reinforcement Learning Framework in the Field. In Artificial Intelligence in Education, 215-227. DOI: 10.1007/978-3-030-78292-4_18. Online publication date: 14-Jun-2021.
• (2021) Adaptively Scaffolding Cognitive Engagement with Batch Constrained Deep Q-Networks. In Artificial Intelligence in Education, 113-124. DOI: 10.1007/978-3-030-78292-4_10. Online publication date: 14-Jun-2021.
