Research article · DOI: 10.1145/3340631.3394848

Improving Student-System Interaction Through Data-driven Explanations of Hierarchical Reinforcement Learning Induced Pedagogical Policies

Published: 13 July 2020

Abstract

Motivated by recent advances in reinforcement learning and grounded in the well-established Self-Determination Theory (SDT), we explored the impact of hierarchical reinforcement learning (HRL)-induced pedagogical policies, and of data-driven explanations of those policies, on student experience in an Intelligent Tutoring System (ITS). We explored their impacts first independently and then jointly. Overall, our results showed that 1) the HRL-induced policies significantly improved students' learning performance, and 2) explaining the tutor's decisions to students through data-driven explanations improved student-system interaction in terms of students' engagement and autonomy.

Supplementary Material

• MP4 File (3340631.3394848.mp4): Supplemental Video
• VTT File (3340631.3394848.vtt)



    Published In

    UMAP '20: Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization
    July 2020
    426 pages
    ISBN:9781450368612
    DOI:10.1145/3340631

Publisher: Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. data-driven explanations
    2. hierarchical reinforcement learning
    3. intelligent tutoring system
    4. pedagogical policy


    Funding Sources

    • National Science Foundation


    Acceptance Rates

    Overall Acceptance Rate 162 of 633 submissions, 26%



Cited By

• (2024) Online Learning Strategy Induction through Partially Observable Markov Decision Process-Based Cognitive Experience Model. Electronics, Vol. 13, 19 (3858). DOI: 10.3390/electronics13193858. Online publication date: 29-Sep-2024.
• (2023) HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare. In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 1504-1513. DOI: 10.5555/3545946.3598804. Online publication date: 30-May-2023.
• (2023) Challenging social media threats using collective well-being-aware recommendation algorithms and an educational virtual companion. Frontiers in Artificial Intelligence, Vol. 5. DOI: 10.3389/frai.2022.654930. Online publication date: 9-Jan-2023.
• (2023) Improving Knowledge Learning Through Modelling Students' Practice-Based Cognitive Processes. Cognitive Computation, Vol. 16, 1 (348-365). DOI: 10.1007/s12559-023-10201-z. Online publication date: 29-Sep-2023.
• (2022) Student-Tutor Mixed-Initiative Decision-Making Supported by Deep Reinforcement Learning. In Artificial Intelligence in Education, 440-452. DOI: 10.1007/978-3-031-11644-5_36. Online publication date: 27-Jul-2022.
• (2021) Toward personalized XAI. Artificial Intelligence, Vol. 298, C. DOI: 10.1016/j.artint.2021.103503. Online publication date: 1-Sep-2021.
• (2021) Leveraging Granularity: Hierarchical Reinforcement Learning for Pedagogical Policy Induction. International Journal of Artificial Intelligence in Education, Vol. 32, 2 (454-500). DOI: 10.1007/s40593-021-00269-9. Online publication date: 16-Aug-2021.
• (2021) Explainable Recommendations in a Personalized Programming Practice System. In Artificial Intelligence in Education, 64-76. DOI: 10.1007/978-3-030-78292-4_6. Online publication date: 14-Jun-2021.
• (2021) Evaluating Critical Reinforcement Learning Framework in the Field. In Artificial Intelligence in Education, 215-227. DOI: 10.1007/978-3-030-78292-4_18. Online publication date: 14-Jun-2021.
• (2021) Adaptively Scaffolding Cognitive Engagement with Batch Constrained Deep Q-Networks. In Artificial Intelligence in Education, 113-124. DOI: 10.1007/978-3-030-78292-4_10. Online publication date: 14-Jun-2021.
