
Robust Evaluation Matrix: Towards a More Principled Offline Exploration of Instructional Policies

Published: 12 April 2017
DOI: 10.1145/3051457.3051463

Abstract

The gold standard for identifying more effective pedagogical approaches is to perform an experiment. Unfortunately, a hypothesized alternative way of teaching frequently fails to yield an improved effect. Given the expense and logistics of each experiment, and the enormous space of potential ways to improve teaching, it would be highly preferable to estimate, before running a study, whether an alternative teaching strategy would improve learning. This holds even in learning at scale settings: although it is logistically easier to recruit a large number of subjects, the stakes remain high because the experiment affects many real students. For certain classes of alternative teaching approaches, such as new ways to sequence existing material, it is possible to build student models that can serve as simulators to estimate learner performance under newly proposed teaching methods. However, existing methods for doing so can overestimate the performance of new teaching methods. We instead propose the Robust Evaluation Matrix (REM) method, which explicitly considers model mismatch between the student model used to derive the teaching strategy and the student model used as a simulator to evaluate that strategy's effectiveness. We then present two case studies, one from a fractions intelligent tutoring system and one from a concept learning task from prior work, showing how REM can be used both to detect when a new instructional policy may not be effective on actual students and to detect when it may improve student learning.
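
To make the idea concrete, the sketch below (ours, not the paper's code) derives a simple instructional policy under each of several candidate student models and then evaluates every derived policy under every candidate model used as a simulator, producing the matrix of estimated outcomes that a REM-style analysis would inspect for robustness to model mismatch. The StudentModel class, its learn/guess/slip parameters, the derive_policy heuristic, and the per-practice cost are all hypothetical illustrations under stated assumptions, not the authors' implementation.

```python
"""Minimal, hypothetical sketch of a Robust Evaluation Matrix (REM)-style
check: derive a policy under each candidate student model, then evaluate
every derived policy under every candidate model used as a simulator.
All names and parameters are illustrative assumptions."""

import random
from dataclasses import dataclass


@dataclass
class StudentModel:
    """Toy single-skill student model with learn/guess/slip probabilities."""
    name: str
    p_learn: float   # chance a practice opportunity teaches the skill
    p_guess: float   # chance of answering correctly without the skill
    p_slip: float    # chance of answering incorrectly despite the skill

    def simulate_post_test(self, practice_count: int, n_students: int = 2000) -> float:
        """Estimate mean post-test correctness if every student receives
        `practice_count` practice opportunities (a fixed-practice policy)."""
        correct = 0
        for _ in range(n_students):
            knows = False
            for _ in range(practice_count):
                if not knows and random.random() < self.p_learn:
                    knows = True
            p_correct = (1 - self.p_slip) if knows else self.p_guess
            correct += random.random() < p_correct
        return correct / n_students


def derive_policy(model: StudentModel, max_practice: int = 10,
                  cost_per_practice: float = 0.03) -> int:
    """Toy policy derivation: choose the practice count that maximizes the
    model's own predicted post-test score minus a per-opportunity time cost."""
    return max(range(max_practice + 1),
               key=lambda n: model.simulate_post_test(n) - cost_per_practice * n)


def robust_evaluation_matrix(models: list) -> dict:
    """REM-style table: (policy-deriving model, simulator model) -> estimated score."""
    policies = {m.name: derive_policy(m) for m in models}
    return {
        (source_name, simulator.name): simulator.simulate_post_test(practice)
        for source_name, practice in policies.items()
        for simulator in models
    }


if __name__ == "__main__":
    random.seed(0)
    candidates = [
        StudentModel("fast_learner", p_learn=0.4, p_guess=0.2, p_slip=0.1),
        StudentModel("slow_learner", p_learn=0.1, p_guess=0.2, p_slip=0.1),
    ]
    for (source, simulator), score in sorted(robust_evaluation_matrix(candidates).items()):
        print(f"policy derived under {source:12s} evaluated on {simulator:12s}: {score:.2f}")
```

In this toy setup, a proposed policy would only be flagged as promising for a live study if it compared favorably with the baseline under every candidate simulator model, not just under the model that produced it, which mirrors the robustness-to-model-mismatch check the abstract describes.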


Published In

L@S '17: Proceedings of the Fourth (2017) ACM Conference on Learning @ Scale
April 2017
352 pages
ISBN: 9781450344500
DOI: 10.1145/3051457
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. instructional policies
  2. off-policy
  3. policy estimation
  4. policy selection
  5. reinforcement learning

Qualifiers

  • Research-article

Conference

L@S 2017: Fourth (2017) ACM Conference on Learning @ Scale
April 20 - 21, 2017
Cambridge, Massachusetts, USA

Acceptance Rates

L@S '17 paper acceptance rate: 14 of 105 submissions (13%)
Overall acceptance rate: 117 of 440 submissions (27%)

Cited By

  • (2022) Combining exploratory learning with structured practice educational technologies to foster both conceptual and procedural fractions knowledge. Educational Technology Research and Development. DOI: 10.1007/s11423-022-10104-0. Online publication date: 1-Apr-2022.
  • (2021) A General Multi-method Approach to Data-Driven Redesign of Tutoring Systems. LAK21: 11th International Learning Analytics and Knowledge Conference, 161-172. DOI: 10.1145/3448139.3448155. Online publication date: 12-Apr-2021.
  • (2021) Leveraging Granularity: Hierarchical Reinforcement Learning for Pedagogical Policy Induction. International Journal of Artificial Intelligence in Education 32(2), 454-500. DOI: 10.1007/s40593-021-00269-9. Online publication date: 16-Aug-2021.
  • (2020) Reinforcement Learning for the Adaptive Scheduling of Educational Activities. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1-12. DOI: 10.1145/3313831.3376518. Online publication date: 21-Apr-2020.
  • (2020) Teaching Online in 2020: Experiments, Empathy, Discovery. 2020 IEEE Learning With MOOCS (LWMOOCS), 156-161. DOI: 10.1109/LWMOOCS50143.2020.9234318. Online publication date: 29-Sep-2020.
  • (2020) A General Multi-method Approach to Design-Loop Adaptivity in Intelligent Tutoring Systems. Artificial Intelligence in Education, 124-129. DOI: 10.1007/978-3-030-52240-7_23. Online publication date: 30-Jun-2020.
  • (2019) Where's the Reward? International Journal of Artificial Intelligence in Education 29(4), 568-620. DOI: 10.1007/s40593-019-00187-x. Online publication date: 14-Nov-2019.
  • (2018) Students, systems, and interactions. Proceedings of the Fifth Annual ACM Conference on Learning at Scale, 1-10. DOI: 10.1145/3231644.3231662. Online publication date: 26-Jun-2018.
  • (2018) Opening Up an Intelligent Tutoring System Development Environment for Extensible Student Modeling. Artificial Intelligence in Education, 169-183. DOI: 10.1007/978-3-319-93843-1_13. Online publication date: 20-Jun-2018.
