DOI: 10.1145/3626252.3630759
Research article

A Fast and Accurate Machine Learning Autograder for the Breakout Assignment

Published: 07 March 2024

Abstract

In this paper, we detail the successful deployment of a machine learning autograder that significantly decreases the grading labor required in the Breakout computer science assignment. This assignment, which tasks students with programming a game consisting of a controllable paddle and a ball that bounces off the paddle to break bricks, is popular for engaging students with introductory computer science concepts, but it creates a large grading burden. Due to the game's interactive nature, grading defies traditional unit tests and instead typically requires 8+ minutes of manually playing each student's game to search for bugs. This amounts to 45+ hours of grading in a standard course offering and prevents further widespread adoption of the assignment. Our autograder alleviates this burden by playing each student's game with a reinforcement learning agent and providing videos of discovered bugs to instructors. In an A/B test with manual grading, we find that our human-in-the-loop AI autograder reduces grading time by 44% while slightly improving grading accuracy by 6%, ultimately saving roughly 30 hours over our deployment in two offerings of the assignment. Our results further suggest the practicality of grading other interactive assignments (e.g., other games or building websites) via similar machine learning techniques. Live demo at https://ezliu.github.io/breakoutgrader.
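To make the idea of grading by play concrete, here is a minimal, self-contained sketch. It is not the paper's actual system: `StudentBreakout` is a hypothetical one-dimensional stand-in for a student's game (with an optional injected collision bug), and `find_bounce_bug` is a hypothetical harness that plays it with a simple ball-tracking policy (standing in for the learned RL agent) and reports any episode that ends while the paddle was under the ball.

```python
# Hypothetical, simplified stand-in for a student's Breakout game:
# a 1-D world where the ball moves vertically toward the paddle row.
class StudentBreakout:
    def __init__(self, buggy=False):
        self.buggy = buggy  # inject a "ball falls through paddle" bug
        self.reset()

    def reset(self):
        self.ball_x, self.ball_y = 5.0, 0.0
        self.vel_y = 1.0
        self.paddle_x = 5.0
        self.done = False

    def step(self, action):  # action in {-1, 0, +1} moves the paddle
        self.paddle_x = min(10.0, max(0.0, self.paddle_x + action))
        self.ball_y += self.vel_y
        if self.ball_y >= 10.0:  # ball reaches the paddle row
            covered = abs(self.ball_x - self.paddle_x) <= 1.0
            if covered and not self.buggy:
                self.vel_y = -1.0       # correct bounce off the paddle
            else:
                self.done = True        # miss, or the injected bug
        if self.ball_y <= 0.0:
            self.vel_y = 1.0            # bounce off the top wall
        return self.done

# Grading harness: play the game with a scripted ball-tracking policy
# (a stand-in for a learned RL agent) and flag any episode that ends
# while the paddle was under the ball -- an invariant violation.
def find_bounce_bug(game, episodes=5, max_steps=50):
    for _ in range(episodes):
        game.reset()
        for _ in range(max_steps):
            # follow the ball, so a correct game never drops it here
            action = (game.ball_x > game.paddle_x) - (game.ball_x < game.paddle_x)
            if game.step(action):
                if abs(game.ball_x - game.paddle_x) <= 1.0:
                    return True  # ball "fell through" the paddle: bug
                break
    return False

print(find_bounce_bug(StudentBreakout(buggy=True)))   # -> True
print(find_bounce_bug(StudentBreakout(buggy=False)))  # -> False
```

In the real system the agent is learned and the harness records a video of the offending episode for the instructor; this sketch only shows the underlying pattern of playing the game and checking behavioral invariants instead of unit tests.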




Published In

SIGCSE 2024: Proceedings of the 55th ACM Technical Symposium on Computer Science Education V. 1
March 2024, 1583 pages
ISBN: 9798400704239
DOI: 10.1145/3626252

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. autograder
  2. cs1
  3. feedback
  4. grading support
  5. graphics
  6. machine learning


Conference

SIGCSE 2024. Overall acceptance rate: 1,595 of 4,542 submissions (35%).


