ABSTRACT
Early prediction of students at risk of doing poorly in CS1 can enable early interventions or class adjustments. Preferably, prediction methods would be lightweight, requiring little extra activity or data-collection work from instructors beyond what they already do. Previous methods included giving surveys, collecting (potentially sensitive) demographic data, introducing clicker questions into lectures, or using locally developed systems that analyze programming behavior, each requiring some effort by instructors. Today, a widely used textbook / learning system in CS1 classes is zyBooks, used by several hundred thousand students annually. The system automatically collects data related to reading, homework, and programming assignments. For a 300+ student CS1 class, we found that three metrics, auto-collected by that system in the early weeks (1-4), were good predictors of performance on the week-6 midterm exam: non-earnest completion of the assigned readings, struggle on the coding homework, and low scores on the programming assignments, with correlation magnitudes of 0.44, 0.58, and 0.72, respectively. We combined those metrics in a decision tree model to predict students at risk of failing the midterm exam (<70%, meaning D or F), and achieved 85% prediction accuracy with 82% sensitivity and 89% specificity, which is higher than previously published early-prediction approaches. The approach may mean that thousands of instructors already using zyBooks (or a similar system) can get a more accurate early prediction of at-risk students, without requiring extra effort or activities, and while avoiding collection of sensitive demographic data.
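As a rough illustration of the approach described above, the following Python sketch combines three early-week metrics in a small hand-written decision tree and computes accuracy, sensitivity, and specificity for its predictions. It is a minimal sketch under stated assumptions, not the model from the paper: the field names, metric scales, and all thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StudentMetrics:
    # All three fields and their scales are hypothetical stand-ins for the
    # auto-collected metrics named in the abstract.
    earnest_reading_pct: float  # share of assigned reading completed earnestly, 0-100
    homework_struggle: float    # struggle rate on coding homework, 0.0-1.0
    lab_score_pct: float        # average programming-assignment score, 0-100


def predict_at_risk(s: StudentMetrics) -> bool:
    """Tiny hand-written decision tree; thresholds are illustrative only."""
    if s.lab_score_pct < 70:             # low programming-assignment scores
        return True
    if s.homework_struggle > 0.5:        # heavy struggle on coding homework
        return s.earnest_reading_pct < 80  # combined with non-earnest reading
    return False


def ratio(numerator: int, denominator: int):
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def evaluate(predicted: list, actual: list) -> dict:
    """Compare at-risk predictions against actual outcomes (True = at risk)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # correctly flagged
    tn = sum(1 for p, a in pairs if not p and not a)  # correctly cleared
    fp = sum(1 for p, a in pairs if p and not a)      # flagged but passed
    fn = sum(1 for p, a in pairs if not p and a)      # missed at-risk student
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # share of at-risk students caught
        "specificity": ratio(tn, tn + fp),  # share of passing students cleared
    }
```

Here sensitivity is the fraction of students who actually scored below the failing threshold that the model flagged, and specificity is the fraction of passing students it correctly left unflagged. In practice the tree structure and thresholds would be learned from course data rather than written by hand.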