research-article

The Student Zipf Theory: Inferring Latent Structures in Open-Ended Student Work To Help Educators

Authors:
Yunsung Kim, Computer Science, Stanford University, United States (0000-0002-2829-574X)
Chris Piech, Computer Science, Stanford University, United States (0000-0001-5140-0467)

LAK2023: LAK23: 13th International Learning Analytics and Knowledge Conference, March 2023, Pages 464–475
https://doi.org/10.1145/3576050.3576116

Published: 13 March 2023

ABSTRACT

Are there structures underlying student work that are universal across every open-ended task? We demonstrate that, across many subjects and assignment types, the probability distribution underlying student-generated open-ended work is close to Zipf’s Law. Inferring this latent structure for classroom assignments can help learning analytics researchers, instructional designers, and educators understand the landscape of student approaches, assess the complexity of assignments, and prioritise pedagogical attention. However, typical classrooms are far too small to reveal even the contour of the Zipfian pattern, and it is generally impossible to perform inference for Zipf’s law from such a small number of samples. We formalise this difficult task as the Zipf Inference Challenge: (1) infer the ordering of student-generated works by their underlying probabilities, and (2) estimate the shape parameter of the underlying distribution in a typical-sized classroom. Our key insight in addressing this challenge is to leverage the densities of the student response landscapes represented by semantic similarity. We show that our “Semantic Density Estimation” method substantially improves inference of the latent Zipf shape and of the probability ordering of student responses on real-world education datasets.
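To make the small-classroom difficulty concrete, the sketch below draws student responses from a truncated Zipf distribution, p(k) ∝ k^(-s), and counts how many distinct responses appear, and how many appear only once, in a class of 30 versus a much larger cohort. It is illustrative only and is not the paper's Semantic Density Estimation method: the number of possible response types (10,000), the shape parameter s = 1.0, the class sizes, and all function names are assumptions chosen for the example.

```python
import random
from collections import Counter


def zipf_pmf(num_types, s):
    """Truncated Zipf law: probability of the rank-k response is proportional to k^(-s)."""
    weights = [k ** -s for k in range(1, num_types + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_classroom(pmf, class_size, rng):
    """Draw one response rank per student, independently, from the given pmf."""
    ranks = range(1, len(pmf) + 1)
    return rng.choices(ranks, weights=pmf, k=class_size)


def summarize(sample):
    """Return (number of distinct responses, number of responses seen exactly once)."""
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return len(counts), singletons


# Assumed values for illustration; none of these come from the paper.
rng = random.Random(0)
pmf = zipf_pmf(num_types=10_000, s=1.0)

small = sample_classroom(pmf, class_size=30, rng=rng)
large = sample_classroom(pmf, class_size=30_000, rng=rng)

print("class of 30     (distinct, seen once):", summarize(small))
print("cohort of 30000 (distinct, seen once):", summarize(large))
```

When most responses in a small sample are distinct, their empirical frequencies are nearly tied, so frequency counts alone say little about either the probability ordering or the shape parameter. This is the gap that the abstract proposes to close with semantic similarity.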

Index Terms

1. Applied computing
  1. Education
    1. Computer-assisted instruction
2. Computing methodologies
  1. Artificial intelligence

Published in
LAK2023: LAK23: 13th International Learning Analytics and Knowledge Conference
March 2023
692 pages
ISBN:9781450398657
DOI:10.1145/3576050
Program Chairs:
Isabel Hilliger, Pontificia Universidad Católica de Chile, Chile
Hassan Khosravi, University of Queensland, Australia
Bart Rienties, Open University, United Kingdom
Shane Dawson, University of South Australia, Australia
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Constructed Response
Open-Ended Response
Probabilistic Modeling
Student Work
Zipf’s Law
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 236 of 782 submissions, 30%