DOI: 10.1145/3636555.3636869
research-article
Open Access

Hierarchical Dependencies in Classroom Settings Influence Algorithmic Bias Metrics

Published: 18 March 2024

ABSTRACT

Measuring algorithmic bias in machine learning has historically focused on statistical inequalities pertaining to specific groups. However, the most common metrics (i.e., those focused on individual- or group-conditioned error rates) are not currently well-suited to educational settings because they assume that each individual observation is independent from the others. This is not statistically appropriate when studying certain common educational outcomes, because such metrics cannot account for the relationship between students in classrooms or multiple observations per student across an academic year. In this paper, we present novel adaptations of algorithmic bias measurements for regression for both independent and nested data structures. Using hierarchical linear models, we rigorously measure algorithmic bias in a machine learning model of the relationship between student engagement in an intelligent tutoring system and year-end standardized test scores. We conclude that classroom-level influences had a small but significant effect on models. Examining significance with hierarchical linear models helps determine which inequalities in educational settings might be explained by small sample sizes rather than systematic differences.
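
As a rough illustration of the nested-data idea described above, a generic two-level random-intercept model for prediction error could be written as follows. The abstract does not state the exact specification used, so every symbol here (e_{ij}, G_{ij}, u_j, and the variance terms) is an assumption chosen for illustration, not the authors' notation.

```latex
% Illustrative sketch only: a standard random-intercept model,
% not necessarily the specification used in the paper.
% i indexes students, j indexes classrooms.
\begin{align}
  e_{ij} &= \gamma_{0} + \gamma_{1} G_{ij} + u_{j} + \varepsilon_{ij}, \\
  u_{j} &\sim \mathcal{N}(0, \tau^{2}), \qquad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^{2}), \\
  \rho &= \frac{\tau^{2}}{\tau^{2} + \sigma^{2}}
\end{align}
```

Here e_{ij} stands for the regression model's error for student i in classroom j (for example, the absolute difference between predicted and actual test score), and G_{ij} is a hypothetical indicator of demographic group membership. The coefficient \gamma_{1} is the between-group difference in error after accounting for classroom clustering, and \rho is the intraclass correlation, the share of error variance attributable to classrooms. When \tau^{2} = 0 the model reduces to an ordinary regression that treats every observation as independent, which is the assumption the abstract argues is inappropriate for classroom data.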


        • Published in

          LAK '24: Proceedings of the 14th Learning Analytics and Knowledge Conference
          March 2024
          962 pages
          ISBN: 9798400716188
          DOI: 10.1145/3636555

          Copyright © 2024 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Qualifiers

          • research-article
          • Research
          • Refereed limited

          Acceptance Rates

          Overall Acceptance Rate: 236 of 782 submissions, 30%