Part of the book series: Communications in Computer and Information Science (CCIS, volume 1831)

Included in the following conference series:

  • AIED: International Conference on Artificial Intelligence in Education

Abstract

Automated essay scoring and short-answer scoring have shown tremendous potential for enhancing and promoting large-scale assessments. Challenges remain, however, in the equity of scoring and the implicit biases ingrained in scoring systems. One promising way to mitigate the problem is to introduce a measurement model that quantifies and evaluates the degree to which raters show bias toward students’ written responses. This paper presents an adoption of a generalized many-facet Rasch model (GMFRM, [8]) to evaluate rater biases with respect to students’ writing styles. We modeled students’ writing styles using a rich set of computational linguistic indices from LingFeat [7] that are empirically and theoretically associated with text difficulty. The findings showed that rater bias exists in scoring responses that explicitly cover a variety of topics with less elaborate description than the other writing styles captured in students’ responses. Rater severity was the only bias characteristic that was statistically significant. We discuss the implications of our findings and their future applications for advancing automated essay scoring and teacher professional development.
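
To make the modeling concrete, the sketch below illustrates the general idea of a many-facet Rasch analysis with a rater-severity facet on simulated ratings. It is a minimal, assumption-laden approximation, not the paper’s GMFRM: the model in [8] includes additional rater parameters and is estimated with Bayesian Hamiltonian Monte Carlo via RStan [9], whereas the code below fits a plain rating-scale model by maximum likelihood. All facet names, the rubric size, and the simulated values are hypothetical.

# A minimal, illustrative sketch of a many-facet Rasch analysis with a
# rater-severity facet, in the spirit of the GMFRM analysis described above.
# This is NOT the authors' GMFRM: the model in [8] includes additional rater
# parameters and is estimated with Bayesian Hamiltonian Monte Carlo via RStan
# [9]. All facet names, the rubric size K, and the simulated values below are
# hypothetical assumptions made purely for illustration.

import numpy as np
from scipy.optimize import minimize

K = 4  # assumed number of score categories: 0 .. K-1


def category_probs(theta, beta, rho, tau):
    """Rating-scale MFRM category probabilities.

    Each step k uses the logit (theta - beta - rho - tau[k]), i.e. student
    ability minus style difficulty, rater severity, and the step threshold.
    """
    steps = theta - beta - rho - tau                  # shape (K-1,)
    cum = np.concatenate(([0.0], np.cumsum(steps)))   # cumulative logits
    cum -= cum.max()                                  # numerical stability
    p = np.exp(cum)
    return p / p.sum()


def neg_log_likelihood(params, scores, students, raters, styles,
                       n_students, n_raters, n_styles):
    """Sum of -log P(observed score) over all student-by-rater ratings."""
    theta = params[:n_students]                                   # ability
    rho = params[n_students:n_students + n_raters]                # severity
    beta = params[n_students + n_raters:
                  n_students + n_raters + n_styles]               # style difficulty
    tau = params[-(K - 1):]                                       # thresholds
    nll = 0.0
    for x, j, r, s in zip(scores, students, raters, styles):
        nll -= np.log(category_probs(theta[j], beta[s], rho[r], tau)[x])
    return nll


if __name__ == "__main__":
    # Simulate a small fully crossed design: every rater scores every student.
    rng = np.random.default_rng(0)
    n_students, n_raters, n_styles = 30, 3, 2
    true_theta = rng.normal(0.0, 1.0, n_students)
    true_rho = np.array([-0.5, 0.0, 0.8])      # third rater is more severe
    true_beta = np.array([0.0, 0.4])
    true_tau = np.array([-1.0, 0.0, 1.0])

    students, raters, styles, scores = [], [], [], []
    for j in range(n_students):
        s = rng.integers(n_styles)             # writing-style group of student j
        for r in range(n_raters):
            p = category_probs(true_theta[j], true_beta[s], true_rho[r], true_tau)
            students.append(j)
            raters.append(r)
            styles.append(s)
            scores.append(rng.choice(K, p=p))

    # Maximum-likelihood fit; the model is only identified up to location,
    # so in practice one facet would be anchored (e.g., mean severity = 0).
    x0 = np.zeros(n_students + n_raters + n_styles + (K - 1))
    fit = minimize(neg_log_likelihood, x0,
                   args=(scores, students, raters, styles,
                         n_students, n_raters, n_styles),
                   method="L-BFGS-B")
    rho_hat = fit.x[n_students:n_students + n_raters]
    print("Estimated rater severities (unanchored):", np.round(rho_hat, 2))

In practice, identifiability requires anchoring one facet (for example, constraining mean rater severity to zero), and the Bayesian GMFRM estimation used in the paper [8, 9] additionally provides uncertainty intervals for the severity estimates; the sketch above only shows where a rater-severity facet enters the scoring likelihood.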


References

  1. Ercikan, K., Guo, H., He, Q.: Use of response process data to inform group comparisons and fairness research. Educ. Assess. 25(3), 179–197 (2020)

  2. Amorim, E., Cançado, M., Veloso, A.: Automated essay scoring in the presence of biased ratings. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1, Long Papers, pp. 229–237 (2018)

  3. Stecher, B.M., et al.: The effects of content, format, and inquiry level on science performance assessment scores. Appl. Measur. Educ. 13(2), 139–160 (2000)

  4. Ahmadi Shirazi, M.: For a greater good: bias analysis in writing assessment. SAGE Open 9(1), 2158244018822377 (2019)

  5. Mohd Noh, M.F., Mohd Matore, M.E.E.: Rater severity differences in English language as a second language speaking assessment based on rating experience, training experience, and teaching experience through many-faceted Rasch measurement analysis. Front. Psychol. 13, 941084 (2022)

  6. Uto, M., Okano, M.: Learning automated essay scoring models using item-response-theory-based scores to decrease effects of rater biases. IEEE Trans. Learn. Technol. 14(6), 763–776 (2021)

  7. Lee, B.W., Jang, Y.S., Lee, J.H.J.: Pushing on text readability assessment: a transformer meets handcrafted linguistic features. arXiv preprint arXiv:2109.12258 (2021)

  8. Uto, M., Ueno, M.: A generalized many-facet Rasch model and its Bayesian estimation using Hamiltonian Monte Carlo. Behaviormetrika 47(2), 469–496 (2020). https://doi.org/10.1007/s41237-020-00115-7

  9. Stan Development Team: RStan: the R interface to Stan. R package version 2.21.8 (2023). https://mc-stan.org/

  10. R Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2022). https://www.R-project.org/


Author information

Correspondence to Jinnie Shin.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Shin, J., Jing, Z., Lipien, L., Fleetwood, A., Leite, W. (2023). Evaluating the Rater Bias in Response Scoring in Digital Learning Platform: Analysis of Student Writing Styles. In: Wang, N., Rebolledo-Mendez, G., Dimitrova, V., Matsuda, N., Santos, O.C. (eds) Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium and Blue Sky. AIED 2023. Communications in Computer and Information Science, vol 1831. Springer, Cham. https://doi.org/10.1007/978-3-031-36336-8_80

  • DOI: https://doi.org/10.1007/978-3-031-36336-8_80

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36335-1

  • Online ISBN: 978-3-031-36336-8

  • eBook Packages: Computer Science, Computer Science (R0)
