Abstract
The quality of evaluation of essay-type answer books involving multiple evaluators, for courses with large enrollments, is likely to be affected by heterogeneity in the experience, expertise and maturity of the evaluators. In this paper, we present a strategy to detect anomalies in the evaluation of essay-type answers by multiple evaluators, based on the relationship between the marks/grades awarded and the symbolic markers and opinionated words recorded in answer books during evaluation. Our strategy is based on the results of a survey of evaluators, an analysis of a large number of evaluated essay-type answer books, and our own experience with student grievances regarding marks/grades. Both the survey and the analysis of evaluated answer books identified underline, tick and cross as more frequently used markers than circle and question mark. Further, evaluators use both the opinionated words and the symbolic markers identified through the survey to express either positive or negative sentiments, and their usage pattern differs depending on whether they act as a single evaluator or as one among multiple evaluators. Tick and cross have well-defined purposes and correlate strongly with the marks awarded; the underline marker, however, is used for the dual purpose of expressing both correctness and incorrectness of answers. Our inconsistency-detection strategy first identifies outliers based on the relationship between the marks/grades awarded and the number of symbols and/or opinionated words used in evaluation. Subsequently, the marks and marker counts of each outlier are compared with those of peer non-outlier answer books that received the same marks but contain a different number of markers; such outlier answer books are termed anomalous. We discovered 36 anomalies among a total of 425 evaluated answer books. We have also developed a prototype tool that facilitates online evaluation of answer books and proactively alerts evaluators to possible anomalies.
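The two-stage strategy described above can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the authors' implementation: the least-squares fit, the residual z-score cutoff `z_cut`, and the function name `detect_anomalies` are all assumptions introduced here for illustration.

```python
# Hypothetical sketch of the abstract's two-stage idea (not the authors' code):
# stage 1 flags outliers from the marks-vs-marker-count relationship via a
# least-squares fit and a residual z-score; stage 2 confirms an outlier as
# anomalous when non-outlier peers with the same marks used a different
# number of markers. `detect_anomalies` and `z_cut` are assumed names.
from statistics import mean, stdev

def detect_anomalies(books, z_cut=2.0):
    """books: list of (marks, marker_count); returns indices of anomalous books."""
    marks = [m for m, _ in books]
    counts = [c for _, c in books]
    mx, my = mean(marks), mean(counts)
    var = sum((m - mx) ** 2 for m in marks)
    slope = sum((m - mx) * (c - my) for m, c in books) / var if var else 0.0
    intercept = my - slope * mx
    residuals = [c - (slope * m + intercept) for m, c in books]
    sd = stdev(residuals) if len(residuals) > 1 else 0.0
    if sd < 1e-9:                       # essentially perfect fit: no outliers
        return []
    outliers = {i for i, r in enumerate(residuals) if abs(r) / sd > z_cut}
    anomalies = []
    for i in sorted(outliers):
        m_i, c_i = books[i]
        # Non-outlier peers awarded the same marks
        peers = [c for j, (m, c) in enumerate(books)
                 if j not in outliers and m == m_i]
        if peers and c_i not in peers:  # same marks, different marker usage
            anomalies.append(i)
    return anomalies
```

For example, an answer book awarded full marks but containing no ticks or underlines, while its same-mark peers carry many, would be flagged as anomalous under this sketch.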
Shukla, A., Chaudhary, B.D. A strategy for detection of inconsistency in evaluation of essay type answers. Educ Inf Technol 19, 899–912 (2014). https://doi.org/10.1007/s10639-013-9264-x