A strategy for detection of inconsistency in evaluation of essay type answers

Education and Information Technologies

Abstract

The quality of evaluation of essay type answer books involving multiple evaluators for courses with large enrollments is likely to be affected by heterogeneity in the experience, expertise and maturity of the evaluators. In this paper, we present a strategy to detect anomalies in the evaluation of essay type answers by multiple evaluators, based on the relationship between the marks/grades awarded and the symbolic markers and opinionated words recorded in answer books during evaluation. Our strategy is grounded in the results of a survey of evaluators, an analysis of a large number of evaluated essay type answer books, and our own experience of students' grievances regarding marks/grades. Both the survey and the analysis of evaluated answer books identified underline, tick and cross as frequently used markers, compared to circle and question mark. Further, the opinionated words and symbolic markers identified through the survey are used by evaluators to express either positive or negative sentiments, and evaluators show different usage patterns of these symbols when working as a single evaluator and as one amongst multiple evaluators. Tick and cross have well-defined purposes and correlate strongly with the marks awarded; the underline marker, however, is used for the dual purpose of expressing both correctness and incorrectness of answers. Our inconsistency detection strategy first identifies outliers based on the relationship between the marks/grades awarded and the number of symbols and/or opinionated words used in evaluation. Subsequently, the marks and the number of symbolic markers of each outlier are compared with those of peer non-outlier answer books that received the same marks but carry a different number of markers; such outlier answer books are termed anomalous. We discovered 36 anomalies out of a total of 425 evaluated answer books. We have also developed a prototype tool to facilitate online evaluation of answer books and to proactively alert evaluators of possible anomalies.
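
The two-stage idea sketched in the abstract, flagging answer books whose marker counts are unusual given the marks awarded and then comparing them with non-outlier peers that received the same marks, can be illustrated with a small script. The sketch below is only a hypothetical reading of the strategy: the paper does not specify the statistical test, so a simple z-score rule over marker counts within each marks group is assumed here, and the class and function names are invented for illustration.

```python
# Hypothetical sketch of the two-stage anomaly-flagging idea from the abstract:
# (1) relate marker counts to the marks awarded, (2) compare each suspect book
# with non-outlier peers holding the same marks. The z-score criterion is an
# assumption for illustration, not the method specified in the paper.

from dataclasses import dataclass
from statistics import mean, pstdev

@dataclass
class AnswerBook:
    book_id: str
    marks: float
    marker_count: int   # ticks, crosses, underlines, opinionated words, etc.

def flag_anomalies(books, z_threshold=2.0):
    """Return ids of answer books whose marker count deviates strongly
    from peers awarded the same marks (assumed criterion)."""
    # Group books by the marks awarded, so comparisons stay within peers.
    by_marks = {}
    for b in books:
        by_marks.setdefault(b.marks, []).append(b)

    anomalies = []
    for marks, group in by_marks.items():
        if len(group) < 3:
            continue  # too few peers to judge consistency
        counts = [b.marker_count for b in group]
        mu, sigma = mean(counts), pstdev(counts)
        if sigma == 0:
            continue  # all peers use the same number of markers; nothing to flag
        for b in group:
            # A book with the same marks as its peers but a very different
            # number of markers is reported as a possible anomaly.
            if abs(b.marker_count - mu) / sigma > z_threshold:
                anomalies.append(b.book_id)
    return anomalies

if __name__ == "__main__":
    sample = [
        AnswerBook("A1", 8, 12), AnswerBook("A2", 8, 11), AnswerBook("A3", 8, 2),
        AnswerBook("B1", 3, 1),  AnswerBook("B2", 3, 2),  AnswerBook("B3", 3, 9),
    ]
    print(flag_anomalies(sample, z_threshold=1.2))  # -> ['A3', 'B3']
```

A prototype evaluation tool such as the one described in the abstract could run a check of this kind after each batch of answer books and proactively alert the evaluator before marks are finalised.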

Author information

Correspondence to Archana Shukla.

Cite this article

Shukla, A., Chaudhary, B.D. A strategy for detection of inconsistency in evaluation of essay type answers. Educ Inf Technol 19, 899–912 (2014). https://doi.org/10.1007/s10639-013-9264-x
