Abstract
Many universities have developed Automated Program Assessment Systems to automate the tasks of assessing students’ computer programs, so as to enhance students’ learning and relieve instructors’ workload. These systems typically evaluate the correctness of a program by comparing its actual outputs with the instructor’s pre-defined expected outputs. However, an actual output may still be correct even if it deviates from the expected output. One challenge in building such a system is to devise an automated mechanism for determining program output correctness that matches the instructor’s own judgment. This is difficult when instructors’ individual judgments differ. This paper reports an exploratory empirical study that evaluates instructors’ agreement on the correctness of students’ program outputs. Our study demonstrates reasonably good overall agreement between the instructors and reveals the categories of program output variants for which they are more likely to agree or disagree.
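The output-comparison strategy described in the abstract can be illustrated with a minimal sketch. The normalization rules below (collapsing whitespace and folding letter case) are illustrative assumptions, not the mechanisms of any particular system surveyed in the paper; they show how a tolerant comparator may accept output variants that a strict string match would reject, which is precisely where instructor judgments can diverge.

```python
# Illustrative sketch (assumed rules, not the paper's method): a simple
# output-correctness check that treats common variants -- extra
# whitespace and letter case -- as equivalent to the expected output.

def normalize(output: str) -> str:
    """Collapse runs of whitespace and lower-case the text,
    so trivially different outputs compare equal."""
    return " ".join(output.lower().split())

def is_correct(actual: str, expected: str, tolerant: bool = True) -> bool:
    """Compare a student's actual output against the expected output,
    either strictly or after normalization."""
    if tolerant:
        return normalize(actual) == normalize(expected)
    return actual == expected
```

For example, `is_correct("Sum =  10\n", "sum = 10")` accepts the variant, while the strict check `is_correct("Sum =  10\n", "sum = 10", tolerant=False)` rejects it; whether such a variant should count as correct is exactly the kind of judgment the study probes.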
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tang, C.M., Yu, Y.T. (2013). An Exploratory Study on Instructors’ Agreement on the Correctness of Computer Program Outputs. In: Cheung, S.K.S., Fong, J., Fong, W., Wang, F.L., Kwok, L.F. (eds) Hybrid Learning and Continuing Education. ICHL 2013. Lecture Notes in Computer Science, vol 8038. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39750-9_7
DOI: https://doi.org/10.1007/978-3-642-39750-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39749-3
Online ISBN: 978-3-642-39750-9