DOI: 10.1145/2591062.2591164
Article

Comparing test quality measures for assessing student-written tests

Published: 31 May 2014

Abstract

Many educators now include software testing activities in programming assignments, so there is growing demand for appropriate methods of assessing the quality of student-written software tests. While tests can be hand-graded, some educators also use objective performance metrics to assess them. The most common measures at present are code coverage measures, which track how much of the student’s code (in terms of statements, branches, or some combination) is exercised by the corresponding software tests. Code coverage has limitations, however, and sometimes overestimates the true quality of the tests. Some researchers have suggested that mutation analysis may provide a better indication of test quality, while some educators have experimented with simply running every student’s test suite against every other student’s program, an “all-pairs” strategy that gives more insight into the quality of the tests. However, it is still unknown which of these measures is most accurate, in terms of most closely predicting the true bug-revealing capability of a given test suite. This paper directly compares all three methods of measuring test quality in terms of how well they predict the observed bug-revealing capabilities of student-written tests when run against a naturally occurring collection of student-produced defects. Experimental results show that all-pairs testing (running each student’s tests against every other student’s solution) is the most effective predictor of the underlying bug-revealing capability of a test suite. Further, no strong correlation was found between bug-revealing capability and either code coverage or mutation analysis scores.
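To make the all-pairs idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical scoring rule (the fraction of other students' defective programs on which at least one test in a suite fails) and a toy absolute-value assignment; the names `all_pairs_scores`, `suite_fails`, and the three students are invented for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: a "program" is the function under test, and a
# "test" is a predicate that returns True when the program passes it.
Program = Callable[[int], int]
Test = Callable[[Program], bool]


def suite_fails(suite: List[Test], program: Program) -> bool:
    """Return True if at least one test in the suite fails on the program."""
    return any(not test(program) for test in suite)


def all_pairs_scores(
    suites: Dict[str, List[Test]],
    programs: Dict[str, Program],
    is_buggy: Dict[str, bool],
) -> Dict[str, float]:
    """Score each suite against every OTHER student's defective program.

    Assumed scoring rule (not taken from the paper): the score is the
    fraction of other students' buggy programs that the suite detects.
    """
    scores: Dict[str, float] = {}
    for author, suite in suites.items():
        targets = [n for n in programs if n != author and is_buggy[n]]
        if not targets:
            scores[author] = 0.0
            continue
        detected = sum(suite_fails(suite, programs[n]) for n in targets)
        scores[author] = detected / len(targets)
    return scores


# Toy assignment: implement abs(x). Only alice's solution is correct.
programs: Dict[str, Program] = {
    "alice": lambda x: -x if x < 0 else x,   # correct
    "bob": lambda x: x,                      # wrong for every negative input
    "carol": lambda x: -x if x < -1 else x,  # wrong only for x == -1
}
is_buggy = {"alice": False, "bob": True, "carol": True}

suites: Dict[str, List[Test]] = {
    "alice": [lambda p: p(-1) == 1, lambda p: p(2) == 2],
    "bob": [lambda p: p(5) == 5],
    "carol": [lambda p: p(-3) == 3],
}

scores = all_pairs_scores(suites, programs, is_buggy)
print(scores)
```

In this toy data, bob's single test executes all of his own one-line solution yet detects no defect in carol's program, which is the kind of gap between coverage and bug-revealing capability that the abstract describes.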



Published In

ICSE Companion 2014: Companion Proceedings of the 36th International Conference on Software Engineering
May 2014
741 pages
ISBN: 9781450327688
DOI: 10.1145/2591062
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Sponsors

In-Cooperation

  • TCSE: IEEE Computer Society's Tech. Council on Software Engin.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Software testing
  2. automated assessment
  3. automated grading
  4. mutation testing
  5. programming assignments
  6. test coverage
  7. test metrics
  8. test quality

Qualifiers

  • Article

Conference

ICSE '14

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%


Bibliometrics

Article Metrics

  • Downloads (last 12 months): 40
  • Downloads (last 6 weeks): 3
Reflects downloads up to 20 Jan 2025


Cited By

  • (2024) Investigating the graphical IEC 61131-3 language impact on test case design and evaluation of mechatronic apprentices. at - Automatisierungstechnik 72:3 (176-188). DOI: 10.1515/auto-2023-0162. Online publication date: 29-Feb-2024
  • (2024) How to Teach Programming in the AI Era? Using LLMs as a Teachable Agent for Debugging. Artificial Intelligence in Education (265-279). DOI: 10.1007/978-3-031-64302-6_19. Online publication date: 2-Jul-2024
  • (2023) A Model of How Students Engineer Test Cases With Feedback. ACM Transactions on Computing Education 24:1 (1-31). DOI: 10.1145/3628604. Online publication date: 20-Oct-2023
  • (2023) Systematic Literature Review on Test Case Quality Characteristics and Metrics. 2023 3rd International Conference on Emerging Smart Technologies and Applications (eSmarTA) (01-08). DOI: 10.1109/eSmarTA59349.2023.10293544. Online publication date: 10-Oct-2023
  • (2022) On the use of mutation analysis for evaluating student test suite quality. Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis (263-275). DOI: 10.1145/3533767.3534217. Online publication date: 18-Jul-2022
  • (2022) Students vs. professionals. Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings (294-296). DOI: 10.1145/3510454.3517058. Online publication date: 21-May-2022
  • (2022) Students vs. Professionals: Improving the Learning of Software Testing. 2022 IEEE/ACM 44th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion) (294-296). DOI: 10.1109/ICSE-Companion55297.2022.9793734. Online publication date: May-2022
  • (2022) Automated Assessment in Computer Science: A Bibliometric Analysis of the Literature. Learning Technologies and Systems (122-134). DOI: 10.1007/978-3-031-33023-0_11. Online publication date: 21-Nov-2022
  • (2021) Mutation testing and self/peer assessment. Proceedings of the 43rd International Conference on Software Engineering: Joint Track on Software Engineering Education and Training (231-240). DOI: 10.1109/ICSE-SEET52601.2021.00033. Online publication date: 25-May-2021
  • (2020) Integrating Testing Throughout the CS Curriculum. 2020 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW) (441-444). DOI: 10.1109/ICSTW50294.2020.00079. Online publication date: Oct-2020
