Measuring code behavioral similarity for programming and software engineering education

Published: 14 May 2016

Abstract

In recent years, online programming and software engineering education delivered via information technology has gained substantial popularity. Popular courses often have hundreds or thousands of students but only a few course staff members, so tool automation is needed to maintain the quality of education. In this paper, we envision that the capability of quantifying behavioral similarity between programs is helpful for teaching and learning programming and software engineering, and we propose three metrics that approximate the computation of behavioral similarity. Specifically, we leverage random testing and dynamic symbolic execution (DSE) to generate test inputs, and run programs on these test inputs to compute metric values of behavioral similarity. We evaluate our metrics on three real-world data sets from the Pex4Fun platform (which so far has accumulated more than 1.7 million game-play interactions). The results show that our metrics provide a highly accurate approximation of behavioral similarity. We also demonstrate a number of practical applications of our metrics, including hint generation, progress indication, and automatic grading.
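The core idea described in the abstract can be sketched as follows: generate a pool of test inputs (here by plain random testing; the paper also uses DSE), run two programs on the same inputs, and take the fraction of inputs on which their outputs agree as an approximation of behavioral similarity. The task, the buggy submission, and the metric name below are illustrative assumptions, not the paper's actual metrics or data.

```python
import random

# Hypothetical reference task: return the digit sum of a non-negative integer.
def reference(n):
    return sum(int(d) for d in str(n))

# Illustrative student submission with a bug for inputs above 99.
def submission(n):
    return sum(int(d) for d in str(n)) + (1 if n > 99 else 0)

def behavioral_similarity(p, q, inputs):
    """Fraction of shared test inputs on which p and q produce equal outputs."""
    agree = sum(1 for x in inputs if p(x) == q(x))
    return agree / len(inputs)

# Random testing: sample inputs from the task's input domain.
random.seed(0)
tests = [random.randint(0, 999) for _ in range(1000)]

score = behavioral_similarity(reference, submission, tests)
print(score)  # close to 0.1, since the programs agree only for n <= 99
```

A grading or hint-generation tool could rank submissions by such a score against a reference solution; DSE-generated inputs would additionally target branch conditions that uniform random sampling is unlikely to hit.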


    Published In

    ICSE '16: Proceedings of the 38th International Conference on Software Engineering Companion
    May 2016
    946 pages
    ISBN:9781450342056
    DOI:10.1145/2889160
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    Overall Acceptance Rate 276 of 1,856 submissions, 15%
