DOI: 10.1145/2597008.2597143
Article

Plagiarism detection for multithreaded software based on thread-aware software birthmarks

Published: 02 June 2014

Abstract

The availability of inexpensive multicore hardware marks a turning point in software development. To benefit from the continued exponential throughput gains of new processors, software applications must be multithreaded. As multithreaded programs become increasingly popular, their plagiarism has started to plague the software industry. Although software plagiarism detection technology has progressed tremendously, existing dynamic approaches remain optimized for sequential programs and cannot be applied to multithreaded programs without significant redesign. This paper fills the gap by presenting two dynamic birthmark-based approaches: the first extracts key instructions, while the second extracts system calls. Both account for the effect of thread scheduling when computing software birthmarks. We have implemented a prototype based on the Pin instrumentation framework. Our empirical study shows that the proposed approaches can effectively detect plagiarism of multithreaded programs and exhibit strong resilience to various semantics-preserving code obfuscations.
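
The abstract does not spell out how the birthmarks are computed, so the following is only an illustrative sketch of one way a system-call birthmark could be made insensitive to thread scheduling. It is not the paper's algorithm. It assumes a hypothetical trace format of (thread id, system call name) pairs, splits the trace into per-thread slices, collects k-grams within each slice, and compares two birthmarks with a multiset Jaccard ratio. All function names, the trace format, and the choice of k-grams and Jaccard similarity are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical trace event: (thread id, system call name).
Event = Tuple[int, str]


def per_thread_slices(trace: Iterable[Event]) -> Dict[int, List[str]]:
    """Group a globally interleaved trace into one call sequence per thread."""
    slices: Dict[int, List[str]] = defaultdict(list)
    for tid, call in trace:
        slices[tid].append(call)
    return dict(slices)


def kgrams(seq: List[str], k: int) -> Counter:
    """Count every window of k consecutive calls in a single sequence."""
    return Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))


def thread_aware_birthmark(trace: Iterable[Event], k: int = 2) -> Counter:
    """Merge per-thread k-gram counts; thread ids themselves are discarded."""
    birthmark: Counter = Counter()
    for calls in per_thread_slices(trace).values():
        birthmark.update(kgrams(calls, k))
    return birthmark


def similarity(a: Counter, b: Counter) -> float:
    """Multiset Jaccard ratio in [0, 1]; two empty birthmarks count as equal."""
    union = sum((a | b).values())
    if union == 0:
        return 1.0
    return sum((a & b).values()) / union


if __name__ == "__main__":
    # Two runs of the same (assumed) program under different schedules.
    run_a = [(1, "open"), (2, "socket"), (1, "read"),
             (2, "send"), (1, "close"), (2, "close")]
    run_b = [(2, "socket"), (2, "send"), (1, "open"),
             (1, "read"), (2, "close"), (1, "close")]
    print(similarity(thread_aware_birthmark(run_a),
                     thread_aware_birthmark(run_b)))
```

Because k-grams are taken inside each thread's slice, the two interleavings in the usage example yield the same birthmark, whereas k-grams taken over the raw interleaved trace would differ between runs. Real thread interactions (locks, shared state) can still change per-thread behavior, which this sketch ignores.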


    Published In

    ICPC 2014: Proceedings of the 22nd International Conference on Program Comprehension
    June 2014
    325 pages
    ISBN:9781450328791
    DOI:10.1145/2597008

    Sponsors

    In-Cooperation

    • TCSE: IEEE Computer Society's Tech. Council on Software Engin.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Multithreaded Program
    2. Plagiarism Detection
    3. Software Birthmark

    Qualifiers

    • Article

    Conference

    ICSE '14
    Cited By

    • (2024) Software Plagiarism Finder. International Journal of Advanced Research in Science, Communication and Technology, pp. 535-543. DOI: 10.48175/IJARSCT-19453. Online publication date: 29-Aug-2024
    • (2023) An Overview on the Identification of Software Birthmarks for Software Protection. Proceedings of International Conference on Information Technology and Applications, pp. 323-330. DOI: 10.1007/978-981-19-9331-2_27. Online publication date: 19-May-2023
    • (2021) Software Birthmark Usability for Source Code Transformation Using Machine Learning Algorithms. Scientific Programming, vol. 2021. DOI: 10.1155/2021/5547766. Online publication date: 1-Jan-2021
    • (2021) Plagiarism Detection of Multi-threaded Programs Using Frequent Behavioral Pattern Mining. International Journal of Software Engineering and Knowledge Engineering, 30:11n12, pp. 1667-1688. DOI: 10.1142/S0218194020400252. Online publication date: 21-Jan-2021
    • (2020) Modelling Features-Based Birthmarks for Security of End-to-End Communication System. Security and Communication Networks, vol. 2020. DOI: 10.1155/2020/8852124. Online publication date: 1-Jan-2020
    • (2020) Revisiting the Challenges and Opportunities in Software Plagiarism Detection. 2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 537-541. DOI: 10.1109/SANER48275.2020.9054847. Online publication date: Feb-2020
    • (2020) Plagiarism Detection of Multi-Threaded Programs via Siamese Neural Networks. IEEE Access, vol. 8, pp. 160802-160814. DOI: 10.1109/ACCESS.2020.3021184. Online publication date: 2020
    • (2019) Birthmark based identification of software piracy using Haar wavelet. Mathematics and Computers in Simulation. DOI: 10.1016/j.matcom.2019.04.010. Online publication date: May-2019
    • (2019) Software Birthmark Design and Estimation: A Systematic Literature Review. Arabian Journal for Science and Engineering. DOI: 10.1007/s13369-019-03718-9. Online publication date: 16-Jan-2019
    • (2018) Reviving Sequential Program Birthmarking for Multithreaded Software Plagiarism Detection. IEEE Transactions on Software Engineering, 44:5, pp. 491-511. DOI: 10.1109/TSE.2017.2688383. Online publication date: 1-May-2018
