DOI: 10.1145/3183440.3195007
poster
Public Access

Git blame who?: stylistic authorship attribution of small, incomplete source code fragments

Published: 27 May 2018

ABSTRACT

Program authorship attribution has implications for the privacy of programmers who wish to contribute code anonymously. While previous work has shown that complete files that are individually authored can be attributed, these efforts have focused on ideal data sets such as the Google Code Jam data. We explore the problem of attribution "in the wild," examining source code obtained from open source version control systems, and investigate if and how such contributions can be attributed to their authors, either individually or on a per-account basis. In this work we show that accounts belonging to open source contributors containing short, incomplete, and typically uncompilable fragments can be effectively attributed.


    • Published in

      ICSE '18: Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings
      May 2018
      231 pages
      ISBN:9781450356633
      DOI:10.1145/3183440
      • Conference Chair: Michel Chaudron
      • General Chair: Ivica Crnkovic
      • Program Chairs: Marsha Chechik, Mark Harman

      Copyright © 2018 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • poster

      Acceptance Rates

      Overall Acceptance Rate: 276 of 1,856 submissions, 15%
