DOI: 10.1145/3427228.3427258
research-article

Understanding Promotion-as-a-Service on GitHub

Published: 08 December 2020

Abstract

As the world’s leading software development platform, GitHub has become a social networking site for programmers and recruiters, who leverage its social features, such as stars and forks, for career and business development. In this paper, however, we found a group of GitHub accounts, called “promoters”, that sell promotion services on GitHub by performing paid star and fork operations on specified repositories. We also uncovered a stealthy way of tampering with historical commits, through which these promoters are able to fake commits retroactively. By using such a promotion service, any GitHub user can pretend to be a skillful developer with high influence.
To understand promotion services on GitHub, we first investigated GitHub’s underground promotion market and identified 1,023 suspected promotion accounts in it. We then developed a Support Vector Machine (SVM) classifier to detect promotion accounts among all active users extracted from GH Archive data covering 2015 to 2019. In total, we detected 63,872 suspected promotion accounts. Further analysis of these accounts showed that (1) a hidden functionality in GitHub is abused to boost the reputation of an account by forging historical commits and (2) a group of small businesses exploit GitHub promotion services to promote their products. We estimated that suspicious promoters could have made a profit of $3.41 million in 2018 and $4.37 million in 2019.
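The detection step can be illustrated with a minimal sketch. It assumes GH Archive records in their newline-delimited JSON form, where each event carries `type`, `actor.login`, and `repo.name`. The abstract does not list the classifier's features, so the two features below, the helper names (`account_features`, `train_linear_svm`, `predict`), and the small hand-written linear SVM trainer are illustrative assumptions, not the authors' implementation.

```python
import json
from collections import Counter, defaultdict

# "WatchEvent" is the event GitHub records when a user stars a repository;
# "ForkEvent" is recorded on a fork. Treating these two as "promotion-like"
# is an assumption made for this sketch.
PROMO_TYPES = ("WatchEvent", "ForkEvent")


def account_features(lines):
    """Turn newline-delimited GH Archive events into one vector per account.

    Hypothetical features (not taken from the paper):
      [share of star/fork events, distinct repositories touched per event]
    """
    type_counts = defaultdict(Counter)
    repos = defaultdict(set)
    for line in lines:
        event = json.loads(line)
        login = event["actor"]["login"]
        type_counts[login][event["type"]] += 1
        repos[login].add(event["repo"]["name"])
    features = {}
    for login, counts in type_counts.items():
        total = sum(counts.values())
        promo = sum(counts[t] for t in PROMO_TYPES)
        features[login] = [promo / total, len(repos[login]) / total]
    return features


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=1000):
    """Fit a linear SVM by subgradient descent on the L2-regularised hinge loss.

    Labels in y must be +1 (suspected promoter) or -1 (ordinary account).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Regularisation shrinks the weights on every step.
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                # The hinge loss is active: move the boundary away from this sample.
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def predict(w, b, x):
    """Return +1 if x falls on the promoter side of the boundary, else -1."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
```

In this sketch, the accounts identified from the underground market would supply the +1 labels and a sample of ordinary accounts the -1 labels. The vectors returned by `account_features` for all remaining active users would then be passed to `predict`. In practice, a library implementation such as scikit-learn's `sklearn.svm.SVC` would normally replace the hand-written trainer.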

Cited By

  • (2024) Stealthy Backdoor Attack for Code Models. IEEE Transactions on Software Engineering 50:4 (721-741). DOI: 10.1109/TSE.2024.3361661. Online publication date: Apr-2024
  • (2021) Experiences and insights from using Github Classroom to support Project-Based Courses. 2021 Third International Workshop on Software Engineering Education for the Next Generation (SEENG) (31-35). DOI: 10.1109/SEENG53126.2021.00013. Online publication date: May-2021
  • (2021) The Wonderless Dataset for Serverless Computing. 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR) (565-569). DOI: 10.1109/MSR52588.2021.00075. Online publication date: May-2021

          Published In

          ACSAC '20: Proceedings of the 36th Annual Computer Security Applications Conference
          December 2020
          962 pages
          ISBN:9781450388580
          DOI:10.1145/3427228
          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Author Tags

          1. GitHub
          2. Promoter Detection
          3. Promotion-as-a-Service

          Qualifiers

          • Research-article
          • Research
          • Refereed limited

          Conference

          ACSAC '20

          Acceptance Rates

          Overall Acceptance Rate 104 of 497 submissions, 21%
