DOI:10.1145/3236024.3264842
research-article

Towards quantifying the development value of code contributions

Published: 26 October 2018

ABSTRACT

Quantifying the value of developers’ code contributions to a software project requires more than simply counting lines of code or commits. We define the development value of code as a combination of its structural value (the effect of code reuse) and its non-structural value (the impact on development). We propose techniques to automatically calculate both components of development value and combine them using Learning to Rank. Our preliminary empirical study shows that our analysis yields richer results than those obtained by human assessment or simple counting methods and demonstrates the potential of our approach.

      • Published in

        ESEC/FSE 2018: Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
        October 2018
        987 pages
        ISBN:9781450355735
        DOI:10.1145/3236024

        Copyright © 2018 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 112 of 543 submissions, 21%
