ABSTRACT
Quantifying the value of developers’ code contributions to a software project requires more than simply counting lines of code or commits. We define the development value of code as a combination of its structural value (the effect of code reuse) and its non-structural value (the impact on development). We propose techniques to automatically calculate both components of development value and combine them using Learning to Rank. Our preliminary empirical study shows that our analysis yields richer results than those obtained by human assessment or simple counting methods and demonstrates the potential of our approach.
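As a rough illustration of how two per-contribution scores might be combined with a pairwise Learning to Rank scheme, the sketch below learns a linear weighting from a reference ordering. It is not the authors' implementation: the function names, the toy feature values, the reference labels, and the hinge-style update rule are all assumptions chosen for illustration.

```python
from itertools import combinations


def train_pairwise_ranker(features, labels, epochs=50, lr=0.1):
    """Learn linear weights from pairwise preferences (hypothetical helper).

    features: one (structural, non_structural) tuple per contribution.
    labels:   reference relevance per contribution; higher means more valuable.
    """
    weights = [0.0] * len(features[0])
    # Build (preferred, other) index pairs from the reference labels.
    pairs = [
        (i, j) if labels[i] > labels[j] else (j, i)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] != labels[j]
    ]
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [a - b for a, b in zip(features[better], features[worse])]
            margin = sum(w * d for w, d in zip(weights, diff))
            if margin < 1.0:
                # Hinge-style update: move the weights toward the preferred item.
                weights = [w + lr * d for w, d in zip(weights, diff)]
    return weights


def score(weights, feature):
    """Combined value of one contribution under the learned weights."""
    return sum(w * x for w, x in zip(weights, feature))


# Toy data (assumed, not taken from the paper): four contributions.
features = [(0.9, 0.8), (0.6, 0.5), (0.3, 0.4), (0.1, 0.1)]
labels = [4, 3, 2, 1]

weights = train_pairwise_ranker(features, labels)
ranking = sorted(range(len(features)),
                 key=lambda i: score(weights, features[i]),
                 reverse=True)
```

Here `ranking` lists contribution indices from most to least valuable under the learned weights. With real data the two features would trade off against each other, and a library ranker such as a ranking SVM would be a more robust choice than this hand-rolled update.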
Index Terms
- Towards quantifying the development value of code contributions