Towards quantifying the development value of code contributions

Authors:
Jinglei Ren, Microsoft Research, China
Hezheng Yin, University of California at Berkeley, USA
Qingda Hu, Tsinghua University, China
Armando Fox, University of California at Berkeley, USA
Wojciech Koszek, FreeBSD Project, USA

ESEC/FSE 2018: Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, October 2018, Pages 775–779. https://doi.org/10.1145/3236024.3264842

Published: 26 October 2018

ABSTRACT

Quantifying the value of developers’ code contributions to a software project requires more than simply counting lines of code or commits. We define the development value of code as a combination of its structural value (the effect of code reuse) and its non-structural value (the impact on development). We propose techniques to automatically calculate both components of development value and combine them using Learning to Rank. Our preliminary empirical study shows that our analysis yields richer results than those obtained by human assessment or simple counting methods and demonstrates the potential of our approach.
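
To make the idea of combining two value components with Learning to Rank concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not specify the ranking algorithm or the feature scales, so the pairwise perceptron update, the function names, and the sample scores are all assumptions chosen for illustration. Each contribution has a hypothetical (structural, non-structural) score pair, and the weights are learned from pairwise preferences of the form "contribution A is more valuable than contribution B".

```python
# Illustrative sketch only: a pairwise learning-to-rank perceptron that
# learns how to weight two per-contribution scores. All names and data
# below are hypothetical and are not taken from the paper.

def train_pairwise(features, preferences, epochs=100, lr=0.1):
    """Learn weights w so that w . f(better) > w . f(worse) for each pair.

    features:    dict mapping contribution id -> (structural, non_structural)
    preferences: list of (better_id, worse_id) pairs, e.g. from human labels
    """
    w = [0.0, 0.0]
    for _ in range(epochs):
        updated = False
        for better, worse in preferences:
            # Feature difference between the preferred and the other item.
            diff = [a - b for a, b in zip(features[better], features[worse])]
            margin = sum(wi * di for wi, di in zip(w, diff))
            if margin <= 0:  # pair is misordered (or tied): nudge the weights
                w = [wi + lr * di for wi, di in zip(w, diff)]
                updated = True
        if not updated:  # every preference is satisfied
            break
    return w


def development_value(w, feature):
    """Combined score: weighted sum of the two components."""
    return sum(wi * fi for wi, fi in zip(w, feature))


def rank(features, w):
    """Return contribution ids ordered from highest to lowest combined score."""
    return sorted(features,
                  key=lambda name: development_value(w, features[name]),
                  reverse=True)


# Hypothetical scores: (structural value, non-structural value).
FEATURES = {"c1": (0.9, 0.2), "c2": (0.4, 0.8), "c3": (0.1, 0.1)}
# Hypothetical pairwise judgments: the first id is preferred over the second.
PREFERENCES = [("c2", "c1"), ("c1", "c3"), ("c2", "c3")]

if __name__ == "__main__":
    weights = train_pairwise(FEATURES, PREFERENCES)
    print(rank(FEATURES, weights))
```

In this toy data the preferences are linearly separable, so the loop stops as soon as all three pairs are ordered correctly. A real study would likely use an established ranking model and many more labeled pairs.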

Index Terms

1. General and reference
  1. Cross-computing tools and techniques
    1. Metrics
2. Software and its engineering
  1. Software creation and management
    1. Software post-development issues

Published in
ESEC/FSE 2018: Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
October 2018
987 pages
ISBN: 9781450355735
DOI: 10.1145/3236024
General Chair: Gary T. Leavens, University of Central Florida, USA
Program Chairs: Alessandro Garcia, PUC-Rio, Brazil; Corina S. Păsăreanu, NASA Ames Research Center, USA
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
call graph
development value
learning to rank
software repository mining
static program analysis
value of code
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 112 of 543 submissions, 21%