Research article. DOI: 10.1145/1882362.1882446

Validity concerns in software engineering research

Published: 07 November 2010

ABSTRACT

Empirical studies that use software repository artifacts have become popular in the last decade due to the ready availability of open source project archives. In this paper, we survey empirical studies in the last three years of ICSE and FSE proceedings and categorize them by whether they use open source or proprietary projects and by the diversity of their subject programs. Our survey shows that almost half (49%) of recent empirical studies used solely open source projects. Existing studies either draw general conclusions from these results or explicitly disclaim any conclusions that extend beyond the specific subject software.

We conclude that researchers in empirical software engineering must consider the external validity concerns that arise from using only a few well-known open source software projects, and that data source selection is an important discussion topic in software engineering research. Furthermore, we propose a community research infrastructure for software repository benchmarks and for sharing empirical analysis results, in order to address external validity concerns and to raise the bar for empirical software engineering research that analyzes software artifacts.

Published in

FoSER '10: Proceedings of the FSE/SDP workshop on Future of software engineering research
November 2010
460 pages
ISBN: 9781450304276
DOI: 10.1145/1882362

Copyright © 2010 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
