
R3: repeatability, reproducibility and rigor

Published: 18 March 2012

Abstract

Computer systems research spans sub-disciplines that include embedded systems, programming languages and compilers, networking, and operating systems. Our contention is that a number of structural factors inhibit high-quality systems research. We highlight some of the factors we have encountered in our own work and observed in published papers, and we propose solutions that could increase both the productivity of researchers and the quality of their output.

Published In

ACM SIGPLAN Notices, Volume 47, Issue 4a (Supplemental issue), April 2012
85 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2442776

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 18 March 2012
Published in SIGPLAN Volume 47, Issue 4a

Qualifiers

  • Column
