Abstract
Repeating research in computer science requires more than just code and data: it requires an appropriate environment in which to run experiments. In some cases, this environment appears fairly straightforward: it consists of a particular operating system and a set of required libraries. In many cases, however, it is considerably more complex: the execution environment may be an entire network, may involve complex and fragile configuration of dependencies, or may require large amounts of resources such as computation cycles, network bandwidth, or storage. Even the "straightforward" case turns out to be surprisingly intricate: there may be explicit or hidden dependencies on compilers, kernel quirks, details of the ISA, and so on. The result is that when one tries to repeat published results, creating an environment sufficiently similar to the one in which the experiment was originally run can be troublesome; this problem only gets worse as time passes. What the computer science community needs, then, are environments that have the explicit goal of enabling repeatable research. This paper outlines the problem of repeatable research environments, presents a set of requirements for such environments, and describes one facility that attempts to address them.
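To make the "explicit or hidden dependencies" point concrete, here is a minimal, hypothetical Python sketch of the kind of environment fingerprint an experimenter might record alongside their results, so that a later repetition attempt can diff its environment against the original. This is not the facility the paper describes; the function names (`environment_fingerprint`, `compiler_version`) and the choice of recorded fields are illustrative assumptions.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: record a coarse fingerprint of the execution
environment (OS, kernel, ISA, compiler, interpreter) so that later
attempts to repeat an experiment can compare environments."""
import json
import platform
import subprocess
import sys


def compiler_version(cc: str = "cc") -> str:
    """Best-effort query of the system C compiler's version string."""
    try:
        out = subprocess.run([cc, "--version"], capture_output=True,
                             text=True, check=True)
        return out.stdout.splitlines()[0]
    except (OSError, subprocess.CalledProcessError):
        return "unavailable"


def environment_fingerprint() -> dict:
    """Collect environment details that often differ silently between
    the original run and a repetition attempt."""
    return {
        "os": platform.platform(),      # distribution and kernel version
        "machine": platform.machine(),  # ISA, e.g. x86_64 vs. aarch64
        "python": sys.version,
        "compiler": compiler_version(),
        # Installed-package versions would go here as well, e.g. the
        # output of `dpkg -l` or `pip freeze`, depending on the platform.
    }


if __name__ == "__main__":
    # Store this alongside experimental results; a mismatch when
    # repeating the work points to a candidate hidden dependency.
    json.dump(environment_fingerprint(), sys.stdout, indent=2)
```

Even a coarse snapshot like this makes some of the abstract's "hidden dependencies" visible; a repeatable-research facility goes further by provisioning an environment that matches the recorded one, rather than merely documenting the mismatch.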