Making Reproducible Research Simple Using RMarkdown and the OSF

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12194)

Abstract

The replication crisis has further eroded the public’s trust in science. Many famous studies, even those published in renowned journals, fail to produce the same results when replicated by other researchers. While this is the outcome of several problems in research, one aspect has received critical attention: reproducibility. The term reproducible research refers to studies that provide all materials necessary for other researchers to reproduce the scientific results. This allows others to identify flaws in calculations and improve scientific rigor. In this paper, we present a workflow for reproducible research using the R language and a set of additional packages and tools that simplify the procedure.

Notes

  1. https://cran.r-project.org/web/views/ReproducibleResearch.html

  2. https://www.markdowntutorial.com/

  3. https://github.com/rocker-org/rocker

  4. https://sumidu.github.io/reproducibleR/

References

  1. Aggarwal, C.C., Yu, P.S.: A general survey of privacy-preserving data mining models and algorithms. In: Aggarwal, C.C., Yu, P.S. (eds.) Privacy-Preserving Data Mining, vol. 34, pp. 11–52. Springer, Heidelberg (2008). https://doi.org/10.1007/978-0-387-70992-5_2

  2. Aust, F.: citr: RStudio Add-in to Insert Markdown Citations. R package version 0.3.2. (2019). https://CRAN.R-project.org/package=citr

  3. Baker, M.: Reproducibility crisis. Nature 533(26), 353–66 (2016)

  4. Barnier, J.: rmdformats: HTML Output Formats and Templates for ‘rmarkdown’ Documents. R package version 0.3.6. (2019). https://CRAN.R-project.org/package=rmdformats

  5. Barnier, J., Briatte, F., Larmarange, J.: questionr: Functions to Make Surveys Processing Easier. R package version 0.7.0. (2018). https://CRAN.R-project.org/package=questionr

  6. Bryan, J.: Excuse me, do you have a moment to talk about version control? Am. Stat. 72(1), 20–27 (2018)

  7. Valdez, A.C.: rmdtemplates: An Opinionated Collection of rmarkdown Templates. R package version 0.4.0.0000. (2020). https://github.com/statisticsforsocialscience/rmd_templates

  8. Chang, W.: webshot: Take Screenshots of Web Pages. R package version 0.5.2. (2019). https://CRAN.R-project.org/package=webshot

  9. Colquhoun, D.: The reproducibility of research and the misinterpretation of p-values. Roy. Soc. Open Sci. 4(12), 171085 (2017)

  10. Dumas, J., Marwick, B., Shotwell, G.: gramr: The Grammar of Grammar. R package version 0.0.0.9000. (2020). https://github.com/ropenscilabs/gramr

  11. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Halevi, S., Rabin, T. (eds.) TCC 2006. LNCS, vol. 3876, pp. 265–284. Springer, Heidelberg (2006). https://doi.org/10.1007/11681878_14

  12. Gentleman, R., Lang, D.T.: Statistical analyses and reproducible research. J. Comput. Graph. Stat. 16(1), 1–23 (2007)

  13. Head, M.L., et al.: The extent and consequences of p-hacking in science. PLoS Biol. 13(3), e1002106 (2015)

  14. Hendricks, P.: anonymizer: Anonymize data containing personally identifiable information. R package version 0.2.2. (2020). https://github.com/paulhendricks/anonymizer

  15. Iannone, R.: DiagrammeR: Graph/Network Visualization. R package version 1.1.0. (2020). https://github.com/rich-iannone/DiagrammeR

  16. Kerr, N.L.: HARKing: hypothesizing after the results are known. Pers. Soc. Psychol. Rev. 2(3), 196–217 (1998)

  17. Landau, W.M.: drake: A Pipeline Toolkit for Reproducible Computation at Scale. R package version 7.10.0. (2020). https://CRAN.R-project.org/package=drake

  18. Lee, J., Clifton, C.: How much is enough? Choosing ε for differential privacy. Inf. Secur. 7001, 325–340 (2011)

  19. Li, N., Li, T., Venkatasubramanian, S.: t-closeness: Privacy beyond k-anonymity and l-diversity. In: 2007 IEEE 23rd International Conference on Data Engineering, pp. 106–115. IEEE (2007)

  20. Machanavajjhala, A., et al.: l-diversity: privacy beyond k-anonymity. ACM Trans. Knowl. Discov. Data (TKDD) 1(1), 3 (2007)

  21. Marwick, B.: rrtools: Creates a Reproducible Research Compendium. R package version 0.1.0. (2019). https://github.com/benmarwick/rrtools

  22. Marwick, B., Boettiger, C., Mullen, L.: Packaging data analytical work reproducibly using R (and friends). Am. Stat. 72(1), 80–88 (2018)

  23. Meyer, F., Perrier, V.: esquisse: Explore and Visualize Your Data Interactively. R package version 0.3.0. (2020). https://CRAN.R-project.org/package=esquisse

  24. Meyers, N.K.: Reproducible Research and the Open Science Framework (2017). https://osf.io/458u9/

  25. Müller, K.: here: A Simpler Way to Find Your Files. R package version 0.1. (2017). https://CRAN.R-project.org/package=here

  26. Open Science Collaboration et al.: Estimating the reproducibility of psychological science. Science 349(6251), aac4716 (2015)

  27. Patil, I.: ggstatsplot: “ggplot2” Based Plots with Statistical Details. R package version 0.2.0. (2020). https://CRAN.R-project.org/package=ggstatsplot

  28. Revelle, W.: psych: Procedures for Psychological, Psychometric, and Personality Research. R package version 1.9.12.31. (2020). https://CRAN.R-project.org/package=psych

  29. Simonsohn, U., Nelson, L.D., Simmons, J.P.: p-curve and effect size: correcting for publication bias using only significant results. Perspect. Psychol. Sci. 9(6), 666–681 (2014)

  30. Templ, M., Meindl, B., Kowarik, A.: sdcMicro: Statistical Disclosure Control Methods for Anonymization of Data and Risk Estimation. R package version 5.5.1. (2020). https://CRAN.R-project.org/package=sdcMicro

  31. Ushey, K.: renv: Project Environments. R package version 0.9.3-30. (2020). https://rstudio.github.io/renv

  32. Ushey, K., et al.: packrat: A Dependency Management System for Projects and their R Package Dependencies. R package version 0.5.0. (2018). https://CRAN.R-project.org/package=packrat

  33. Wickham, H.: forcats: Tools for Working with Categorical Variables (Factors) (2020). http://forcats.tidyverse.org, https://github.com/tidyverse/forcats

  34. Wickham, H.: tidyverse: Easily Install and Load the ‘Tidyverse’. R package version 1.3.0. (2019). https://CRAN.R-project.org/package=tidyverse

  35. Wickham, H., Bryan, J.: usethis: Automate Package and Project Setup. R package version 1.5.1. (2019). https://CRAN.R-project.org/package=usethis

  36. Wickham, H., Seidel, D.: scales: Scale Functions for Visualization. R package version 1.1.0. (2019). https://CRAN.R-project.org/package=scales

  37. Wilson, G., et al.: Good enough practices in scientific computing. PLoS Comput. Biol. 13(6), e1005510 (2017)

  38. Wolen, A., Hartgerink, C.: osfr: Interface to the ‘Open Science Framework’ (‘OSF’). R package version 0.2.8. (2020). https://CRAN.R-project.org/package=osfr

  39. Xie, Y.: knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.28. (2020). https://CRAN.R-project.org/package=knitr

  40. Zhu, H.: kableExtra: Construct Complex Table with ‘kable’ and Pipe Syntax. R package version 1.1.0. (2019). https://CRAN.R-project.org/package=kableExtra

Acknowledgements

This research was supported by the Digital Society research program funded by the Ministry of Culture and Science of the German State of North Rhine-Westphalia. We would further like to thank the authors of the packages we have used. We used the following packages to create this document: knitr [39], tidyverse [34], rmdformats [4], kableExtra [40], scales [36], psych [28], rmdtemplates [7], sdcMicro [30], webshot [8], here [25], DiagrammeR [15], citr [2], drake [17], esquisse [23], usethis [35], gramr [10], questionr [5], ggstatsplot [27].

Author information

Corresponding author

Correspondence to André Calero Valdez.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Calero Valdez, A. (2020). Making Reproducible Research Simple Using RMarkdown and the OSF. In: Meiselwitz, G. (eds) Social Computing and Social Media. Design, Ethics, User Behavior, and Social Network Analysis. HCII 2020. Lecture Notes in Computer Science, vol 12194. Springer, Cham. https://doi.org/10.1007/978-3-030-49570-1_3

  • DOI: https://doi.org/10.1007/978-3-030-49570-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-49569-5

  • Online ISBN: 978-3-030-49570-1

  • eBook Packages: Computer Science, Computer Science (R0)
