DOI: 10.1145/3368089.3409706
research-article
Artifacts Evaluated & Functional / v1.1

Docable: evaluating the executability of software tutorials

Published: 08 November 2020

ABSTRACT

The typical software tutorial includes step-by-step instructions for installing developer tools, editing files and code, and running commands. When these software tutorials are not executable, whether because of missing instructions, ambiguous steps, or simply broken commands, their value is diminished. Non-executable tutorials affect developers in several ways, including frustrating learning experiences and limited usability of developer tools.

To understand to what extent software tutorials are executable, and why they may fail, we conduct an empirical study on over 600 tutorials, including nearly 15,000 code blocks. We find that a naive execution strategy achieves an overall executability rate of only 26%. Even a human-annotation-based execution strategy, while doubling executability, still yields no tutorial that can successfully execute all steps. We identify several common executability barriers, ranging from potentially innocuous causes, such as interactive prompts requiring human responses, to insidious errors, such as missing steps and inaccessible resources. We validate our findings with major stakeholders in technical documentation and discuss possible strategies for improving software tutorials, such as providing accessible alternatives for tutorial takers, and investing in automated tutorial testing to ensure continuous quality of software tutorials.
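To make the idea of an "executability rate" concrete, here is a minimal sketch of what a naive execution strategy could look like. It is an illustration only and not the authors' Docable tool: it assumes tutorials are Markdown text with fenced code blocks, that every block is a shell snippet, and that a block counts as executable when it exits with status 0. The function names `extract_code_blocks`, `run_block`, and `executability_rate` are hypothetical.

```python
import re
import subprocess
from typing import List

# Build the fence marker programmatically so this example can itself
# be embedded in a fenced block.
FENCE_MARK = "`" * 3
FENCE_RE = re.compile(FENCE_MARK + r"[^\n]*\n(.*?)" + FENCE_MARK, re.DOTALL)


def extract_code_blocks(markdown: str) -> List[str]:
    """Return the body of every fenced code block, in document order."""
    return [body.strip() for body in FENCE_RE.findall(markdown)]


def run_block(block: str, timeout: float = 10.0) -> bool:
    """Run one block in a shell; success means exit status 0 (an assumption).

    stdin is closed, so a block that waits on an interactive prompt
    fails or times out instead of hanging.
    """
    try:
        result = subprocess.run(
            block,
            shell=True,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def executability_rate(markdown: str) -> float:
    """Fraction of code blocks that run successfully (0.0 if there are none)."""
    blocks = extract_code_blocks(markdown)
    if not blocks:
        return 0.0
    passed = sum(run_block(block) for block in blocks)
    return passed / len(blocks)


if __name__ == "__main__":
    # A toy two-step tutorial: the first step succeeds, the second fails.
    tutorial = (
        "Step 1\n" + FENCE_MARK + "bash\necho hello\n" + FENCE_MARK + "\n"
        "Step 2\n" + FENCE_MARK + "bash\nexit 3\n" + FENCE_MARK + "\n"
    )
    print(executability_rate(tutorial))
```

Each block here runs in a fresh shell, so state such as the working directory or exported variables does not carry over between steps. That simplification is one reason a naive strategy can understate how executable a tutorial really is.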

Supplemental Material

fse20main-p277-p-teaser.mp4 (mp4, 36.2 MB)
fse20main-p277-p-video.mp4 (mp4, 223.3 MB)

Published in

ESEC/FSE 2020: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2020
1703 pages
ISBN: 9781450370431
DOI: 10.1145/3368089

Copyright © 2020 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 112 of 543 submissions, 21%
