DOI: 10.1145/3587102.3588860
research-article

Why We Need Open Data in Computer Science Education Research

Published: 30 June 2023

ABSTRACT

Innovation and technology in computer science education are driven by research and practice. Both activities involve gathering and analyzing data in order to develop new tools, methods (including software), or strategies that address recent challenges in the field. However, the data underlying any new solution is rarely shared, reused, or recognized. This is because publishing research data poses a number of challenges for researchers, while the benefits of doing so remain low. As a result, further analyses of data as part of secondary research are uncommon in the computer science education community. Therefore, the authors of this position paper critically reflect on current practices related to the publication of research data in this community. Moreover, a path forward is outlined for future conferences, such as ITiCSE, to become increasingly FAIR and open with regard to research data.

      • Published in
        ITiCSE 2023: Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 1
        June 2023
        694 pages
        ISBN: 9798400701382
        DOI: 10.1145/3587102

        Copyright © 2023 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 552 of 1,613 submissions, 34%
