ABSTRACT
Innovation and technology in computer science education are driven by research and practice. Both activities involve gathering and analyzing data in order to develop new tools and methods, including software, or strategies to solve recent challenges in the field. However, the data underlying any new solution is rarely shared, reused, or recognized. This is because publishing research data poses a number of challenges for researchers, while the benefits of publishing data remain low. As a result, further analyses of data as part of secondary research are uncommon in the computer science education community. The authors of this position paper therefore critically reflect on current practices related to the publication of research data in this community. Moreover, they outline a path forward for future conferences, such as ITiCSE, to become increasingly FAIR and open with regard to research data.