Abstract
Interest in data science, especially within the context of graduate education, is exploding. In this paper we present initial results from an ongoing qualitative study of an interdisciplinary, cyberinfrastructure-focused, NSF-funded graduate data science education workshop hosted at an iSchool in the US. The complexity of the workshop curriculum, the participants’ and instructors’ disparate disciplinary backgrounds, and the technical tools employed are particularly suited to qualitative methods, which can synthesize all of these aspects from the rich observational, ethnographic, and trace data collected as part of the authors’ role on the grant’s qualitative evaluation team. The success of the workshop in equipping participants to do reproducible computational science was due in part to a successful acculturation process, whereby participants comprehended, altered, and enacted new norms amongst themselves. At the same time, we observed potential challenges for data science instruction resulting from the rhetorical framing of the technologies as inescapably new. This language, which mirrors that of a successful grant proposal, tends to obscure the deeply embedded and contingent history of the command-line technologies required to perform computational science, many of which are decades old. We conclude by describing our ongoing work, our theoretical sampling plans for this and future data, and the contributions that our findings can make to graduate data science curriculum development and pedagogy.
This work was partially supported by National Science Foundation grant 1730390.
Notes
1. Professional relevance in this context is not limited to academic science. A major theme we will address in future work is the role and involvement of industry in scientific training. Genomics researchers we talked to, for instance, noted that many of their peers completed their doctorates and went to work for social media or finance companies.
2. Vim is a clone of and improvement on vi, which was first released in 1976 as a visual-mode improvement on the ex line editor for UNIX systems. Vim itself was based upon the 1988 C code of an Amiga port of STEVIE (ST Editor for vi Enthusiasts), a 1987 vi clone for the Atari ST. Development of Vim has been near-constant since the 1990s, and new versions are released every few months.
3. One example is the discontinuity between Python 2 and 3, which played a role in the workshop when Participant 7 updated his old processing pipeline from Python 2 to Python 3 during the workshop’s project phase.
4. FORTRAN stands for Formula Translator and was intended for easily translating mathematical formulas, which many scientists were familiar with, into compiled machine code, which they were not.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Hauser, E., Sutherland, W. (2020). Temporality in Data Science Education: Early Results from a Grounded Theory Study of an NSF-Funded CyberTraining Workshop. In: Sundqvist, A., Berget, G., Nolin, J., Skjerdingstad, K. (eds) Sustainable Digital Communities. iConference 2020. Lecture Notes in Computer Science, vol 12051. Springer, Cham. https://doi.org/10.1007/978-3-030-43687-2_43
DOI: https://doi.org/10.1007/978-3-030-43687-2_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43686-5
Online ISBN: 978-3-030-43687-2
eBook Packages: Computer Science (R0)