Data science technology course: The design, assessment and computing environment perspectives

Education and Information Technologies

Abstract

This article discusses the key elements of the Data Science Technology course offered to postgraduate students enrolled in the Master of Data Science program. The course complements the existing curriculum by providing the skills to work with Big Data platforms and tools alongside data science activities. We frame the discussion around three main requirements: the need to develop key skills from two dimensions, namely Data Science and Big Data; the need for a cluster-based computing platform; and the need for that platform to be accessible. We address these requirements by presenting the course design and its assessments, the configuration of the computing platform, and the strategy for enabling flexible access. In terms of course design, the course contributes several innovative elements and covers multiple key areas of the data science body of knowledge and multiple quadrants of the job and skills matrix. In terms of the computing platform, we established a stable deployment of a Hadoop cluster with flexible accessibility, a requirement triggered by the pandemic. Furthermore, our experience with implementing the cluster shows that it can handle computing problems with larger datasets than those used in the semesters within the scope of the study. We also provide some reflections and highlight future improvements.

Availability of data and materials

The datasets generated during and/or analyzed during the current study are not publicly available due to security reasons, but are available from the corresponding author on reasonable request.

Abbreviations

  • AI: Artificial Intelligence
  • BDA: Big Data Analytics
  • BoK: Body of Knowledge
  • DSA: Data Science and Analytics
  • EDA: Exploratory Data Analysis
  • HDFS: Hadoop Distributed File System
  • IoT: Internet of Things
  • MCO: Movement Control Order
  • MDEC: Malaysia Digital Economy Corporation
  • PC: Personal Computer
  • UiTM: Universiti Teknologi MARA
  • VM: Virtual Machine
  • YARN: Yet Another Resource Negotiator

Acknowledgements

We would like to take this opportunity to thank the School of Computing Sciences (formerly known as the Faculty of Computer and Mathematical Sciences), College of Computing, Informatics and Media, and Universiti Teknologi MARA (UiTM) for providing support to deploy the cluster for this course.

Author information

Corresponding author

Correspondence to Azlan Ismail.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ismail, A., Mutalib, S. & Haron, H. Data science technology course: The design, assessment and computing environment perspectives. Educ Inf Technol 28, 10209–10234 (2023). https://doi.org/10.1007/s10639-022-11558-8

  • DOI: https://doi.org/10.1007/s10639-022-11558-8
