Abstract
The paper presents a brief technology survey of existing tools for implementing data ingestion pipelines in a classical Data Science project. Given the emergent nature of the technologies and the challenges associated with any Big Data project, we identify and discuss the main components of a data pipeline from a data engineering perspective. The data pipeline is illustrated with a case study from an ICT university project in which several teams of master's students competed to design and implement the best solution for a manufacturing data pipeline. The project proposes a research-based, multidisciplinary approach to education that aims to give students a novel role in the learning process, that of knowledge creators. Thus, on the one hand, the paper discusses the main components of a Big Data pipeline and, on the other hand, it shows how these components are addressed and implemented within a concrete educational ICT project carried out in close collaboration with the IT industry.
References
Cavanillas, J.M., Curry, E., Wahlster, W. (eds.): New Horizons for a Data-Driven Economy. A Roadmap for Usage and Exploitation of Big Data in Europe. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-21569-3
Marz, N., Warren, J.: Big Data. Principles and Best Practices of Scalable Realtime Data Systems. Manning, New York (2015)
White, T.: Hadoop: The Definitive Guide, 4th edn. O’Reilly Media Inc., USA (2015)
Sadalage, P.J., Fowler, M.: NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. Addison-Wesley, Pearson Education, Inc., Boston (2013)
Dong, X.L., Srivastava, D.: Big Data Integration, Synthesis Lectures on Data Management. Morgan & Claypool, San Rafael (2015)
Gartner: Data catalogs are the new black in data management and analytics. https://www.gartner.com. Accessed 22 Mar 2019
Cloud Security Alliance Homepage. https://cloudsecurityalliance.org/articles/csa-big-data-releases-top-10-security-privacy-challenges/. Accessed 22 Mar 2019
Curry, E., Ngonga, A., Domingue, J., Freitas, A., Strohbach, M., Becker, T., et al.: D2.2.2. Final version of the technical white paper. Public deliverable of the EU-Project BIG (318062; ICT-2011.4.4) (2014)
EDSA Homepage. http://edsa-project.eu/. Accessed 22 Mar 2019
LERU (League of European Research Universities): Excellent education in research rich universities (2017)
XDK Cross Domain Development Kit. https://xdk.bosch-connectivity.com/. Accessed 22 Mar 2019
High Performance Computing System and Private Cloud IBM IntelligentCluster, UBB. http://hpc.cs.ubbcluj.ro/. Accessed 22 Mar 2019
Bufnea, D., et al.: Babes-Bolyai University’s High Performance Computing Center. Studia Universitatis Babes-Bolyai, Informatica Series, vol. LXI, no. 2, pp. 54–69 (2016)
BYTE Project Homepage. http://new.byte-project.eu/. Accessed 7 Jan 2019
Employers’ Association of the Software and Services Industry, ANIS Scholarships. https://www.anis.ro/burse-anis. Accessed 22 Mar 2019
Harland, T.: Teaching to enhance research. High. Educ. Res. Dev. 35(3), 461–472 (2016)
STAR-UBB Institute (Institute of Advanced Studies in Science and Technology). http://starubb.institute.ubbcluj.ro/. Accessed 22 Mar 2019
Acknowledgement
The present study was carried out within the Advanced Fellowships 2018 programme offered by the STAR-UBB Institute of the Babes-Bolyai University [17]. The sensors and the manufacturing use case were provided by Robert Bosch SRL, Romania.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ciuciu, I., Ene, A.B., Lazar, C. (2019). An ICT Project Case Study from Education: A Technology Review for a Data Engineering Pipeline. In: Abramowicz, W., Corchuelo, R. (eds) Business Information Systems. BIS 2019. Lecture Notes in Business Information Processing, vol 353. Springer, Cham. https://doi.org/10.1007/978-3-030-20485-3_32
DOI: https://doi.org/10.1007/978-3-030-20485-3_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20484-6
Online ISBN: 978-3-030-20485-3
eBook Packages: Computer Science, Computer Science (R0)