An exploratory teaching program in big data analysis for undergraduate students

Eken, Süleyman

doi:10.1007/s12652-020-02447-4

Original Research
Published: 14 August 2020

Journal of Ambient Intelligence and Humanized Computing, volume 11, pages 4285–4304 (2020)

Süleyman Eken ORCID: orcid.org/0000-0001-9488-908X¹

Abstract

Many of the world’s biggest discoveries and decisions in science, technology, business, medicine, politics, and society as a whole are now made on the basis of analyzing massive datasets. In this paper, an exploratory teaching program is proposed that provides a broad and practical introduction to big data analysis. The program was designed and taught in the Department of Computer Engineering at Kocaeli University in the spring semester of the 2018–2019 academic year. To assess the program’s impact on the learning process and to evaluate students’ acceptance and satisfaction levels, students answered a questionnaire after completing the program. According to their feedback, the exploratory teaching program is useful for learning how to analyze large datasets and identify patterns that can improve a company’s or organization’s decision-making process.

Notes

Apache Hadoop (2011) http://hadoop.apache.org, Accessed 3 Jan 2020.
Jupyter notebooks (2011) www.jupyter.org, Accessed 3 Jan 2020.
Anaconda (2011) https://www.anaconda.com/, Accessed 3 Jan 2020.
Countries of the World dataset (2018) https://www.kaggle.com/fernandol/countries-of-the-world, Accessed 5 Jan 2020.
Gartner (2020) https://www.gartner.com/en, Accessed 5 Jan 2020.
Gartner BI report (2020) https://www.gartner.com/reviews/market/analytics-business-intelligence-platforms, Accessed 5 Jan 2020.
Black Friday dataset (2016) https://datahack.analyticsvidhya.com/contest/black-friday/, Accessed 5 Jan 2020.
Heart Disease UCI dataset (2018) https://www.kaggle.com/ronitf/heart-disease-uci, Accessed 5 Jan 2020.
House Prices dataset (2017) https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data, Accessed 5 Jan 2020.
World University Rankings dataset (2019) https://www.kaggle.com/mylesoneill/world-university-rankings, Accessed 5 Jan 2020.
Sensorless Drive Diagnosis dataset (2019) https://archive.ics.uci.edu/ml/datasets/Dataset+for+Sensorless+Drive+Diagnosis, Accessed 5 Jan 2020.
Football World Cup 2018 dataset (2018) https://www.kaggle.com/sawya34/football-world-cup-2018-dataset, Accessed 5 Jan 2020.
Hadoop Design Patterns (2012) https://github.com/adamjshook/mapreducepatterns, Accessed 6 Jan 2020.
Apache Zeppelin (2015) https://zeppelin.apache.org/, Accessed 6 Jan 2020.
Company Acquisitions dataset (2018) https://www.kaggle.com/shivamb/company-acquisitions-7-top-companies, Accessed 6 Jan 2020.

Acknowledgements

I would like to thank GOSB Technology Manager Engin Işık for his support in the survey conducted with big data sector companies.

Author information

Authors and Affiliations

Department of Information Systems Engineering, Kocaeli University, 41001, Kocaeli, Turkey
Süleyman Eken

Corresponding author

Correspondence to Süleyman Eken.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Eken, S. An exploratory teaching program in big data analysis for undergraduate students. J Ambient Intell Human Comput 11, 4285–4304 (2020). https://doi.org/10.1007/s12652-020-02447-4

Received: 25 January 2020
Accepted: 30 July 2020
Published: 14 August 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s12652-020-02447-4
