Random Sampling Effects on e-Learners Cluster Sizes Using Clustering Algorithms

Al Fanah, Muna

doi:10.1007/978-3-030-52249-0_51

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1228)

Included in the following conference series: Science and Information Conference

Abstract

e-Learners are a growing audience and a target group for most higher education institutions, including universities and colleges. However, few studies focus on their progress and their interactions with various educational tools during the learning process. This study examines the assessment of e-learners in the Open University Learning Analytics dataset (UK). The paper deploys model-based clustering and the Bayesian Information Criterion (BIC) for k-means to support the classification (e.g. density-based clustering and k-means) and to define the cluster size. The study found that, during assessment, students may belong to two to three clusters. The novelty of the study is twofold. First, it deploys unsupervised learning (model-based clustering) to discover the number of clusters among online learners. Second, it tests the validity of the clusters by applying two algorithms, k-means and density-estimate clustering, to random splits of the training set at 40% and 60%. This form of semi-supervised learning (that is, supplying the number of clusters to the k-means and density-estimate clustering algorithms) was used to test the effect of the two algorithms on cluster size while fixing the sample number and the percentage used to split the dataset. Both algorithms, k-means and density-based clustering using k-means, have a similar effect on cluster sizes. The one exception is the cluster percentage produced by the density-estimation method when there are two clusters.
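As a rough illustration of the split-and-compare step described in the abstract, the sketch below fixes the number of clusters, draws random 40% and 60% samples, and compares the resulting cluster-size shares. It is not the study's code. The data are synthetic one-dimensional scores rather than the Open University data, only k-means is shown (no model-based clustering, BIC, or density-based clustering), and the names `kmeans_1d` and `cluster_shares` are hypothetical helpers introduced here.

```python
import random


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on 1-D data; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: place the initial centroids at evenly spaced quantiles.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break  # assignments are stable, so the centroids stopped moving
        centroids = updated
    return centroids, labels


def cluster_shares(values, k):
    """Fraction of the sample assigned to each cluster index."""
    _, labels = kmeans_1d(values, k)
    return [labels.count(j) / len(values) for j in range(k)]


rng = random.Random(0)
# Synthetic "assessment scores" for two made-up groups of learners (assumption).
scores = [rng.gauss(45, 8) for _ in range(600)] + [rng.gauss(80, 6) for _ in range(400)]

results = {}
for fraction in (0.4, 0.6):
    # Random sample without replacement at the given percentage of the data.
    sample = rng.sample(scores, round(fraction * len(scores)))
    results[fraction] = cluster_shares(sample, k=2)
    print(f"{int(fraction * 100)}% sample:", [round(s, 3) for s in results[fraction]])
```

If the cluster shares printed for the two sample sizes are close, the random split has had little effect on relative cluster size for this toy data, which is the kind of comparison the abstract describes.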

Author information

Authors and Affiliations

University of Bradford, Bradford, UK
Muna Al Fanah

Corresponding author

Correspondence to Muna Al Fanah.

Editor information

Editors and Affiliations

Saga University, Saga, Japan
Kohei Arai
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Supriya Kapoor
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Rahul Bhatia

About this paper

Cite this paper

Al Fanah, M. (2020). Random Sampling Effects on e-Learners Cluster Sizes Using Clustering Algorithms. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Computing. SAI 2020. Advances in Intelligent Systems and Computing, vol 1228. Springer, Cham. https://doi.org/10.1007/978-3-030-52249-0_51

DOI: https://doi.org/10.1007/978-3-030-52249-0_51
Published: 04 July 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-52248-3
Online ISBN: 978-3-030-52249-0
eBook Packages: Intelligent Technologies and Robotics; Intelligent Technologies and Robotics (R0)
