Abstract
E-learners are a growing audience and a target group for most higher education institutions, including universities and colleges. However, few studies examine their progress and their interactions with the various educational tools used during the learning process. This study examines the assessment of e-learners, drawing on Open University Learning Analytics data in the UK. It applies model-based clustering with the Bayesian Information Criterion (BIC) to determine the number of clusters, and then uses k-means and density-based clustering to examine the cluster sizes. The study found that, during assessment, students might belong to two to three clusters. Its novelty is twofold. First, it uses unsupervised learning (model-based clustering) to discover the number of clusters among online learners. Second, it tests the validity of those clusters by applying two algorithms, k-means and density-estimate clustering, to random 40% and 60% splits of the training set. This semi-supervised step, in which the number of clusters found earlier is supplied to both algorithms, tests the effect of the two algorithms on cluster size when the sample number and the split percentage are fixed. Both algorithms have a similar effect on the cluster sizes. The one exception concerns the cluster percentages, and it occurs only with two clusters under the density-estimation method.
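The split-and-compare step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the scores are synthetic rather than drawn from the Open University data, only a plain 1-D k-means is shown (not the density-based method or the BIC model selection), the number of clusters is fixed at two, and the names `kmeans_1d` and `cluster_shares` are hypothetical helpers, not the author's code.

```python
import random

def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's k-means on 1-D data; returns (centroids, labels)."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]

    def assign(cs):
        # Each value is labelled with the index of its nearest centroid.
        return [min(range(k), key=lambda j: abs(v - cs[j])) for v in values]

    for _ in range(iters):
        labels = assign(centroids)
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(centroids)

def cluster_shares(labels, k):
    """Fraction of the sample that falls in each cluster."""
    return [labels.count(j) / len(labels) for j in range(k)]

# Synthetic assessment scores (assumption): two well-separated groups.
rng = random.Random(0)
K = 2  # number of clusters, assumed to come from an earlier model-selection step
scores = [rng.gauss(40, 8) for _ in range(600)] + [rng.gauss(80, 6) for _ in range(400)]
rng.shuffle(scores)

# Random 40% / 60% split of the same sample.
cut = int(0.4 * len(scores))
split_40, split_60 = scores[:cut], scores[cut:]

_, labels_40 = kmeans_1d(split_40, K)
_, labels_60 = kmeans_1d(split_60, K)
shares_40 = cluster_shares(labels_40, K)
shares_60 = cluster_shares(labels_60, K)

print("40% split cluster shares:", [round(s, 3) for s in shares_40])
print("60% split cluster shares:", [round(s, 3) for s in shares_60])
```

Comparing `shares_40` with `shares_60` gives a rough sense of whether the relative cluster sizes are stable under random splitting. Repeating the comparison with a second clustering algorithm would mirror the two-algorithm design the abstract describes.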
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Al Fanah, M. (2020). Random Sampling Effects on e-Learners Cluster Sizes Using Clustering Algorithms. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Computing. SAI 2020. Advances in Intelligent Systems and Computing, vol 1228. Springer, Cham. https://doi.org/10.1007/978-3-030-52249-0_51
DOI: https://doi.org/10.1007/978-3-030-52249-0_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-52248-3
Online ISBN: 978-3-030-52249-0
eBook Packages: Intelligent Technologies and Robotics (R0)