Skip to main content

Classification, Clustering and Association Rule Mining in Educational Datasets Using Data Mining Tools: A Case Study

  • Conference paper
  • First Online:
Cybernetics and Algorithms in Intelligent Systems (CSOC2018 2018)

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 765))

Included in the following conference series:

Abstract

Educational Data Mining is an emerging field in the data mining domain. In this competitive world scenario, the quality of education needs to improve. Unfortunately most of the students’ data are becoming data tombs for not analyzing the hidden knowledge. The educational data mining tries to uncover the hidden knowledge by discovering relationships between student learning characteristics and behavior. With this educational data modeling, the educators may plan for future learning pedagogy to support the student’s learning style. This knowledge may be applied by the academic planners to improve the quality of education and decrease the failure rate. In this paper, we had collected real dataset containing 666 instances with 11 attributes. The data is from the Common Entrance Examination (CEE) data of a particular year for admission to medical colleges of Assam, India conducted by Dibrugarh University. We tried to find out the association rules using the data. Various clustering and classification methods were also used to compare the suitable one for the dataset. The data mining tools applied in the educational data were Orange, Weka and R Studio.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 129.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 169.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Bhardwaj, B.K., Pal, S.: Data mining: a prediction for performance improvement using classification. Int. J. Comput. Sci. Inf. Secur. (IJCSIS) 9(4), 136–140 (2012)

    Google Scholar 

  2. Yadav, S.K., Pal, S.: Data mining: a prediction for performance improvement of engineering students using classification. World Comput. Sci. Inf. Technol. J. 2(2), 51–56 (2012)

    Google Scholar 

  3. Kukasvadiya, M.S., Divecha, N.H.: Analysis of Data Using Data Mining tool Orange. Int. J. Eng. Develop. Res. 5(2), 1836–1840 (2017)

    Google Scholar 

  4. DeFreitas, K., Bernard, M.: Comparative performance analysis of clustering techniques in educational data mining. IADIS Int. J. Comput. Sci. Inf. Syst. 10(2), 65–78 (2015)

    Google Scholar 

  5. Dutt, A., Aghabozrgi, S., Ismail, M.A.B., Mahroein, H.: Clustering algorithms applied in educational data mining. Int. J. Inf. Electron. Eng. 5(2), 112–116 (2015)

    Google Scholar 

  6. Nagy, H.M., Aly, W.M., Hegazy, O.F.: An educational data mining system for advising higher education students. Int. J. Comput. Inf. Eng. 7(10), 1226–1270 (2013)

    Google Scholar 

  7. Oyelade, O.J., Oladipupo, O.O., Obagbuwa, I.C.: Application of K-means clustering algorithm for prediction of students’ academic performance. Int. J. Comput. Sci. Inf. Secur. 7(1), 292–295 (2010)

    Google Scholar 

  8. Almarabeh, H.: Analysis of students’ performance by using different data mining classifiers. Int. J. Mod. Educ. Comput. Sci. 9(8), 9–15 (2017)

    Article  Google Scholar 

  9. Sivogolovko, E., Novikov, B.: Validating cluster structures in data mining tasks. In: Proceedings of the 2012 Joint EDBT/ICDT Workshops on - EDBT-ICDT 2012, p. 245. ACM, New York (2012)

    Google Scholar 

  10. Everitt, B.: Cluster Analysis. Wiley, Chichester (2011). ISBN 9780470749913

    Book  MATH  Google Scholar 

  11. Park, H.S., Jun, C.H.: A simple and fast algorithm for K-medoids clustering. Exp. Syst. Appl. 36(2), 3336–3341 (2009)

    Article  Google Scholar 

  12. Maulik, U., Bandyopadhyay, S.: Performance evaluation of some clustering algorithms and validity indices. IEEE Trans. Patt. Anal. Mach. Intel. 24(12), 1650–1654 (2002)

    Article  Google Scholar 

  13. Berkhin, P.P.: A Survey of Clustering Data Mining Techniques. Springer, Heidelberg (2006)

    Book  Google Scholar 

  14. Chang, W.L., Pang, L.M., Tay, K.M.: Application of self-organizing map to failure modes and effects analysis methodology. Neurocomputing (2017). https://doi.org/10.1016/j.neucom.2016.04.073

  15. Ahmed, A.B.E.D., Elaraby, I.S.: Data mining: a prediction for student’s performance using classification method. World J. Comput. Appl. Technol. 2(2), 43–47 (2014)

    Google Scholar 

  16. Pandey, U.K., Pal, S.: Data mining: a prediction of performer or underperformer using classification. Int. J. Comput. Sci. Inf. Technol. (IJCSIT) 2(2), 686–690 (2011)

    Google Scholar 

  17. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. Wiley-Interscience Publication, New York (2000)

    MATH  Google Scholar 

  18. Domingos, P., Pazzani, M.: On the optimality of the simple Bayesian classifier under zero-one loss. Mach. Learn. 29, 103–130 (1997)

    Article  MATH  Google Scholar 

  19. Jiawei, H., Micheline, K.: Data Mining: Concepts and Techniques. Elsevier Book Series (2000)

    Google Scholar 

  20. Rakesh, A., Ramakrishnan, S.: Fast algorithms for mining association rules. In: Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pp. 487–499 (1994)

    Google Scholar 

  21. Willmott, C.J., Matsuura, K.: Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim. Res. 30, 79–82 (2005)

    Article  Google Scholar 

  22. Powers, D.M.W.: Evaluation: from precision, recall and f-measure to roc., informedness, markedness & correlation. J. Mach. Learn. Technol. 2(1), 37–63 (2011)

    MathSciNet  Google Scholar 

  23. de Amorim, R.C., Hennig, C.: Recovering the number of clusters in data sets with noise features using feature rescaling factors. Inf. Sci. 324, 126–145 (2015). https://doi.org/10.1016/j.ins.2015.06.039

    Article  MathSciNet  Google Scholar 

  24. Borg, I., Groenen, P.: Modern Multidimensional Scaling: Theory and Applications, pp. 207–212, 2nd edn. Springer, New York (2005). ISBN 0-387-94845-7

    Google Scholar 

  25. Demšar, J., Curk, T., Erjavec, A., Gorup, Č., Hočevar, T., Milutinovič, M., Možina, M., Polajnar, M., Toplak, M., Starič, A., Stajdohar, M., Umek, L., Žagar, L., Žbontar, J., Žitnik, M., Zupan, B.: Orange: data mining toolbox in Python. JMLR. 14(1), 2349–2353 (2013)

    MATH  Google Scholar 

  26. Witten, I.H., Frank, E., Hall, M.A.: Data Mining: Practical Machine Learning Tools and Techniques, 3rd edn. Morgan Kaufmann, San Francisco (2011)

    Google Scholar 

  27. Verzani, J.: Getting Started with RStudio, p. 4. O’Reilly Media, Inc. (2011). ISBN 9781449309039

    Google Scholar 

  28. Sharma, A., Dey, S.: Performance investigation of feature selection methods and sentiment lexicons for sentiment analysis. IJCA 3, 15–20 (2012). Special Issue on Advanced Computing and Communication Technologies for HPC Applications ACCTHPCA

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Sadiq Hussain .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Hussain, S., Atallah, R., Kamsin, A., Hazarika, J. (2019). Classification, Clustering and Association Rule Mining in Educational Datasets Using Data Mining Tools: A Case Study. In: Silhavy, R. (eds) Cybernetics and Algorithms in Intelligent Systems . CSOC2018 2018. Advances in Intelligent Systems and Computing, vol 765. Springer, Cham. https://doi.org/10.1007/978-3-319-91192-2_21

Download citation

Publish with us

Policies and ethics