
CMAC: Clustering Class Association Rules to Form a Compact and Meaningful Associative Classifier

  • Conference paper
  • First Online:
Machine Learning, Optimization, and Data Science (LOD 2020)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 12565)

Abstract

Huge amounts of data are collected and analyzed nowadays. When popular rule-learning algorithms are applied to such big datasets, the number of discovered rules can easily exceed several thousand. To produce compact and accurate classifiers, these rules have to be grouped and pruned, so that only a manageable number of them is presented to the end user for inspection and further analysis. To address this problem, researchers have proposed several associative classification approaches that combine two important data mining techniques, namely classification and association rule mining.
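The class association rule setting can be illustrated with a minimal Apriori-style sketch. This is not the exact mining procedure used in the paper; it simply enumerates candidate itemsets and keeps rules of the form itemset → class that meet minimum support and confidence thresholds. All names and the toy dataset are illustrative.

```python
from itertools import combinations

def mine_cars(transactions, classes, min_support=0.4, min_confidence=0.6, max_len=2):
    """Enumerate class association rules (itemset -> class label) that
    meet the minimum support and confidence thresholds."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    rules = []
    for k in range(1, max_len + 1):
        for itemset in combinations(items, k):
            # Class labels of all transactions covered by this itemset.
            covered = [c for t, c in zip(transactions, classes)
                       if set(itemset) <= t]
            support = len(covered) / n
            if support < min_support:
                continue
            for label in set(classes):
                hits = covered.count(label)
                if hits and hits / len(covered) >= min_confidence:
                    rules.append((itemset, label, support, hits / len(covered)))
    return rules

# Toy dataset: each transaction is a set of attribute=value items.
data = [{"outlook=sunny", "windy=no"},
        {"outlook=sunny", "windy=yes"},
        {"outlook=rain",  "windy=no"},
        {"outlook=rain",  "windy=yes"}]
labels = ["play", "play", "play", "stay"]
cars = mine_cars(data, labels, min_support=0.25, min_confidence=0.9)
```

Even on this four-example toy dataset the miner already produces six rules, which hints at how quickly rule counts grow on real datasets.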

In this paper, we propose a new method that reduces the number of class association rules produced by classical class association rule classifiers, while maintaining a classification model whose accuracy is comparable to those generated by state-of-the-art classification algorithms. More precisely, we propose a new associative classifier, CMAC, which uses agglomerative hierarchical clustering as a post-processing step to reduce the number of its rules.
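The clustering-based post-processing step can be sketched as follows. This is not CMAC's exact algorithm: it assumes a Jaccard distance between rule antecedents, plain complete-linkage agglomerative clustering, and the highest-confidence rule as each cluster's representative, all chosen here purely for illustration.

```python
def jaccard_distance(a, b):
    """1 - |A∩B| / |A∪B| between two rule antecedents (item sets)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def agglomerate(antecedents, num_clusters):
    """Complete-linkage agglomerative clustering of rule antecedents,
    merging the closest pair of clusters until num_clusters remain."""
    clusters = [[i] for i in range(len(antecedents))]
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the farthest members.
                d = max(jaccard_distance(antecedents[a], antecedents[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters

# Hypothetical mined rules: (antecedent items, class label, confidence).
rules = [
    (frozenset({"a=1", "b=1"}), "pos", 0.9),
    (frozenset({"a=1", "b=1", "c=0"}), "pos", 0.8),
    (frozenset({"x=5"}), "neg", 0.95),
]
groups = agglomerate([r[0] for r in rules], num_clusters=2)
# Keep one representative rule per cluster: the most confident one.
compact = [max((rules[i] for i in c), key=lambda r: r[2]) for c in groups]
```

The two near-duplicate "pos" rules collapse into one cluster, so the compact classifier keeps only two rules instead of three; on real rule sets this is where the bulk of the reduction comes from.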

Experimental results on selected datasets from the UCI ML repository show that CMAC learns classifiers containing significantly fewer rules than state-of-the-art rule learning algorithms on datasets with a larger number of examples. At the same time, the classification accuracy of CMAC is not significantly different from that of state-of-the-art rule learners on most of the datasets.

We can thus conclude that CMAC is able to learn compact (and meaningful) classifiers from “bigger” datasets, retaining an accuracy comparable to state-of-the-art rule learning algorithms.


Change history

  • 30 March 2021

    The original version of chapter 2 was inadvertently published with wrong RTS values in Table 3: “Results comparison with RTS, S, and SVMlight with standard linear loss with a 10-fold cross validation procedure.”

    The RTS values were corrected by replacing the wrong values with the appropriate ones.

    A footnote reading “1Code is available at: https://osf.io/fbzsc/” was added to the last sentence of the abstract.

    The original version of chapter 34 was inadvertently published with incorrect assignments of authors to affiliations, and one affiliation was missing entirely.

    The affiliations have been corrected and read as follows: 1University of Primorska, Koper, Slovenia; 2Jožef Stefan Institute, Ljubljana, Slovenia; and 3Urgench State University, Urgench, Uzbekistan. The authors' affiliations are: Jamolbek Mattiev1,3 and Branko Kavšek1,2.


Acknowledgement

The authors gratefully acknowledge the European Commission for funding the InnoRenew CoE project (Grant Agreement #739574) under the Horizon2020 Widespread-Teaming program and the Republic of Slovenia (Investment funding of the Republic of Slovenia and the European Union of the European Regional Development Fund). Jamolbek Mattiev is also funded for his Ph.D. by the “El-Yurt-Umidi” foundation under the Cabinet of Ministers of the Republic of Uzbekistan.

Author information


Corresponding author

Correspondence to Branko Kavšek.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Mattiev, J., Kavšek, B. (2020). CMAC: Clustering Class Association Rules to Form a Compact and Meaningful Associative Classifier. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2020. Lecture Notes in Computer Science, vol. 12565. Springer, Cham. https://doi.org/10.1007/978-3-030-64583-0_34

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-64583-0_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64582-3

  • Online ISBN: 978-3-030-64583-0

  • eBook Packages: Computer Science (R0)
