Abstract
Huge amounts of data are being collected and analyzed nowadays. When popular rule-learning algorithms are applied to such big datasets, the number of discovered rules can easily exceed several thousand. To produce compact and accurate classifiers, these rules have to be grouped and pruned, so that only a reasonable number of them are presented to the end user for inspection and further analysis. To solve this problem, researchers have proposed several associative classification approaches that combine two important data mining techniques, namely classification and association rule mining.
In this paper, we propose a new method that reduces the number of class association rules produced by classical class association rule classifiers, while maintaining a classification model whose accuracy is comparable to those generated by state-of-the-art classification algorithms. More precisely, we propose a new associative classifier, CMAC, which uses agglomerative hierarchical clustering as a post-processing step to reduce the number of its rules.
Experimental results on selected datasets from the UCI ML repository show that CMAC is able to learn classifiers containing significantly fewer rules than state-of-the-art rule learning algorithms on datasets with a larger number of examples. On the other hand, the classification accuracy of the CMAC classifier is not significantly different from that of state-of-the-art rule learners on most of the datasets.
We can thus conclude that CMAC is able to learn compact (and meaningful) classifiers from “bigger” datasets, retaining an accuracy comparable to state-of-the-art rule learning algorithms.
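The abstract does not spell out how the rules are clustered or how cluster representatives are chosen, but the general idea of the post-processing step can be sketched as follows. This is a minimal illustration, not CMAC itself: the toy rules, the Jaccard distance over rule antecedents, single linkage, and picking the highest-confidence rule per cluster are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical toy rules: (antecedent item set, class label, confidence).
rules = [
    (frozenset({"a", "b"}), "yes", 0.95),
    (frozenset({"a", "b", "c"}), "yes", 0.90),
    (frozenset({"d"}), "no", 0.85),
    (frozenset({"d", "e"}), "no", 0.80),
]

def jaccard_distance(r1, r2):
    """Distance between two rule antecedents (assumed metric)."""
    union = len(r1 | r2)
    return 1.0 - len(r1 & r2) / union if union else 0.0

def agglomerative_cluster(rules, k):
    """Single-linkage agglomerative clustering of rules down to k clusters."""
    clusters = [[i] for i in range(len(rules))]
    while len(clusters) > k:
        # Find the pair of clusters with the smallest single-linkage distance.
        best = None
        for (i, c1), (j, c2) in combinations(enumerate(clusters), 2):
            d = min(jaccard_distance(rules[a][0], rules[b][0])
                    for a in c1 for b in c2)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def representatives(rules, clusters):
    """Keep the highest-confidence rule from each cluster."""
    return [max((rules[i] for i in c), key=lambda r: r[2]) for c in clusters]

clusters = agglomerative_cluster(rules, k=2)
compact = representatives(rules, clusters)
print(len(compact))  # 2 representative rules instead of the original 4
```

The merging loop is the textbook agglomerative procedure; a production implementation would typically use an optimized library routine (e.g. hierarchical clustering from SciPy) rather than the quadratic pairwise search shown here.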
Change history
30 March 2021
The original version of chapter 2 was inadvertently published with wrong RTS values in Table 3: “Results comparison with RTS, S, and SVMlight with standard linear loss with a 10-fold cross validation procedure.”
The RTS values were corrected by replacing the wrong values with the appropriate ones.
A footnote reading “1Code is available at: https://osf.io/fbzsc/” has been added to the last sentence of the abstract.
The original version of chapter 34 was inadvertently published with incorrect assignments of authors to affiliations; in addition, one affiliation was entirely missing.
The affiliations have been corrected and read as follows: 1University of Primorska, Koper, Slovenia; 2Jožef Stefan Institute, Ljubljana, Slovenia; and 3Urgench State University, Urgench, Uzbekistan. The authors' affiliations are: Jamolbek Mattiev1,3 and Branko Kavšek1,2.
Acknowledgement
The authors gratefully acknowledge the European Commission for funding the InnoRenew CoE project (Grant Agreement #739574) under the Horizon 2020 Widespread-Teaming program and the Republic of Slovenia (investment funding of the Republic of Slovenia and the European Union from the European Regional Development Fund). Jamolbek Mattiev is also funded for his Ph.D. by the “El-Yurt-Umidi” foundation under the Cabinet of Ministers of the Republic of Uzbekistan.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Mattiev, J., Kavšek, B. (2020). CMAC: Clustering Class Association Rules to Form a Compact and Meaningful Associative Classifier. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2020. Lecture Notes in Computer Science(), vol 12565. Springer, Cham. https://doi.org/10.1007/978-3-030-64583-0_34
DOI: https://doi.org/10.1007/978-3-030-64583-0_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64582-3
Online ISBN: 978-3-030-64583-0
eBook Packages: Computer Science (R0)