ABSTRACT
Quantitative associative classification (QAC) is favored for its strong explainability and satisfactory predictive ability. However, existing QAC methods cannot classify quantitative data directly; instead, they partition the quantitative data into a number of categories and then classify the resulting categorical data, which causes information loss. Here, we propose a new QAC method that directly estimates the probability of association rules using kernel mean embedding, thereby avoiding the biases induced by partitioning while achieving competitive classification accuracy. Specifically, we implement an Apriori-like association rule discovery process and estimate the posterior probability of the data by Bayes' rule, with each term computed via kernel mean embedding. To verify our method, we conduct experiments on several data sets, and the results demonstrate that the method performs better than SVM, decision tree, and the partitioning-based Apriori algorithm. We also test the effect of the method's hyperparameters, including minimal support, minimal confidence, and maximal attribute set size, and give an example of a quantitative association rule set.
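To make the posterior estimate concrete, the following is a minimal sketch of the general idea and not the paper's actual algorithm. It assumes a Gaussian kernel with a fixed bandwidth, and it evaluates each class's empirical kernel mean embedding at a query point to stand in for the class-conditional term of Bayes' rule, restricted to the attributes in a candidate rule. The names `gaussian_kernel`, `class_posterior`, `attrs`, and `bandwidth` are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def class_posterior(X_train, y_train, x_query, attrs, bandwidth=1.0):
    """Estimate P(class | query restricted to attrs) by Bayes' rule.

    Assumption: the class-conditional term is approximated by the empirical
    kernel mean embedding of the class, evaluated at the query point.
    """
    classes = np.unique(y_train)
    q = np.atleast_2d(x_query)[:, attrs]
    scores = []
    for c in classes:
        Xc = X_train[y_train == c][:, attrs]
        prior = len(Xc) / len(X_train)
        # Mean kernel value between the query and the class's training points.
        likelihood = gaussian_kernel(q, Xc, bandwidth).mean(axis=1)
        scores.append(prior * likelihood)
    scores = np.array(scores)  # shape: (n_classes, n_queries)
    # Normalize over classes (assumes at least one class has a nonzero score).
    posterior = scores / scores.sum(axis=0, keepdims=True)
    return classes, posterior.T  # posterior shape: (n_queries, n_classes)


if __name__ == "__main__":
    # Toy data: attribute 0 separates the classes, attribute 1 is noise.
    X = np.array([[0.0, 5.0], [0.1, -3.0], [0.2, 4.0],
                  [3.0, 5.0], [3.1, -3.0], [2.9, 4.0]])
    y = np.array([0, 0, 0, 1, 1, 1])
    classes, post = class_posterior(X, y, np.array([0.05, 0.0]), attrs=[0])
    print(classes, post)
```

In an Apriori-like search, a value of this kind could play the role of a rule's confidence for the attribute set `attrs` without first binning the numeric attributes; how the paper actually defines support and confidence is not specified in the abstract.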