Abstract
Rule induction (RI) is a well-known classification approach in data mining that extracts hidden patterns from instances in the form of rules. This paper proposes a logic-based rule induction (LBRI) classifier based on a switching function approach. LBRI generates binary rules using a novel minimization function that relies on simple yet powerful bitwise operations. First, LBRI generates instance codes by encoding the dataset with standard binary code. It then generates prime cubes (PCs) for all classes from the instance codes using the proposed reduced offset method. Finally, for each class, LBRI selects the most effective PCs and adds them to the binary rule set of that class. Each binary rule represents an If–Then rule of the kind used by rule induction classifiers. The proposed LBRI classifier is based on basic logic functions. It is a simple and effective method that can be used by intelligent systems to solve real-life classification and prediction problems in areas such as health care, online/financial banking, image/voice recognition, and bioinformatics. The performance of the proposed algorithm is compared with that of six rule induction algorithms (decision table, Ripper, C4.5, REPTree, OneR, and ICRM) on nineteen datasets. The experimental results show that the proposed algorithm yields better classification accuracy than the other rule induction algorithms on ten of the nineteen datasets.
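As a rough illustration of the encoding and bitwise matching steps mentioned in the abstract, the Python sketch below packs a categorical instance into a single integer using standard binary codes and tests whether a cube covers it. It is not the authors' LBRI implementation: the reduced offset method and the selection of the most effective PCs are omitted. The toy attributes (`outlook`, `windy`), the `(mask, value)` cube representation, and the function names are all assumptions made for illustration.

```python
# Hypothetical toy domains: one list of possible values per attribute.
DOMAINS = [
    ["sunny", "overcast", "rain"],  # assumed attribute "outlook"
    ["false", "true"],              # assumed attribute "windy"
]


def code_width(domain):
    """Number of bits needed to give every value in the domain a distinct binary code."""
    return max(1, (len(domain) - 1).bit_length())


def encode_instance(instance, domains):
    """Concatenate the per-attribute binary codes into one integer instance code."""
    code = 0
    for value, domain in zip(instance, domains):
        code = (code << code_width(domain)) | domain.index(value)
    return code


def covers(cube, code):
    """A cube is a (mask, value) pair; it covers a code if all masked bits agree."""
    mask, value = cube
    return (code & mask) == value


# Assumed cube for the rule "If windy = true Then ...": only the last bit matters.
WINDY_TRUE = (0b001, 0b001)
# Assumed cube for the rule "If outlook = rain Then ...": only the two leading bits matter.
OUTLOOK_RAIN = (0b110, 0b100)

rain_windy = encode_instance(("rain", "true"), DOMAINS)
sunny_calm = encode_instance(("sunny", "false"), DOMAINS)
```

Under these assumptions, each cube plays the role of one binary rule: the mask marks the bits the rule constrains, and a single AND plus a comparison decides whether an instance satisfies the rule's condition.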
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not refer to studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Ibrahim, M.H., Hacibeyoglu, M. A novel switching function approach for data mining classification problems. Soft Comput 24, 4941–4957 (2020). https://doi.org/10.1007/s00500-019-04246-2
DOI: https://doi.org/10.1007/s00500-019-04246-2