OFP_CLASS: a hybrid method to generate optimized fuzzy partitions for classification

  • Original Paper
  • Soft Computing

Abstract

The discretization of values plays a critical role in data mining and knowledge discovery. At certain levels of knowledge, representing information through intervals is more concise and easier to understand than representing it by means of continuous values. In this paper, we propose a method for discretizing continuous attributes by means of fuzzy sets, which constitute a fuzzy partition of the domains of these attributes. The method carries out the fuzzy discretization in two stages. In the first stage, a fuzzy decision tree proposes an initial set of crisp intervals; in the second, a genetic algorithm defines the membership functions and the cardinality of the partitions. After defining the fuzzy partitions, we evaluate them and compare them with those previously proposed in the literature.
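To make the idea of turning crisp intervals into a fuzzy partition concrete, the following minimal sketch builds overlapping trapezoidal fuzzy sets around a given list of cut points. It is an illustration only, not the method of the paper: the trapezoidal shape, the helper names `fuzzify_cuts` and `membership`, and the hand-fixed `overlap` parameter are all assumptions. In the approach described above, the cut points would come from the first stage and the shape of the membership functions would be chosen by the genetic algorithm rather than set by hand.

```python
from typing import List, Tuple

# (a, b, c, d): membership rises on [a, b], equals 1 on [b, c], falls on [c, d].
Trapezoid = Tuple[float, float, float, float]


def fuzzify_cuts(lo: float, hi: float, cuts: List[float], overlap: float) -> List[Trapezoid]:
    """Hypothetical helper: turn crisp cut points into overlapping trapezoids.

    Each interior cut c is widened into a transition zone [c - overlap, c + overlap].
    Assumes 2 * overlap does not exceed the distance between neighbouring edges.
    """
    edges = [lo] + sorted(cuts) + [hi]
    last = len(edges) - 2
    sets = []
    for i in range(len(edges) - 1):
        left, right = edges[i], edges[i + 1]
        # The outermost sets keep a vertical side at the domain boundary.
        a = left if i == 0 else left - overlap
        b = left if i == 0 else left + overlap
        c = right if i == last else right - overlap
        d = right if i == last else right + overlap
        sets.append((a, b, c, d))
    return sets


def membership(x: float, t: Trapezoid) -> float:
    """Degree to which x belongs to the trapezoidal fuzzy set t."""
    a, b, c, d = t
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


if __name__ == "__main__":
    # Assumed example: an attribute on [0, 10] with crisp cuts at 4 and 7.
    partition = fuzzify_cuts(0.0, 10.0, [4.0, 7.0], overlap=1.0)
    for x in (2.0, 4.0, 7.5):
        print(x, [membership(x, t) for t in partition])
```

With these assumed values, a point exactly on the cut at 4.0 belongs to the first two fuzzy sets with degree 0.5 each, and the membership degrees of any point in the domain sum to 1. That property is a consequence of the symmetric construction used in this sketch, not a claim about the partitions produced by the paper.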

Acknowledgments

Supported by projects TIN2008-06872-C04-03 and TIN2011-27696-C02-02 of the MICINN of Spain and the European Fund for Regional Development. Thanks also to “Fundación Séneca” (Spain) for the Funding Program for Research Groups of Excellence (04552/GERM/06) and for the support given to R. Martínez by the FPI scholarship program.

Author information

Corresponding author

Correspondence to Jose M. Cadenas.

About this article

Cite this article

Cadenas, J.M., Garrido, M.C., Martínez, R. et al. OFP_CLASS: a hybrid method to generate optimized fuzzy partitions for classification. Soft Comput 16, 667–682 (2012). https://doi.org/10.1007/s00500-011-0778-0

  • DOI: https://doi.org/10.1007/s00500-011-0778-0
