A PSO algorithm for multi-objective cost-sensitive attribute reduction on numeric data with error ranges

  • Methodologies and Application
  • Published in Soft Computing

Abstract

Multi-objective cost-sensitive attribute reduction is an appealing problem in supervised machine learning. Most existing research has focused on single-objective minimal test cost reduction or has dealt only with symbolic data. In this paper, we propose a particle swarm optimization (PSO) algorithm for attribute reduction on numeric data with multiple costs and error ranges, and we use three metrics to evaluate the algorithm's performance. The proposed algorithm relies on a fitness function based on the positive region, the n selected types of test cost, a set of constant weights \(w_{i}^k\), and a designated non-positive exponent \(\lambda \). We design a learning strategy governed by dominance principles, which preserves Pareto-optimal solutions and rejects redundant ones. Under different parameter settings, our PSO algorithm searches for a sub-optimal reduct set. Finally, we test the algorithm on seven UCI (University of California, Irvine) datasets and analyze comparisons with alternative approaches, including the \(\lambda \)-weighted method and the exhaustive reduction method. Experimental results indicate that our heuristic algorithm outperforms existing algorithms.
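The two ingredients the abstract names, a dominance-based learning strategy that keeps Pareto-optimal solutions and discards dominated ones, and a fitness combining the positive region with weighted costs under a non-positive exponent \(\lambda \), can be sketched as follows. This is an illustrative sketch only, assuming cost objectives are minimized and strictly positive; the names `dominates`, `pareto_front`, and `fitness` are ours, and the paper's exact fitness definition should be taken from the full text.

```python
def dominates(a, b):
    """a dominates b if a is no worse on every objective and
    strictly better on at least one (minimization assumed)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(solutions):
    """Keep only non-dominated objective vectors; dominated
    (redundant) solutions are rejected, mirroring the learning
    strategy described in the abstract."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]

def fitness(pos_fraction, costs, weights, lam=-0.5):
    """Hypothetical lambda-weighted fitness: the positive-region
    fraction scaled by weighted costs raised to a non-positive
    exponent, so cheaper reducts with the same positive region
    score higher."""
    return pos_fraction * sum(w * c ** lam for w, c in zip(weights, costs))
```

Because \(\lambda \le 0\), larger test costs shrink the term \(c^{\lambda}\), steering the swarm toward reducts that keep the positive region while paying less.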



Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant No. 61379089, the Scientific Research Starting Project of SWPU under Grant No. 2014QHZ025, and the State Scholarship Fund of China under Grant No. 201508515156.

Author information


Corresponding author

Correspondence to Yu Fang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest regarding the publication of this article.

Additional information

Communicated by V. Loia.


About this article


Cite this article

Fang, Y., Liu, ZH. & Min, F. A PSO algorithm for multi-objective cost-sensitive attribute reduction on numeric data with error ranges. Soft Comput 21, 7173–7189 (2017). https://doi.org/10.1007/s00500-016-2260-5

