Abstract
Multi-objective cost-sensitive attribute reduction is an attractive problem in supervised machine learning. Most existing research has focused on single-objective minimal test cost reduction or has dealt only with symbolic data. In this paper, we propose a particle swarm optimization (PSO) algorithm for attribute reduction on numeric data with multiple cost types and error ranges, and we evaluate its performance with three metrics. The proposed algorithm employs a fitness function based on the positive region, the \(n\) selected test cost types, a set of constant weights \(w_{i}^k\), and a designated non-positive exponent \(\lambda\). We design a learning strategy governed by dominance principles, which preserves Pareto-optimal solutions and rejects redundant ones. Under different parameter settings, the PSO algorithm searches for a sub-optimal reduct set. Finally, we test the algorithm on seven UCI (University of California, Irvine) datasets and compare it with alternative approaches, including the \(\lambda\)-weighted method and an exhaustive reduction algorithm. Experimental results indicate that our heuristic algorithm outperforms the existing ones.
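The abstract names two mechanisms, a \(\lambda\)-weighted fitness function and a dominance-based learning strategy, without reproducing their formulas on this page. The minimal Python sketch below illustrates one plausible reading of both ideas; the function and variable names, and the exact form of the fitness (positive-region size scaled by the weighted total test cost raised to a non-positive exponent \(\lambda\)), are assumptions for illustration, not the paper's exact definitions.

```python
from itertools import compress
from typing import List, Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Pareto dominance for minimization: a dominates b if a is no worse
    on every cost objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fitness(pos_region_size: int,
            attr_costs: List[List[float]],  # attr_costs[k][j]: cost of type k for attribute j
            weights: List[float],           # one constant weight per cost type
            selected: List[bool],           # candidate reduct as a boolean mask over attributes
            lam: float = -0.5) -> float:
    """Hypothetical lambda-weighted fitness: reward a larger positive region
    and penalize the weighted sum of the n test cost types through the
    non-positive exponent lam."""
    assert lam <= 0, "the exponent lambda must be non-positive"
    total_cost = sum(w * sum(compress(costs, selected))
                     for w, costs in zip(weights, attr_costs))
    return pos_region_size * (1.0 + total_cost) ** lam

# Example: two cost types over four attributes; the candidate selects attributes 0 and 2.
costs = [[2.0, 5.0, 1.0, 3.0],   # e.g., money cost per attribute
         [1.0, 4.0, 2.0, 2.0]]   # e.g., time cost per attribute
print(fitness(pos_region_size=120, attr_costs=costs,
              weights=[0.7, 0.3], selected=[True, False, True, False]))
print(dominates([3.0, 2.0], [4.0, 2.0]))  # True: no worse everywhere, better on one
```

Under such a scheme, a swarm's archive would keep only particles whose cost vectors are not dominated by any other, which matches the abstract's claim of preserving Pareto-optimal solutions and rejecting redundant ones.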
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grant No. 61379089, the Scientific Research Starting Project of SWPU under Grant No. 2014QHZ025, and the State Scholarship Fund of China under Grant No. 201508515156.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest regarding the publication of this article.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Fang, Y., Liu, ZH. & Min, F. A PSO algorithm for multi-objective cost-sensitive attribute reduction on numeric data with error ranges. Soft Comput 21, 7173–7189 (2017). https://doi.org/10.1007/s00500-016-2260-5