Abstract
Test-cost-sensitive attribute reduction is an important component of data mining applications and plays a key role in cost-sensitive learning. Previous approaches to test-cost-sensitive attribute reduction focus mainly on homogeneous datasets. When heterogeneous datasets must be handled, these approaches convert nominal attributes directly to numerical attributes. In this paper, we introduce an adaptive neighborhood model for heterogeneous attributes and use it to address the test-cost-sensitive attribute reduction problem. In the adaptive neighborhood model, numerical attributes are handled with the classical covering neighborhood, and nominal attributes are handled with the overlap metric neighborhood. Compared with the previous approaches, the proposed model avoids placing objects with different values of a nominal attribute in the same neighborhood. The number of inconsistent objects in a neighborhood reflects the discriminating capability of an attribute subset. Using the adaptive neighborhood model, we construct a heuristic reduction algorithm based on inconsistent objects. The proposed algorithm is compared with the \(\lambda\)-weighted heuristic reduction algorithm, in which nominal attributes are normalized. Experimental results demonstrate that the proposed algorithm is more effective and of greater practical significance than the \(\lambda\)-weighted heuristic reduction algorithm.
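The neighborhood idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the formal definitions, so several things here are assumptions: a single radius `delta` for all numerical attributes, treating an object as inconsistent when its neighborhood contains an object with a different decision value, and the hypothetical names `in_neighborhood`, `neighborhood`, and `count_inconsistent`.

```python
def in_neighborhood(x, y, attrs, is_nominal, delta):
    """Return True if y lies in the neighborhood of x on every attribute in attrs."""
    for a in attrs:
        if is_nominal[a]:
            # Overlap metric (assumed form): distance is 0 only when values match.
            if x[a] != y[a]:
                return False
        elif abs(x[a] - y[a]) > delta:
            # Numerical attribute (assumed form): values must be within delta.
            return False
    return True


def neighborhood(i, data, attrs, is_nominal, delta):
    """Indices of all objects in the neighborhood of object i."""
    return [j for j, y in enumerate(data)
            if in_neighborhood(data[i], y, attrs, is_nominal, delta)]


def count_inconsistent(data, labels, attrs, is_nominal, delta):
    """Count objects whose neighborhood contains a different decision value.

    This notion of inconsistency is an assumption made for illustration.
    """
    return sum(
        any(labels[j] != labels[i]
            for j in neighborhood(i, data, attrs, is_nominal, delta))
        for i in range(len(data))
    )


# Toy heterogeneous data: attribute 0 is numerical, attribute 1 is nominal.
data = [(0.10, "red"), (0.12, "red"), (0.11, "blue"), (0.90, "blue")]
labels = ["yes", "no", "no", "no"]
is_nominal = [False, True]
delta = 0.05

numeric_only = count_inconsistent(data, labels, [0], is_nominal, delta)
both = count_inconsistent(data, labels, [0, 1], is_nominal, delta)
print(numeric_only, both)
```

In this toy data, adding the nominal attribute separates the "blue" object from the two "red" objects that are numerically close to it, so fewer objects are counted as inconsistent. A heuristic reduction algorithm could use such a count, together with test costs, to rank candidate attributes, but the abstract does not specify how the two are combined.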
Acknowledgments
This work is supported in part by the National Science Foundation of China under Grant Nos. 61379049, 61379089 and 61170128, and by the Key Project of Education Department of Fujian Province under Grant Nos. JA13192 and JA14194.
Additional information
Communicated by V. Loia.
Cite this article
Fan, A., Zhao, H. & Zhu, W. Test-cost-sensitive attribute reduction on heterogeneous data for adaptive neighborhood model. Soft Comput 20, 4813–4824 (2016). https://doi.org/10.1007/s00500-015-1770-x
DOI: https://doi.org/10.1007/s00500-015-1770-x