
Test-cost-sensitive attribute reduction on heterogeneous data for adaptive neighborhood model

  • Methodologies and Application
Soft Computing

Abstract

Test-cost-sensitive attribute reduction is an important component of data mining applications and plays a key role in cost-sensitive learning. Some previous approaches to test-cost-sensitive attribute reduction focus mainly on homogeneous datasets. When heterogeneous datasets must be handled, these approaches convert nominal attributes directly to numerical attributes. In this paper, we introduce an adaptive neighborhood model for heterogeneous attributes and address the test-cost-sensitive attribute reduction problem. In the adaptive neighborhood model, objects with numerical attributes are handled with the classical covering neighborhood, and objects with nominal attributes are handled with the overlap metric neighborhood. Compared with the previous approaches, the proposed model prevents objects with different values of a nominal attribute from being placed in the same neighborhood. The number of inconsistent objects in a neighborhood reflects the discriminating capability of an attribute subset. Using the adaptive neighborhood model, we construct a heuristic reduction algorithm based on inconsistent objects. The proposed algorithm is compared with the \(\lambda \)-weighted heuristic reduction algorithm, in which nominal attributes are normalized. Experimental results demonstrate that the proposed algorithm is more effective and of greater practical significance than the \(\lambda \)-weighted heuristic reduction algorithm.
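The following Python sketch illustrates the idea of mixing two neighborhood criteria and counting inconsistent objects. It is not the authors' algorithm: the abstract gives no formulas, so the threshold `delta`, the rule that nominal values must match exactly (overlap distance 0), the definition of an inconsistent object as one whose neighborhood contains an object of a different class, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence, Set, Tuple, Union

Value = Union[float, str]
Row = Tuple[Value, ...]


def in_neighborhood(x: Row, y: Row, attrs: Sequence[int],
                    numeric: Set[int], delta: float) -> bool:
    """Return True if y is a neighbor of x on every attribute in attrs.

    Assumed rules (hypothetical, not taken from the paper):
      - numerical attribute: |x[a] - y[a]| <= delta
      - nominal attribute: values must be equal (overlap distance 0)
    """
    for a in attrs:
        if a in numeric:
            if abs(x[a] - y[a]) > delta:
                return False
        elif x[a] != y[a]:
            return False
    return True


def count_inconsistent(data: Sequence[Row], labels: Sequence[str],
                       attrs: Sequence[int], numeric: Set[int],
                       delta: float) -> int:
    """Count objects whose neighborhood holds an object of another class."""
    count = 0
    for i, x in enumerate(data):
        for j, y in enumerate(data):
            if labels[i] != labels[j] and in_neighborhood(
                    x, y, attrs, numeric, delta):
                count += 1
                break  # object i is inconsistent; move to the next object
    return count


# Toy decision table: attribute 0 is numerical, attribute 1 is nominal.
data = [(0.10, "red"), (0.12, "red"), (0.11, "blue"), (0.90, "blue")]
labels = ["yes", "yes", "no", "no"]
numeric = {0}

print(count_inconsistent(data, labels, [0], numeric, 0.05))     # numerical only
print(count_inconsistent(data, labels, [0, 1], numeric, 0.05))  # both attributes
```

On this toy table, the numerical attribute alone leaves the first three objects in one mixed-class neighborhood, while adding the nominal attribute separates them. A heuristic reduction would presumably weigh such a drop in inconsistent objects against each attribute's test cost, but that weighting is not specified in the abstract and is not modeled here.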



Acknowledgments

This work is supported in part by the National Science Foundation of China under Grant Nos. 61379049, 61379089 and 61170128, and by the Key Project of Education Department of Fujian Province under Grant Nos. JA13192 and JA14194.


Corresponding author

Correspondence to Hong Zhao.

Additional information

Communicated by V. Loia.


Cite this article

Fan, A., Zhao, H. & Zhu, W. Test-cost-sensitive attribute reduction on heterogeneous data for adaptive neighborhood model. Soft Comput 20, 4813–4824 (2016). https://doi.org/10.1007/s00500-015-1770-x


  • DOI: https://doi.org/10.1007/s00500-015-1770-x
