Structural risk minimization of rough set-based classifier

Liu, Jinfu; Bai, Mingliang; Jiang, Na; Yu, Daren

doi:10.1007/s00500-019-04038-8

Methodologies and Application
Published: 13 May 2019

Volume 24, pages 2049–2066, (2020)

Soft Computing

Abstract

Generalization ability, that is, the ability to classify unseen objects, remains a long-standing challenge for rough set-based classifiers. Current research mainly focuses on introducing thresholds to tolerate some errors on seen objects, but both the rationale for introducing thresholds and the choice of threshold value still lack sufficient theoretical support. The structural risk minimization (SRM) inductive principle is one of the most effective theories for controlling generalization ability; it prescribes a trade-off between errors on seen objects and classifier complexity. This paper therefore introduces the SRM principle into rough set-based classifiers and proposes an SRM algorithm for rough set-based classifiers, called the SRM-R algorithm. The SRM-R algorithm uses the number of rules to characterize the actual complexity of a rough set-based classifier and obtains the optimal trade-off between errors on seen objects and complexity through genetic multi-objective optimization. Tenfold cross-validation experiments on 12 UCI datasets show that the SRM-R algorithm significantly improves generalization ability compared with the conventional threshold algorithm. In addition, this paper uses two other possible complexity metrics, the number of attributes and the attribute space, to construct corresponding SRM algorithms and compares their classification accuracy with that of the SRM-R algorithm. The comparison shows that the SRM-R algorithm achieves the best classification accuracy, which indicates that the number of rules characterizes complexity more effectively than the number of attributes or the attribute space. Further experiments show that the SRM-R algorithm obtains fewer rules with larger support coefficients, meaning that it extracts stronger rules. This explains, to some extent, why it achieves better generalization ability.
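The trade-off described in the abstract, between errors on seen objects and the number of rules, can be illustrated with a small sketch. The code below is not the SRM-R algorithm and does not reproduce the paper's genetic optimizer; it only shows, under assumed inputs, how candidate classifiers scored on those two objectives can be filtered to the non-dominated (Pareto-optimal) set that a multi-objective search would aim for. The `Candidate` type, the function names, and all numbers are hypothetical and chosen purely for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical candidate classifier summarised by two objectives."""
    name: str
    training_errors: int  # misclassified seen objects (empirical risk)
    num_rules: int        # complexity proxy: number of decision rules


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = (a.training_errors <= b.training_errors
                and a.num_rules <= b.num_rules)
    strictly_better = (a.training_errors < b.training_errors
                       or a.num_rules < b.num_rules)
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up values, not taken from the paper's experiments.
candidates = [
    Candidate("A", training_errors=0, num_rules=40),
    Candidate("B", training_errors=3, num_rules=12),
    Candidate("C", training_errors=5, num_rules=12),
    Candidate("D", training_errors=9, num_rules=5),
    Candidate("E", training_errors=4, num_rules=30),
]

front = pareto_front(candidates)
print([c.name for c in front])  # prints ['A', 'B', 'D']
```

In this toy data, C and E are discarded because B has both fewer errors and no more rules. Choosing a single classifier from the remaining front (for example by validation accuracy) is a separate step that this sketch leaves open.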

Acknowledgements

This work was supported by the National Key R&D Program of China (No. 2017YFB0902100) and the National Science and Technology Major Project of China (No. 2017-I-0007-0008). The authors would like to thank the anonymous reviewers for their careful reading of the paper and their valuable suggestions for refining this work.

Author information

Authors and Affiliations

School of Energy Science and Engineering, Harbin Institute of Technology, Harbin, 150001, Heilongjiang, China
Jinfu Liu, Mingliang Bai, Na Jiang & Daren Yu

Corresponding author

Correspondence to Jinfu Liu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest regarding the publication of this article.

Human and animal rights statement

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, J., Bai, M., Jiang, N. et al. Structural risk minimization of rough set-based classifier. Soft Comput 24, 2049–2066 (2020). https://doi.org/10.1007/s00500-019-04038-8

Published: 13 May 2019
Issue Date: February 2020
DOI: https://doi.org/10.1007/s00500-019-04038-8
