Accelerating ReliefF using information granulation

Wei, Wei; Wang, Da; Liang, Jiye

doi:10.1007/s13042-021-01334-4

Original Article
Published: 28 April 2021

Volume 13, pages 29–38, (2022)

International Journal of Machine Learning and Cybernetics

Abstract

Feature selection is an essential preprocessing step in solving classification problems. In this respect, the Relief algorithm and its derivatives have proven to be a successful class of feature selectors. However, their computational cost is very high on large-scale datasets. To address this problem, we propose a fast ReliefF algorithm based on the information granulation of instances (IG-FReliefF). The algorithm uses K-means to granulate the dataset, selects the significant granules using criteria defined by information entropy and information granulation, and then evaluates each feature on the dataset composed of the selected granules. Extensive experiments show that the proposed algorithm is more efficient than existing representative algorithms, especially on large-scale datasets, while achieving almost the same classification performance as the compared algorithms.
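The three stages named in the abstract (granulate with K-means, select granules, score features on the reduced data) can be illustrated with a minimal sketch. It is not the authors' IG-FReliefF implementation: the abstract does not state the selection criteria, so the sketch assumes a simple stand-in rule that keeps granules whose class-label entropy exceeds a threshold. The function names (`kmeans`, `label_entropy`, `select_granules`, `relieff`, `ig_relieff_sketch`), the `threshold` parameter, the fallback to the full dataset, and the simplified ReliefF that visits every instance are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per instance."""
    rng = random.Random(seed)
    centers = rng.sample(X, k)
    assign = [0] * len(X)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(x, centers[c])) for x in X]
        for c in range(k):
            members = [x for x, a in zip(X, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def label_entropy(labels):
    """Shannon entropy (in bits) of the class labels inside one granule."""
    n = len(labels)
    return -sum((m / n) * math.log2(m / n) for m in Counter(labels).values())


def select_granules(X, y, k, threshold=0.0, seed=0):
    """ASSUMED criterion: keep granules whose label entropy exceeds threshold."""
    assign = kmeans(X, k, seed=seed)
    keep = []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if idx and label_entropy([y[i] for i in idx]) > threshold:
            keep.extend(idx)
    return sorted(keep)


def relieff(X, y, n_neighbors=1):
    """Simplified ReliefF: visits every instance, numeric features only."""
    n, d = len(X), len(X[0])
    span = [(max(col) - min(col)) or 1.0 for col in zip(*X)]
    prior = {c: m / n for c, m in Counter(y).items()}

    def diff(a, b, f):
        return abs(a[f] - b[f]) / span[f]

    w = [0.0] * d
    for i in range(n):
        # Rank the other instances by normalized Manhattan distance to X[i].
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: sum(diff(X[i], X[j], f) for f in range(d)))
        for c in prior:
            near = [j for j in order if y[j] == c][:n_neighbors]
            if not near:
                continue
            # Nearest hits lower the weight; prior-weighted misses raise it.
            scale = -1.0 if c == y[i] else prior[c] / (1.0 - prior[y[i]])
            for f in range(d):
                w[f] += scale * sum(diff(X[i], X[j], f) for j in near) / (n * len(near))
    return w


def ig_relieff_sketch(X, y, k, threshold=0.0, seed=0):
    """Granulate, select granules, then score features on the reduced data."""
    keep = select_granules(X, y, k, threshold, seed)
    if not keep:  # assumption of this sketch: fall back to the full dataset
        keep = list(range(len(X)))
    return relieff([X[i] for i in keep], [y[i] for i in keep])
```

Calling `ig_relieff_sketch(X, y, k=3)` on a list of numeric feature vectors `X` and labels `y` returns one weight per feature, with larger weights indicating more relevant features. Any speed-up in such a scheme would come from running the neighbor search on fewer instances, so the choice of `k` and of the selection rule trades running time against how faithfully the weights match those computed on the full dataset.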

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61772323, 61976184, 61876103).

Author information

Authors and Affiliations

Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan, 030006, Shanxi, China
Wei Wei, Da Wang & Jiye Liang

Corresponding author

Correspondence to Jiye Liang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wei, W., Wang, D. & Liang, J. Accelerating ReliefF using information granulation. Int. J. Mach. Learn. & Cyber. 13, 29–38 (2022). https://doi.org/10.1007/s13042-021-01334-4

Received: 03 September 2020
Accepted: 13 April 2021
Published: 28 April 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s13042-021-01334-4
