Sensitivity analysis of feature weighting for classification

  • Theoretical Advances
  • Published in: Pattern Analysis and Applications (2022)

Abstract

Feature weighting is a well-known approach for improving the performance of machine learning algorithms that has recently gained considerable traction. It rescales the feature space so that learning algorithms can fit better models. Although weighting yields high performance in numerous applications, the sensitivity of machine learning algorithms to the weighting, which depends on their learning mechanisms, has not yet been explored. Such an analysis is essential for the practical use of weighting to boost performance. This work therefore presents an empirical assessment of how sensitive four popular machine learning algorithms are to changes in the feature space. The assessment quantifies the performance improvement of weighted features over unweighted ones and identifies the best learning algorithms. A wrapper approach based on the whale optimization algorithm searches for the best feature weights and classifier parameters. The experimental outcomes, interpreted in light of each classifier's learning mechanism, show high sensitivity for the k-NN, MLP, and RBF-kernel SVM classifiers, while the Naive Bayes classifier is the least sensitive. In practical terms, k-NN and SVM-RBF are the best choices for applications demanding accurate predictions, whereas Naive Bayes is the best choice for applications requiring minimal computation time.
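To make the wrapper approach concrete, the following Python sketch rescales the feature space with a candidate weight vector and scores it by k-NN cross-validation accuracy inside a simplified whale optimization loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the wine dataset, k = 5, the weight range [0, 1], and the population and iteration counts are illustrative choices, and the whale update follows Mirjalili and Lewis (2016) in outline only.

```python
# Minimal sketch (assumptions noted above) of wrapper-based feature weighting:
# a weight vector rescales the features, and its fitness is the classifier's
# cross-validation accuracy on the weighted space.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Normalize first: weights are applied on top of min-max scaled features.
X, y = load_wine(return_X_y=True)
X = MinMaxScaler().fit_transform(X)
n_features = X.shape[1]

def fitness(weights):
    """Wrapper objective: mean k-NN cross-validation accuracy on X * weights."""
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X * weights, y, cv=5).mean()

# Simplified whale optimization over weight vectors in [0, 1].
pop_size, max_iter = 20, 30
pop = rng.random((pop_size, n_features))
scores = np.array([fitness(w) for w in pop])
best = pop[scores.argmax()].copy()
best_score = scores.max()

for t in range(max_iter):
    a = 2.0 - 2.0 * t / max_iter            # control parameter shrinks from 2 to 0
    for i in range(pop_size):
        A = 2.0 * a * rng.random(n_features) - a
        C = 2.0 * rng.random(n_features)
        if rng.random() < 0.5:
            # Encircling prey: contract toward the best weight vector so far.
            pop[i] = best - A * np.abs(C * best - pop[i])
        else:
            # Spiral update: log-spiral path around the best solution.
            l = rng.uniform(-1.0, 1.0)
            D = np.abs(best - pop[i])
            pop[i] = D * np.exp(l) * np.cos(2.0 * np.pi * l) + best
        pop[i] = np.clip(pop[i], 0.0, 1.0)   # keep weights inside the search range
        s = fitness(pop[i])
        if s > best_score:
            best, best_score = pop[i].copy(), s

print(f"weighted k-NN accuracy:   {best_score:.3f}")
print(f"unweighted k-NN accuracy: {fitness(np.ones(n_features)):.3f}")
```

The gap between the last two printed scores is a miniature version of the sensitivity question the paper studies; in line with the abstract's finding that Naive Bayes is the least sensitive classifier, one would expect a much smaller gap after swapping KNeighborsClassifier for GaussianNB in the fitness function.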



Availability of data and materials

Data used in this study were obtained from public repositories. The source code is available at https://github.com/dalwindercheema/fwcomp.


Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information


Contributions

Conceptualization: DS; Methodology: DS and BS; Formal analysis and investigation: DS and BS; Writing—original draft preparation: DS; Writing—review and editing: BS.

Corresponding author

Correspondence to Dalwinder Singh.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Consent to participate

Not applicable.

Consent for publication

Both authors agree to submit the article to this journal.

Ethics approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Singh, D., Singh, B. Sensitivity analysis of feature weighting for classification. Pattern Anal Applic 25, 819–835 (2022). https://doi.org/10.1007/s10044-022-01077-0
