
Novel artificial bee colony based feature selection method for filtering redundant information


Abstract

Feature selection, which reduces the dimensionality of the feature space without sacrificing classifier performance, is an effective technique for text classification. Because many classifiers cannot cope with high-dimensional features, filtering redundant information out of the original feature space has become one of the core goals of the feature selection field. In this paper, the concept of the equivalence word set is introduced, and a set of equivalence word sets (denoted EWS1) is constructed using the rich semantic information of the Open Directory Project (ODP). On this basis, an artificial bee colony based feature selection method is proposed for filtering redundant information, and a feature subset FS is obtained using an optimal feature selection (OFS) method and two predetermined thresholds. To obtain the best values for these two thresholds, an improved memory-based artificial bee colony method (IABCM) is proposed. In the experiments, fuzzy support vector machine (FSVM) and Naïve Bayesian (NB) classifiers are used on six datasets: LingSpam, WebKB, SpamAssassin, 20-Newsgroups, Reuters21578, and TREC 2007. Experimental results verify that with both FSVM and NB, the proposed method is efficient and achieves better accuracy than several representative feature selection methods.
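As a rough illustration of the colony-search step described above, the sketch below runs a plain artificial bee colony over two threshold values in [0, 1]. It is a minimal sketch under stated assumptions, not the paper's method: the parameter settings, the [0, 1] threshold range, and the `fitness` surrogate (a smooth function standing in for classifier accuracy on the feature subset selected by the two thresholds) are all hypothetical, and the memory mechanism that distinguishes IABCM from the base algorithm is not reproduced here.

```python
# Minimal artificial bee colony (ABC) sketch for tuning two thresholds.
# Hypothetical stand-ins: the fitness surrogate, the [0, 1] range, and all
# parameter values. The paper's IABCM adds a memory mechanism not shown here;
# its real objective is classifier accuracy on the selected feature subset.
import numpy as np

rng = np.random.default_rng(0)
N_SOURCES, DIM, LIMIT, MAX_ITERS = 10, 2, 20, 100  # assumed settings
LOW, HIGH = 0.0, 1.0                               # assumed threshold range

def fitness(x):
    # Surrogate objective: in the paper this would be the accuracy obtained
    # with the feature subset induced by thresholds x[0] and x[1].
    return np.exp(-((x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2) / 0.02)

sources = rng.uniform(LOW, HIGH, (N_SOURCES, DIM))  # food sources = threshold pairs
fits = np.array([fitness(s) for s in sources])
trials = np.zeros(N_SOURCES, dtype=int)             # stagnation counters

def neighbor(i):
    # Standard ABC move: nudge one dimension of source i relative to a
    # randomly chosen partner source k.
    k = rng.choice([j for j in range(N_SOURCES) if j != i])
    d = rng.integers(DIM)
    cand = sources[i].copy()
    cand[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
    return np.clip(cand, LOW, HIGH)

def try_improve(i):
    # Greedy selection between a source and its neighbor candidate.
    cand = neighbor(i)
    f = fitness(cand)
    if f > fits[i]:
        sources[i], fits[i], trials[i] = cand, f, 0
    else:
        trials[i] += 1

for _ in range(MAX_ITERS):
    for i in range(N_SOURCES):          # employed bee phase
        try_improve(i)
    probs = fits / fits.sum()           # onlookers revisit fitter sources more often
    for _ in range(N_SOURCES):
        try_improve(rng.choice(N_SOURCES, p=probs))
    for i in range(N_SOURCES):          # scout phase: abandon exhausted sources
        if trials[i] > LIMIT:
            sources[i] = rng.uniform(LOW, HIGH, DIM)
            fits[i] = fitness(sources[i])
            trials[i] = 0

best = int(np.argmax(fits))
print("best thresholds:", sources[best], "surrogate fitness:", fits[best])
```

The scout phase is what keeps the search from stalling: any threshold pair that fails to improve for LIMIT consecutive attempts is abandoned and replaced with a fresh random candidate, trading a little exploitation for renewed exploration.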



Acknowledgments

This research was supported by the Beijing Natural Science Foundation under grant no. 4174105, the Joint Funds of the National Natural Science Foundation of China under grant no. U1509214, and the Discipline Construction Foundation of the Central University of Finance and Economics under grant no. 2016XX02.

Author information


Correspondence to Youwei Wang.


About this article


Cite this article

Wang, Y., Feng, L. & Zhu, J. Novel artificial bee colony based feature selection method for filtering redundant information. Appl Intell 48, 868–885 (2018). https://doi.org/10.1007/s10489-017-1010-4
