SVM Classifier Incorporating Feature Selection Using GA for Spam Detection

  • Conference paper
Embedded and Ubiquitous Computing – EUC 2005 (EUC 2005)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 3824))

Abstract

This paper investigates the use of support vector machines (SVMs) to classify e-mail as spam or non-spam, with feature selection performed by a genetic algorithm (GA). The GA selects the features most favorable to the SVM classifier; the resulting method is named GA-SVM. A scaling factor measures the relevance of each feature to the classification task and is estimated by the GA. A heavy-bias operator is introduced into the GA to promote sparsity in the feature scaling factors, so that feature selection is performed by eliminating irrelevant features whose scaling factor is zero. Experimental results on the UCI Spam database show that, compared with the original SVM classifier, GA-SVM uses fewer support vectors while achieving better classification results.
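
The scaling-factor idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the heavy-bias operator, so `zero_biased_mutation` below is a hypothetical stand-in that simply pushes mutated genes to exactly zero with a chosen probability. The function names, the parameters `p_mutate` and `p_zero`, and the toy data are all assumptions made for illustration. The SVM training step that would score each scaling vector is omitted.

```python
import random


def apply_scaling(samples, scale):
    """Scale each feature by its factor and drop features whose factor is zero.

    Returns the reduced, scaled samples and the indices of the kept features.
    """
    kept = [j for j, s in enumerate(scale) if s != 0.0]
    scaled = [[row[j] * scale[j] for j in kept] for row in samples]
    return scaled, kept


def zero_biased_mutation(scale, p_mutate=0.1, p_zero=0.5, rng=random):
    """Hypothetical stand-in for a sparsity-promoting mutation (an assumption).

    Each gene mutates with probability p_mutate. A mutated gene is set to
    exactly 0.0 with probability p_zero, otherwise resampled uniformly
    from [0, 1).
    """
    child = []
    for s in scale:
        if rng.random() < p_mutate:
            s = 0.0 if rng.random() < p_zero else rng.random()
        child.append(s)
    return child


if __name__ == "__main__":
    # Toy data: two samples, three features (illustrative values only).
    X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    scale = [0.5, 0.0, 2.0]  # the second feature is judged irrelevant
    scaled, kept = apply_scaling(X, scale)
    print(kept)    # [0, 2]
    print(scaled)  # [[0.5, 6.0], [2.0, 12.0]]
```

In a full GA-SVM loop, each candidate scaling vector would presumably be scored by training an SVM on the scaled features, and the fittest vectors would be recombined and mutated. The features whose factor ends at zero would then be the ones eliminated.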


Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, Hb., Yu, Y., Liu, Z. (2005). SVM Classifier Incorporating Feature Selection Using GA for Spam Detection. In: Yang, L.T., Amamiya, M., Liu, Z., Guo, M., Rammig, F.J. (eds) Embedded and Ubiquitous Computing – EUC 2005. EUC 2005. Lecture Notes in Computer Science, vol 3824. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596356_113

  • DOI: https://doi.org/10.1007/11596356_113

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30807-2

  • Online ISBN: 978-3-540-32295-5

  • eBook Packages: Computer Science, Computer Science (R0)
