A three-layer back-propagation neural network for spam detection using artificial immune concentration

Ruan, Guangchen; Tan, Ying

doi:10.1007/s00500-009-0440-2

Published: 03 June 2009

Volume 14, pages 139–150, (2010)

Soft Computing



Abstract

In this paper, a three-layer back-propagation neural network (BPNN) is employed for spam detection using a concentration-based feature construction (CFC) approach. In the CFC approach, ‘self’ and ‘non-self’ concentrations are computed from ‘self’ and ‘non-self’ gene libraries, respectively, and form a two-element concentration vector that represents each e-mail compactly. A three-layer BPNN with a two-element input then classifies e-mails automatically. Comprehensive experiments on two public benchmark corpora, PU1 and Ling, demonstrate that the proposed CFC-based BPNN classifier is not only very fast but also achieves 97% and 99% classification accuracy on PU1 and Ling, respectively, using just a two-element concentration feature vector.
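The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the concentration formula, the gene library construction, or the network hyperparameters, so everything concrete below is an assumption made for illustration: concentration is taken to be the fraction of a message's distinct words found in a gene library, the two libraries are hand-picked word sets, and the network uses three sigmoid hidden units trained by plain stochastic gradient descent on squared error. The names `concentration_vector` and `ThreeLayerBPNN` are hypothetical.

```python
import math
import random


def concentration_vector(words, self_genes, nonself_genes):
    """Return (self, non-self) concentrations for one message.

    Assumption: concentration = share of the message's distinct words
    that occur in the corresponding gene library.
    """
    distinct = set(words)
    if not distinct:
        return (0.0, 0.0)
    self_conc = len(distinct & self_genes) / len(distinct)
    nonself_conc = len(distinct & nonself_genes) / len(distinct)
    return (self_conc, nonself_conc)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ThreeLayerBPNN:
    """2 inputs -> `hidden` sigmoid units -> 1 sigmoid output (spam score)."""

    def __init__(self, hidden=3, seed=0):
        rng = random.Random(seed)
        # Each hidden unit stores [w_self, w_nonself, bias].
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(3)]
                         for _ in range(hidden)]
        # Output unit stores one weight per hidden unit, then a bias.
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]

    def forward(self, x):
        h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in self.w_hidden]
        z = sum(wo * hj for wo, hj in zip(self.w_out, h)) + self.w_out[-1]
        return h, sigmoid(z)

    def train_step(self, x, target, lr=0.5):
        """One back-propagation update on the loss 0.5 * (y - target)^2."""
        h, y = self.forward(x)
        delta_out = (y - target) * y * (1.0 - y)
        for j, hj in enumerate(h):
            # Hidden delta uses the output weight before it is updated.
            delta_h = delta_out * self.w_out[j] * hj * (1.0 - hj)
            self.w_out[j] -= lr * delta_out * hj
            self.w_hidden[j][0] -= lr * delta_h * x[0]
            self.w_hidden[j][1] -= lr * delta_h * x[1]
            self.w_hidden[j][2] -= lr * delta_h
        self.w_out[-1] -= lr * delta_out
        return 0.5 * (y - target) ** 2


# Toy gene libraries and messages (illustrative only, not from PU1 or Ling).
self_genes = {"meeting", "report", "thanks"}
nonself_genes = {"free", "winner", "cash"}
ham = concentration_vector("thanks for the report".split(),
                           self_genes, nonself_genes)
spam = concentration_vector("free cash winner today".split(),
                            self_genes, nonself_genes)

net = ThreeLayerBPNN()
for _ in range(2000):
    net.train_step(ham, 0.0)   # label 0: legitimate e-mail
    net.train_step(spam, 1.0)  # label 1: spam

ham_score = net.forward(ham)[1]
spam_score = net.forward(spam)[1]
```

Because each message is reduced to two numbers before it reaches the network, the input layer stays at two units regardless of vocabulary size, which is consistent with the speed claim in the abstract.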


Notes

The PU1 corpus and Ling corpus may be downloaded from http://www.cil.pku.edu.cn/resources/.


Acknowledgments

This work was supported by the National High Technology Research and Development Program of China (863 Program) under grant number 2007AA01Z453, and partially supported by the National Natural Science Foundation of China (NSFC) under grant numbers 60673020 and 60875080. The authors thank the editor and the three anonymous referees for their insightful comments and suggestions, which greatly helped to improve the quality and presentation of this paper.

Author information

Authors and Affiliations

Key Laboratory of Machine Perception (MOE), Department of Machine Intelligence, School of Electronics Engineering and Computer Science, Peking University, 100871, Beijing, China
Guangchen Ruan & Ying Tan


Corresponding author

Correspondence to Ying Tan.


About this article

Cite this article

Ruan, G., Tan, Y. A three-layer back-propagation neural network for spam detection using artificial immune concentration. Soft Comput 14, 139–150 (2010). https://doi.org/10.1007/s00500-009-0440-2


Published: 03 June 2009
Issue Date: January 2010
DOI: https://doi.org/10.1007/s00500-009-0440-2
