Abstract
Spam emails are unwanted and can carry dangerous spyware and viruses. Because unsolicited email adapts continually, there is an urgent need to identify it reliably. Various machine learning-based approaches have been proposed to detect spam, aiming to minimize unwanted emails and to predict spam accurately. These systems address the different kinds of email spam that degrade the system. However, the performance of conventional models still needs improvement, so this paper implements an email spam detection model for both image and text datasets. The main contribution is the development of multi-objective feature selection and an adaptive capsule network for email spam detection. For text datasets, two feature extraction techniques, Term Variance (TV) and Term Frequency-Inverse Document Frequency (TF-IDF), are used, whereas Fisher Discriminant Analysis (FDA), Walsh-Hadamard Transform (WHT), and the color correlogram are used as feature extraction techniques for image datasets. Because the resulting feature vectors are long, multi-objective feature selection is performed with the hybrid meta-heuristic Grey-Sail Fish Optimization (G-SFO) algorithm to reduce training complexity. Further, a novel adaptive capsule network, improved by the proposed G-SFO algorithm, is used for email spam detection. The suggested model is evaluated against existing approaches and shows better spam detection accuracy.
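The two text feature extractors named in the abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization, the textbook unsmoothed weighting idf = log(N / df), and term variance taken as the population variance of a term's raw count across documents; the paper's exact formulations may differ, and the function names are hypothetical.

```python
import math
from collections import Counter


def term_frequencies(docs):
    """Return the sorted vocabulary and a per-document raw term-count matrix."""
    tokenized = [doc.lower().split() for doc in docs]  # assumed tokenizer
    vocab = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]
    matrix = [[c[t] for t in vocab] for c in counts]
    return vocab, matrix


def tf_idf(matrix):
    """Weight raw counts by idf = log(N / df) (assumed unsmoothed form)."""
    n_docs = len(matrix)
    n_terms = len(matrix[0])
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) for d in df]
    return [[row[j] * idf[j] for j in range(n_terms)] for row in matrix]


def term_variance(matrix):
    """Score each term by the variance of its count across documents."""
    n_docs = len(matrix)
    scores = []
    for j in range(len(matrix[0])):
        column = [row[j] for row in matrix]
        mean = sum(column) / n_docs
        scores.append(sum((x - mean) ** 2 for x in column) / n_docs)
    return scores


# Toy corpus, invented for illustration only.
docs = ["free money now", "meeting at noon", "free meeting"]
vocab, matrix = term_frequencies(docs)
weights = tf_idf(matrix)
variances = term_variance(matrix)
```

Terms that occur in every document receive a TF-IDF weight of zero, while terms with uneven counts across documents receive higher variance scores, which is why both measures are plausible inputs to a later feature selection step.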
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs41315-021-00217-9/MediaObjects/41315_2021_217_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs41315-021-00217-9/MediaObjects/41315_2021_217_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs41315-021-00217-9/MediaObjects/41315_2021_217_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs41315-021-00217-9/MediaObjects/41315_2021_217_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs41315-021-00217-9/MediaObjects/41315_2021_217_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs41315-021-00217-9/MediaObjects/41315_2021_217_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs41315-021-00217-9/MediaObjects/41315_2021_217_Fig7_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs41315-021-00217-9/MediaObjects/41315_2021_217_Fig8_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs41315-021-00217-9/MediaObjects/41315_2021_217_Fig9_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs41315-021-00217-9/MediaObjects/41315_2021_217_Fig10_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs41315-021-00217-9/MediaObjects/41315_2021_217_Fig11_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs41315-021-00217-9/MediaObjects/41315_2021_217_Fig12_HTML.png)
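The Walsh-Hadamard Transform that the abstract lists among the image feature extractors can be sketched as follows. This is an unnormalized fast transform in natural (Hadamard) order over a one-dimensional sequence; how the paper applies it to images (block size, ordering, normalization) is not stated in the abstract, so the function name `fwht` and the flattened pixel row used here are assumptions.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine elements that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


# Hypothetical flattened 2x2 image block (pixel intensities).
block = [1, 0, 1, 0]
coefficients = fwht(block)
# Applying the transform twice returns the input scaled by its length.
restored = [c // len(block) for c in fwht(coefficients)]
```

The coefficients could serve as a fixed-length feature vector for a block; the transform needs only additions and subtractions, which keeps it cheap compared with a Fourier transform.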
Abbreviations
- E-mail: Electronic mail
- GA: Genetic algorithm
- RF: Random forest
- NN: Neural networks
- NSA: Negative selection algorithm
- WHT: Walsh-Hadamard transform
- WOA: Whale optimization algorithm
- GMDH: Group method of data handling
- MLP: Multi-layer perceptron
- SVM: Support vector machine
- PSO: Particle swarm optimization
- FDA: Fisher discriminant analysis
- FNR: False negative rate
- QoS: Quality of service
- LOF: Local outlier factor
- NB: Naïve Bayes
- MSSCA: Multi-Split Spam Corpus Algorithm
- FPR: False positive rate
- DE: Differential evolution
- DT: Decision tree
- MCC: Matthews correlation coefficient
- FDR: False discovery rate
- NPV: Negative predictive value
- MAMH: MAS as Metaheuristic
- BMAMH: Binary MAMH
- TF-IDF: Term frequency-inverse document frequency
- VSA: Vortex search algorithm
Cite this article
Samarthrao, K.V., Rohokale, V.M. A hybrid meta-heuristic-based multi-objective feature selection with adaptive capsule network for automated email spam detection. Int J Intell Robot Appl 6, 497–521 (2022). https://doi.org/10.1007/s41315-021-00217-9
DOI: https://doi.org/10.1007/s41315-021-00217-9