Abstract
As Internet technology develops, online activities such as remote work and online transactions are becoming more frequent. As a result, network security issues are increasingly prominent, the network security situation is more complex, and new attack methods keep emerging. Because spear-phishing is precisely targeted, persistent, well disguised, and highly damaging, it has become the most commonly used initial means for attackers and APT organizations to intrude into targets. Automated spear-phishing detection based on machine learning and deep learning has therefore become a focus of research in recent years. However, because spear-phishing attacks have a narrow scope and a low frequency, the number of available spear-phishing emails is very limited. How to detect spear-phishing with machine learning and deep learning from small samples has thus become a key issue. Few-shot learning, in turn, studies how to train a good classification model from only a few samples. We therefore propose a spear-phishing detection method based on few-shot learning that combines the basic features of emails with their message bodies. We use a simple word-embedding model to analyze the message body; it maps message bodies of different lengths to text feature vectors of the same dimension while retaining as much semantic information as possible. The text feature vectors are then combined with the basic features of the emails and fed into commonly used machine learning classifiers for detection. The simple word-embedding method does not require complex training to learn a large number of parameters, which reduces the model's dependence on large amounts of training data. The experimental results show that the proposed method achieves better performance than the existing spear-phishing detection method. In particular, the advantages of our detection method are more pronounced with small samples.
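The feature-construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the pooling operator, the embeddings, or the basic features, so the sketch assumes mean and max pooling over word vectors, random toy vectors in place of pretrained embeddings, and two hypothetical basic features (an attachment flag and a link count). The names `swem_features`, `email_features`, and `TOY_EMBEDDINGS` are illustrative only.

```python
import numpy as np

DIM = 8  # toy embedding size (assumption); real embeddings are usually larger

# Hypothetical vocabulary; random vectors stand in for pretrained embeddings.
_rng = np.random.default_rng(0)
_VOCAB = ["please", "verify", "your", "account", "password",
          "invoice", "attached", "urgent", "meeting", "tomorrow"]
TOY_EMBEDDINGS = {word: _rng.normal(size=DIM) for word in _VOCAB}


def swem_features(tokens, embeddings, dim):
    """Pool word vectors into one fixed-size vector of length 2 * dim.

    Mean and max pooling are an assumed choice; no parameters are trained.
    """
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        # No known words: fall back to an all-zero text vector.
        return np.zeros(2 * dim)
    stacked = np.stack(vectors)  # shape: (number of known words, dim)
    return np.concatenate([stacked.mean(axis=0), stacked.max(axis=0)])


def email_features(body, basic_features, embeddings, dim):
    """Concatenate the pooled text vector with the email's basic features."""
    tokens = body.lower().split()  # naive whitespace tokenizer (assumption)
    text_vec = swem_features(tokens, embeddings, dim)
    return np.concatenate([text_vec, np.asarray(basic_features, dtype=float)])


# Hypothetical basic features: [has_attachment, number_of_links].
short_vec = email_features(
    "Please verify your account", [1.0, 0.0], TOY_EMBEDDINGS, DIM)
long_vec = email_features(
    "Urgent invoice attached please verify your account password "
    "before the meeting tomorrow", [0.0, 3.0], TOY_EMBEDDINGS, DIM)

# Bodies of different lengths yield vectors of the same dimension.
print(short_vec.shape, long_vec.shape)
```

Both vectors have length 2 * DIM + 2 regardless of how many words the message body contains, so they could be passed to any standard classifier. Because the pooling step has no trainable parameters, only the downstream classifier has to be fitted on the few labeled samples.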
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, Q., Cheng, M. (2024). Spear-Phishing Detection Method Based on Few-Shot Learning. In: Li, C., Li, Z., Shen, L., Wu, F., Gong, X. (eds) Advanced Parallel Processing Technologies. APPT 2023. Lecture Notes in Computer Science, vol 14103. Springer, Singapore. https://doi.org/10.1007/978-981-99-7872-4_20
DOI: https://doi.org/10.1007/978-981-99-7872-4_20
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7871-7
Online ISBN: 978-981-99-7872-4
eBook Packages: Computer Science, Computer Science (R0)