
Spear-Phishing Detection Method Based on Few-Shot Learning

  • Conference paper
Advanced Parallel Processing Technologies (APPT 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14103)

Abstract

With the continued development of Internet technology, online activities such as remote work and online transactions are becoming more frequent. As a result, network security issues are increasingly prominent, the security situation is more complex, and new attack methods emerge continually. Because spear-phishing is precisely targeted, persistent, well disguised, and highly damaging, it has become the most commonly used initial means for attackers and APT organizations to intrude on targets. Automated spear-phishing detection based on machine learning and deep learning has therefore become a research focus in recent years. However, because spear-phishing attacks have a narrow scope and a low frequency, the number of available spear-phishing emails is very limited. How to detect spear-phishing with machine learning and deep learning from small samples has become a key issue. Few-shot learning, a branch of machine learning and deep learning, aims to train a good classification model from only a few samples. We therefore propose a spear-phishing detection method based on few-shot learning that combines the basic features of an email with its message body. We propose a simple word-embedding model to analyze the message body. It converts message bodies of different lengths into text feature vectors of the same dimension while retaining as much semantic information as possible. The text feature vectors are then combined with the basic features of the email and fed into commonly used machine learning classifiers for detection. The proposed simple word-embedding method does not require complex training to learn a large number of parameters, which reduces the model's dependence on large amounts of training data. The experimental results show that the proposed method achieves better performance than the existing spear-phishing detection method, and that its advantage is especially pronounced with small samples.
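The pipeline described in the abstract (pool word embeddings into a fixed-length vector, then append the email's basic features) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the toy 3-dimensional `EMBEDDINGS` table, the whitespace tokenizer, the choice of concatenated average and max pooling, and the two example "basic features" are all hypothetical. A real system would presumably load pretrained word vectors and extract basic features from the email itself.

```python
from typing import Dict, List, Sequence

# Hypothetical toy embedding table (assumption); real vectors would be pretrained.
EMBEDDINGS: Dict[str, List[float]] = {
    "verify": [0.9, 0.1, 0.0],
    "account": [0.8, 0.2, 0.1],
    "meeting": [0.1, 0.9, 0.2],
    "invoice": [0.7, 0.3, 0.4],
}
DIM = 3  # dimension of each word vector in the toy table


def pool_body(tokens: Sequence[str]) -> List[float]:
    """Average- and max-pool word vectors into one vector of length 2 * DIM.

    Pooling has no trainable parameters, so bodies of any length map to
    the same dimension without fitting a model.
    """
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        # No known words: fall back to an all-zero text vector.
        return [0.0] * (2 * DIM)
    avg = [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]
    mx = [max(v[i] for v in vectors) for i in range(DIM)]
    return avg + mx


def email_features(body: str, basic: Sequence[float]) -> List[float]:
    """Concatenate the pooled text vector with the email's basic features."""
    tokens = body.lower().split()  # naive whitespace tokenizer (assumption)
    return pool_body(tokens) + list(basic)


# Two bodies of different lengths yield feature vectors of equal length.
short = email_features("verify account", [1.0, 0.0])
longer = email_features(
    "please verify your account before the meeting about the invoice",
    [0.0, 3.0],
)
print(len(short), len(longer))
```

The resulting vectors could then be passed to any standard classifier. Which classifiers and which basic features the paper actually uses is not stated in the abstract.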

Author information

Corresponding author

Correspondence to Qi Li.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, Q., Cheng, M. (2024). Spear-Phishing Detection Method Based on Few-Shot Learning. In: Li, C., Li, Z., Shen, L., Wu, F., Gong, X. (eds) Advanced Parallel Processing Technologies. APPT 2023. Lecture Notes in Computer Science, vol 14103. Springer, Singapore. https://doi.org/10.1007/978-981-99-7872-4_20

  • DOI: https://doi.org/10.1007/978-981-99-7872-4_20

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7871-7

  • Online ISBN: 978-981-99-7872-4

  • eBook Packages: Computer Science, Computer Science (R0)
