Generating Personalized Phishing Emails for Social Engineering Training Based on Neural Language Models

Guo, Shih-Wei; Chen, Tzu-Chi; Wang, Hui-Juan; Leu, Fang-Yie; Fan, Yao-Chung

doi:10.1007/978-3-031-20029-8_26


Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 570)

Included in the following conference series:

International Conference on Broadband and Wireless Computing, Communication and Applications


Abstract

Social engineering training is a practical defense against phishing attacks because it reinforces awareness of phishing emails. However, existing social engineering training relies on manual planning and hand-crafted design, which raises scale and cost concerns. In this paper, we explore the use of natural language generation techniques to automatically generate phishing emails for social engineering training. We present AI-Phishing, a novel phishing email generator that facilitates personalized social engineering training planning. Users can use AI-Phishing to generate personalized phishing emails from a given title and keywords.




Author information

Authors and Affiliations

National Chung Hsing University, Taichung, Taiwan
Shih-Wei Guo, Tzu-Chi Chen, Hui-Juan Wang & Yao-Chung Fan
TungHai University, Taichung, Taiwan
Fang-Yie Leu


Corresponding author

Correspondence to Fang-Yie Leu.

Editor information

Editors and Affiliations

Department of Information and Communication Engineering, Fukuoka Institute of Technology, Fukuoka, Japan
Leonard Barolli


About this paper

Cite this paper

Guo, SW., Chen, TC., Wang, HJ., Leu, FY., Fan, YC. (2023). Generating Personalized Phishing Emails for Social Engineering Training Based on Neural Language Models. In: Barolli, L. (eds) Advances on Broad-Band Wireless Computing, Communication and Applications. BWCCA 2022. Lecture Notes in Networks and Systems, vol 570. Springer, Cham. https://doi.org/10.1007/978-3-031-20029-8_26


DOI: https://doi.org/10.1007/978-3-031-20029-8_26
Published: 18 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20028-1
Online ISBN: 978-3-031-20029-8
eBook Packages: Engineering, Engineering (R0)
