research-article

Hybrid CNN-GRU Framework with Integrated Pre-trained Language Transformer for SMS Phishing Detection

Authors:

Rubaiath E Ulfath
Department of Computer Science and Engineering, Chittagong University of Engineering & Technology, Bangladesh

Hamed Alqahtani
King Khalid University, Saudi Arabia

Mohammad Hammoudeh
Department of Computing & Math, Manchester Metropolitan University, United Kingdom

Iqbal H. Sarker
Department of Computer Science and Engineering, Chittagong University of Engineering & Technology, Bangladesh

ICFNDS 2021: The 5th International Conference on Future Networks & Distributed Systems, December 2021, Pages 244–251
https://doi.org/10.1145/3508072.3508109

Published: 13 April 2022


ABSTRACT

Smartphones are increasingly prone to SMS phishing because of the rapid growth in the availability of Internet-connected smart mobile technologies. Detecting phishing SMS is challenging because SMS text is unstructured and exhibits complex, non-linear correlations. Building on recent advances in cybersecurity, we propose a hybrid deep learning framework that extracts robust features from SMS texts and then automatically detects phishing SMS. By combining the capabilities of individual models in one hybrid framework, it outperforms various individual machine learning and deep learning models. The proposed phishing detection framework combines a pre-trained transformer model, MPNet (Masked and Permuted Language Modeling), with supervised convolutional networks (CNN) and bi-directional Gated Recurrent Units (GRU). It is intended to detect unstructured short phishing text messages that contain complex patterns.
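The data flow named in the abstract (transformer embeddings, then a CNN, then a bi-directional GRU, then a phishing decision) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the abstract gives no layer sizes, ordering details, or training procedure, so every dimension, weight, and function name below is a hypothetical choice. Random vectors stand in for MPNet token embeddings, and the weights are untrained, so the score is meaningless as a prediction. The GRU update follows one common convention and omits biases for brevity.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def conv1d_relu(seq, kernels, bias):
    """'Valid' 1D convolution over the token axis, followed by ReLU.

    seq: T vectors of size D; kernels: F filters, each K vectors of size D.
    Returns T - K + 1 vectors of size F.
    """
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        window = seq[t:t + k]
        row = []
        for f, filt in enumerate(kernels):
            s = bias[f]
            for w_vec, x_vec in zip(filt, window):
                s += sum(w * x for w, x in zip(w_vec, x_vec))
            row.append(max(0.0, s))
        out.append(row)
    return out


def gru_step(x, h, p):
    """One GRU update (update gate z, reset gate r, candidate n); no biases."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b) for a, b in zip(matvec(p["Wn"], x), matvec(p["Un"], rh))]
    return [(1 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]


def run_gru(seq, p, hidden):
    h = [0.0] * hidden
    for x in seq:
        h = gru_step(x, h, p)
    return h


def bi_gru(seq, p_fwd, p_bwd, hidden):
    """Concatenate the final states of a forward and a backward pass."""
    return run_gru(seq, p_fwd, hidden) + run_gru(seq[::-1], p_bwd, hidden)


def init_params(embed_dim, n_filters, kernel_size, hidden, seed=0):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    rng = random.Random(seed)

    def gru():
        # W* matrices act on the CNN features, U* matrices on the hidden state.
        return {name: rand_matrix(hidden, n_filters if name[0] == "W" else hidden, rng)
                for name in ("Wz", "Uz", "Wr", "Ur", "Wn", "Un")}

    return {
        "kernels": [rand_matrix(kernel_size, embed_dim, rng) for _ in range(n_filters)],
        "conv_bias": [0.0] * n_filters,
        "gru_fwd": gru(),
        "gru_bwd": gru(),
        "out_w": [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden)],
        "out_b": 0.0,
        "hidden": hidden,
    }


def phishing_score(token_embeddings, params):
    """Embeddings -> CNN features -> Bi-GRU summary -> sigmoid score in (0, 1)."""
    feats = conv1d_relu(token_embeddings, params["kernels"], params["conv_bias"])
    summary = bi_gru(feats, params["gru_fwd"], params["gru_bwd"], params["hidden"])
    logit = sum(w * s for w, s in zip(params["out_w"], summary)) + params["out_b"]
    return sigmoid(logit)


# Stand-in for the transformer output: 12 tokens, 8-dimensional embeddings.
rng = random.Random(1)
fake_embeddings = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(12)]
params = init_params(embed_dim=8, n_filters=4, kernel_size=3, hidden=5)
score = phishing_score(fake_embeddings, params)
```

In a real system the random embeddings would be replaced by the output of a pre-trained MPNet encoder and the weights would be learned from labelled SMS data; the sketch only shows how the three components could be chained.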



Published in

ICFNDS 2021: The 5th International Conference on Future Networks & Distributed Systems
December 2021
847 pages
ISBN: 9781450387347
DOI: 10.1145/3508072

Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
AI
Cybersecurity
Deep learning
NLP
Smishing
Qualifiers
- research-article
- Research
- Refereed limited


