Phishing Text Message Detection Using Deep Neural Network Models: A Comparative Analysis

Abstract

Social engineering is a manipulation technique that leads individuals to unintentionally share private data (Alzahrani, 2020). SMS Phishing (Smishing) attacks use social engineering to trick victims into ignoring security protocols and sharing confidential information (Salahdine & Kaabouch, 2019). Phishing attacks account for 90% of data breaches, and 84% of organizations were targeted by at least one Smishing attempt in 2023 (Verizon Business, 2024). These attacks have become the preferred method for adversaries who exploit victims’ behavior and emotions to gain their trust.

Traditional Smishing detection models with limited computational resources have struggled to keep up with the evolving tactics of cybercriminals. This research examines the challenge of detecting and classifying Smishing within a multiclass dataset, focusing on improving the detection of minority classes. Building on the work of Mishra & Soni (2023b), it develops a Deep Learning (DL) based phishing detection system that expands detection from binary to multiclass and uses four different feature types to better identify minority Smishing types.

This research explores the key features that contribute to differentiating Smishing and Spam from legitimate messages, using the “SMS Phishing Dataset for Machine Learning and Pattern Recognition” (Mishra & Soni, 2023b). The analysis finds that URLs and email addresses are the most important features for this classification. After testing various ensemble models, this study shows that a chained transformer model, which uses GPT-2 to generate synthetic data and BERT embeddings in a multiclass classifier, works best for detecting and classifying phishing attacks, especially for rare Smishing data. The resulting MultiChainGuard model achieves over 97% precision in identifying different types of phishing.
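
As a minimal sketch of the kind of feature the analysis highlights, the presence of a URL or an email address in a message can be flagged with regular expressions. The patterns and the function name `extract_flags` below are illustrative assumptions, not the dissertation's actual feature extractor.

```python
import re

# Hypothetical patterns for illustration only; real-world URL and email
# detection usually needs broader rules (shortened links, obfuscation).
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_flags(message: str) -> dict:
    """Return binary indicators for URL and email presence in an SMS."""
    return {
        "has_url": bool(URL_RE.search(message)),
        "has_email": bool(EMAIL_RE.search(message)),
    }


if __name__ == "__main__":
    sample = "Your account is locked, verify at http://example.com/login"
    print(extract_flags(sample))
```

Flags like these could be combined with text embeddings as inputs to a multiclass classifier; how the dissertation combines its four feature types is not stated in this abstract.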

This research fills a gap in current phishing detection capabilities by identifying the best deep learning architecture for Smishing detection using a multiclass dataset. The analysis focuses on deploying the model on devices with limited computational resources using small, open-source models. By integrating multiple deep learning models to improve Smishing detection, it advances the field of SMS phishing detection.

Creating efficient and accurate Smishing detection models can improve user protection against growing cyberattacks. Using open-source resources will make this security measure available to more users, helping create a safer online environment.

Details

Business indexing term

Subject:

Artificial intelligence

Subject

Computer engineering;
Multimedia communications;
Artificial intelligence;
Computer science

Classification

0800: Artificial intelligence
0464: Computer Engineering
0984: Computer science

Identifier / keyword

Deep Learning; Multiclass classification; Phishing; Short message service; Smishing attacks; Transformers

Title

Phishing Text Message Detection Using Deep Neural Network Models: A Comparative Analysis

Author

Muñoz, Miriam L.

Number of pages

106

Publication year

2025

Degree date

2025

School code

0075

Source

DAI-B 86/10(E), Dissertation Abstracts International

ISBN

9798310372740

Advisor

Islam, Muhammad F.

Committee member

Etemadi, Amir; Stepanov, Artem

University/institution

The George Washington University

Department

Engineering Management

University location

United States -- District of Columbia

Degree

D.Engr.

Source type

Dissertation or Thesis

Language

English

Document type

Dissertation/Thesis

Dissertation/thesis number

31935850

ProQuest document ID

3188687644

Document URL

https://www.proquest.com/dissertations-theses/phishing-text-message-detection-using-deep-neural/docview/3188687644/se-2?accountid=208611

Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.

Database

ProQuest One Academic
