
Abstract

Clickbait headlines, designed to entice readers with sensationalized or misleading content, pose significant challenges in the digital landscape. They exploit curiosity to generate traffic and revenue, often at the cost of spreading misinformation and undermining the credibility of online content. Identifying clickbait is essential for improving the quality of information consumed, fostering trust in digital media, and enabling users to make informed decisions. This study advances Hebrew clickbait detection through deep learning approaches and comprehensive data augmentation strategies, targeting the unique challenges of processing a low-resource language. Building on prior research that achieved an accuracy of 87% using traditional machine learning methods, this work explores the potential of BERT-based models and diverse augmentation techniques to further enhance performance. Our experiments incorporated a variety of augmentation methods, including weak supervision, substitution-based methods, generative techniques, and language-based methods, applied to state-of-the-art Hebrew language models. The results show that targeted augmentation strategies, particularly those focusing on word-level replacements and contextual enhancements, consistently improved model performance. Our top-performing configuration achieved an accuracy of 92%, surpassing the traditional machine learning benchmark. These results can be applied in real-world systems to automatically detect and reduce clickbait in Hebrew digital media, supporting news websites and social platforms in improving content quality and user trust. Furthermore, the study provides a replicable framework for tackling similar challenges in other underrepresented languages, highlighting the potential of combining advanced deep learning methods with tailored data augmentation strategies.

Details

Title
Towards better Hebrew clickbait detection: Insights from BERT and data augmentation
Publication title
PLoS One; San Francisco
Volume
20
Issue
11
First page
e0332342
Number of pages
22
Publication year
2025
Publication date
Nov 2025
Section
Research Article
Publisher
Public Library of Science
Place of publication
San Francisco
Country of publication
United States
e-ISSN
1932-6203
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Milestone dates
2025-02-23 (Received); 2025-10-17 (Accepted); 2025-11-06 (Published)
ProQuest document ID
3269575163
Document URL
https://www.proquest.com/scholarly-journals/towards-better-hebrew-clickbait-detection/docview/3269575163/se-2?accountid=208611
Copyright
© 2025 Natanya, Liebeskind. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-11-12
Database
ProQuest One Academic