
Abstract

As large language models (LLMs) gain traction among researchers and practitioners, particularly in digital marketing for tasks such as customer feedback analysis and automated communication, concerns remain about the reliability and consistency of their outputs. This study investigates annotation bias in LLMs by comparing human and AI-generated annotation labels across sentiment, topic, and aspect dimensions in hotel booking reviews. Using the HRAST dataset, which includes 23,114 real user-generated review sentences and a synthetically generated corpus of 2,000 LLM-authored sentences, we evaluate inter-annotator agreement between a human expert and three LLMs (ChatGPT-3.5, ChatGPT-4, and ChatGPT-4-mini) as a proxy for assessing annotation bias. Our findings show high agreement among LLMs, especially on synthetic data, but only moderate to fair alignment with human annotations, particularly in sentiment and aspect-based sentiment analysis. LLMs display a pronounced neutrality bias, often defaulting to neutral sentiment in ambiguous cases. Moreover, annotation behavior varies notably with task design, as manual, one-to-one prompting produces higher agreement with human labels than automated batch processing. The study identifies three distinct AI biases that shape annotation outcomes: repetition bias, behavioral bias, and neutrality bias. These findings highlight how dataset complexity and annotation mode influence LLM behavior, offering important theoretical, methodological, and practical implications for AI-assisted annotation and synthetic content generation.
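For readers who want a feel for the core methodology, the minimal sketch below computes pairwise inter-annotator agreement with Cohen's kappa (a standard chance-corrected agreement statistic) using scikit-learn. The label lists and variable names are hypothetical placeholders, not the paper's actual HRAST annotations; only the kappa-as-bias-proxy setup is taken from the abstract.

# Minimal sketch: pairwise inter-annotator agreement via Cohen's kappa,
# used here as a proxy for annotation bias. The sentiment labels below
# are invented examples; the paper's data come from the HRAST dataset.
from sklearn.metrics import cohen_kappa_score

# Hypothetical sentiment labels assigned to the same five review sentences.
human = ["positive", "neutral", "negative", "neutral", "positive"]
gpt35 = ["positive", "neutral", "neutral",  "neutral", "positive"]
gpt4  = ["positive", "neutral", "negative", "neutral", "neutral"]

# Agreement between each LLM and the human expert.
print("human vs ChatGPT-3.5:", cohen_kappa_score(human, gpt35))
print("human vs ChatGPT-4:  ", cohen_kappa_score(human, gpt4))

# Agreement among the LLMs themselves (the abstract reports this is
# typically higher than LLM-human agreement).
print("ChatGPT-3.5 vs ChatGPT-4:", cohen_kappa_score(gpt35, gpt4))

Kappa values near 1 indicate near-perfect agreement beyond chance, while values in the 0.2-0.6 band correspond to the "fair" to "moderate" alignment the abstract describes for LLM-human comparisons.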

Details

Title
Biased by Design? Evaluating Bias and Behavioral Diversity in LLM Annotation of Real-World and Synthetic Hotel Reviews
Author
Voutsa, Maria C. ¹; Tsapatsoulis, Nicolas ¹; Djouvas, Constantinos ²

¹ Department of Communication and Marketing, Cyprus University of Technology, Limassol 3036, Cyprus; [email protected]
² Department of Communication and Internet Studies, Cyprus University of Technology, Limassol 3036, Cyprus; [email protected]
Publication title
AI; Basel
Volume
6
Issue
8
First page
178
Number of pages
26
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
e-ISSN
2673-2688
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Milestone dates
2025-06-29 (Received); 2025-07-30 (Accepted)
Online publication date
2025-08-04
First posting date
2025-08-04
ProQuest document ID
3243968226
Document URL
https://www.proquest.com/scholarly-journals/biased-design-evaluating-bias-behavioral/docview/3243968226/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-12-04
Database
ProQuest One Academic