Abstract

Background: The opioid epidemic in the United States remains a major public health concern, with opioid-related deaths increasing more than 8-fold since 1999. Chronic pain, which affects 1 in 5 US adults, is a key contributor to opioid use and misuse. While previous research has explored clinical and behavioral predictors of opioid risk, less attention has been given to large-scale linguistic patterns in public discussions of pain. Social media platforms such as X (formerly Twitter) offer real-time, population-level insights into how individuals express pain, distress, and coping strategies. These linguistic markers matter because they can reveal underlying psychological states, perceptions of health care access, and community-level opioid risk factors, offering new opportunities for early detection and targeted public health response.

Objective: This study aimed to examine linguistic markers of pain communication on the social media platform X and to assess whether language patterns differ between US states with high and low opioid mortality rates. We also evaluated the predictive power of these linguistic features using machine learning and identified key thematic structures through semantic network analysis.

Methods: We collected 1,438,644 pain-related tweets posted between January and December 2021 using tweepy and snscrape. Tweets from 2 states with high opioid mortality (Ohio and Florida) and 2 states with low opioid mortality (South Dakota and North Dakota) were selected, resulting in 31,994 tweets from high-death states (HDS) and 750 tweets from low-death states (LDS). Six machine learning algorithms (random forest, k-nearest neighbor, decision tree, naive Bayes, logistic regression, and support vector machine) were applied to predict state-level opioid mortality risk based on linguistic features derived from Linguistic Inquiry and Word Count (LIWC). The Synthetic Minority Oversampling Technique (SMOTE) was used to address class imbalance. Semantic network analysis was conducted to visualize co-occurrence patterns and conceptual clustering.
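
The oversampling step mentioned above can be illustrated with a minimal sketch of the general SMOTE idea: each synthetic minority-class point is placed on the line segment between a real minority sample and one of its nearest minority neighbors. This is not the authors' pipeline; the function name `smote`, the parameter values, and the 2-dimensional feature vectors are hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random position on the segment between base and the chosen neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical feature vectors standing in for minority-class (LDS) tweets.
lds_features = [[0.10, 0.30], [0.20, 0.25], [0.15, 0.40], [0.05, 0.35]]
new_points = smote(lds_features, n_new=6, k=2)
```

Because every synthetic point is an interpolation between two real samples, the new points stay inside the region already covered by the minority class, which is why oversampling of this kind does not add new information about that class.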

Results: The random forest model demonstrated the strongest predictive performance, with an accuracy of 94.69%, balanced accuracy of 94.69%, κ of 0.89, and an area under the curve of 0.95 (P<.001). Tweets from HDS contained significantly more affective pain words (t(31,992)=10.84; P<.001; Cohen d=0.12), health care access references, and expressions of distress. LDS tweets showed greater use of authenticity markers (t(31,992)=−10.04; P<.001) and proactive health-seeking language. Semantic network analysis revealed denser discourse in HDS (density=0.28) focused on distress and barriers to care, while LDS discourse emphasized recovery and optimism.
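
For readers less familiar with the reported metrics, the following sketch shows how accuracy, balanced accuracy, and Cohen κ follow from a 2×2 confusion matrix, and how the density of an undirected network is defined. The counts and graph sizes are hypothetical, not the study's data. As a side note, if the evaluation classes were exactly balanced (an assumption, not something the abstract states), κ reduces to 2 × accuracy − 1, and 2 × 0.9469 − 1 ≈ 0.89, which would agree with the reported value.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, balanced accuracy, and Cohen kappa from confusion counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    balanced_accuracy = (sensitivity + specificity) / 2
    # Agreement expected by chance, from the row and column marginals.
    p_pos = ((tp + fn) / total) * ((tp + fp) / total)
    p_neg = ((tn + fp) / total) * ((tn + fn) / total)
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "kappa": kappa,
    }


def network_density(n_nodes, n_edges):
    """Density of an undirected graph: observed edges over possible edges."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


# Hypothetical confusion counts with balanced classes (100 per class).
metrics = binary_metrics(tp=90, fn=10, fp=10, tn=90)
# Hypothetical co-occurrence network: 5 words, 3 observed links.
density = network_density(n_nodes=5, n_edges=3)
```

A density of 0.28, as reported for HDS, would mean that roughly 28% of all possible word pairs in the network are linked by co-occurrence.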

Conclusions: Our findings demonstrate that linguistic markers in publicly shared pain-related discourse show distinct and predictable differences across regions with varying opioid mortality risks. These linguistic patterns reflect underlying psychological, social, and structural factors that contribute to opioid vulnerability. Importantly, they offer a scalable, real-time resource for identifying at-risk communities. Harnessing social media language analytics can strengthen early detection systems, guide geographically targeted public health messaging, and inform policy efforts aimed at reducing opioid-related harm and improving pain management equity.

Details

1009240
Business indexing term
Title
Linguistic Markers of Pain Communication on X (Formerly Twitter) in US States With High and Low Opioid Mortality: Machine Learning and Semantic Network Analysis
Publication title
Journal of Medical Internet Research
Volume
27
First page
e67506
Publication year
2025
Publication date
2025
Section
Medicine 2.0: Social Media, Open, Participatory, Collaborative Medicine
Publisher
Gunther Eysenbach MD MPH, Associate Professor
Place of publication
Toronto
Country of publication
Canada
e-ISSN
1438-8871
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-05-13
Milestone dates
2024-10-13 (Preprint first published); 2024-10-13 (Submitted); 2025-02-28 (Revised version received); 2025-03-20 (Accepted); 2025-05-13 (Published)
First posting date
13 May 2025
ProQuest document ID
3222368637
Document URL
https://www.proquest.com/scholarly-journals/linguistic-markers-pain-communication-on-x/docview/3222368637/se-2?accountid=208611
Copyright
© 2025. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2026-01-05
Database
ProQuest One Academic