© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

With the proliferation of the internet, social networking sites have become a primary source of user-generated content, including vast amounts of information about medications, diagnoses, treatments, and disorders. Comments on previously used medicines contained in these data can be leveraged to identify crucial adverse drug reactions, and machine learning (ML) approaches such as sentiment analysis (SA) can be employed to derive valuable insights. However, given the sheer volume of comments, it is often impractical for consumers to manually review all of them before making a purchase decision. Therefore, drug assessments can serve as a valuable source of medical information for both healthcare professionals and the general public, aiding in decision making and improving public monitoring systems by revealing collective experiences. Nonetheless, the unstructured, free-text nature of the comments poses a significant challenge for effective categorization, and previous studies have used machine and deep learning (DL) algorithms to address it. Although both approaches showed promising results, DL classifiers outperformed ML classifiers in those studies. The objective of our study was therefore to improve upon earlier research by applying SA to medication reviews, training five ML algorithms on two distinct feature extraction techniques and four DL classifiers on two different word-embedding approaches to obtain higher categorization scores. Our findings indicated that the random forest trained on count vectorizer features outperformed all other ML algorithms, achieving an accuracy and F1 score of 96.65% and 96.42%, respectively. Furthermore, the bidirectional LSTM (Bi-LSTM) model trained on GloVe embeddings performed even better, reaching an accuracy and F1 score of 97.40% and 97.42%, respectively. Hence, by utilizing appropriate natural language processing and ML algorithms, we were able to achieve superior results compared to earlier studies.
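
The abstract names count-vectorizer features and reports accuracy and F1 score. As a rough illustration of what those terms mean, the sketch below builds raw token-count vectors and computes both metrics using only the Python standard library. It is not the authors' pipeline: the whitespace tokenizer, the toy review strings, the function names, and the binary (single positive class) form of F1 are all assumptions made for illustration, and the abstract does not state which F1 averaging scheme was used.

```python
from collections import Counter


def count_vectorize(docs):
    """Return (vocab, rows): a sorted vocabulary and one raw-count vector per document.

    Assumption: lowercase + whitespace tokenization, purely for illustration.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    rows = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for tok, n in Counter(toks).items():
            row[index[tok]] = n
        rows.append(row)
    return vocab, rows


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true, y_pred, positive=1):
    """F1 for one positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy reviews, not taken from the study's dataset.
vocab, rows = count_vectorize(["no side effects", "bad side effects bad"])
print(vocab)  # vocabulary in sorted order
print(rows)   # one count vector per review

# Hypothetical labels and predictions (1 = positive sentiment).
y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]
print(accuracy(y_true, y_pred), f1_binary(y_true, y_pred))
```

In a study like this one, count vectors of this kind would be passed to a classifier such as a random forest, and the metrics would be computed on held-out reviews.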

Details

Title
Data-Driven Solution to Identify Sentiments from Online Drug Reviews
Author
Haque, Rezaul 1; Saddam Hossain Laskar 1; Khushbu, Katura Gania 1; Md Junayed Hasan 2; Uddin, Jia 3

1 Department of Computer Science and Engineering, East West University, Dhaka 1212, Bangladesh
2 National Subsea Centre, Robert Gordon University, Aberdeen AB10 7AQ, UK
3 Artificial Intelligence and Big Data Department, Endicott College, Woosong University, 171 Dongdaejeon-ro (155-3 Jayang-dong), Daejeon 300718, Republic of Korea
First page
87
Publication year
2023
Publication date
2023
Publisher
MDPI AG
e-ISSN
2073-431X
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2806508765