Abstract

Twitter is used to convey netizens' opinions directly to public service providers, such as Jakarta Bus Rapid Transit (BRT), efficiently and effectively. In this context, opinions expressed as textual data can be analyzed through sentiment analysis to help BRT operators improve their facilities and services. Sentiment analysis consists of multiple steps: preprocessing, feature weighting, classification, and evaluation. Preprocessing and feature weighting are the key steps that may significantly affect the performance of the classification algorithm. Several studies have investigated these key processes, specifically to observe their effect on classification performance. However, none of those studies compares n-gram feature tokenization with feature weighting in Bahasa Indonesia. The present study examines how combinations of n-gram feature tokenization and feature weighting affect the performance of the Support Vector Machine (SVM) algorithm. The study uses TF-IDF, TF-CHI, TF-RF, and TF-OR as the feature weighting schemes. The results show that TF-IDF has the highest performance, with 79.3% accuracy, 83.2% precision, 83.6% recall, and an F1 score of 82.2%.
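As an illustration of the n-gram tokenization and TF-IDF weighting steps named in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes whitespace tokenization, raw term counts, and the common idf form log(N / df); the abstract does not specify which variant the study used. The function names `ngrams` and `tfidf` and the example tweets are hypothetical. The resulting weights would normally be passed to a classifier such as an SVM, which is omitted here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the word n-grams (joined by spaces) of a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Weight each n-gram by raw term frequency times log(N / document frequency).

    This idf form is an assumption; other TF-IDF variants exist.
    """
    grams = [ngrams(doc.lower().split(), n) for doc in docs]
    # Document frequency: number of documents containing each n-gram.
    df = Counter(g for doc in grams for g in set(doc))
    total = len(docs)
    return [
        {g: count * math.log(total / df[g]) for g, count in Counter(doc).items()}
        for doc in grams
    ]


# Hypothetical example tweets (not taken from the study's data set).
tweets = [
    "bus datang tepat waktu",
    "bus datang terlambat lagi",
    "halte bersih dan nyaman",
]
unigram_weights = tfidf(tweets, n=1)
bigram_weights = tfidf(tweets, n=2)
```

In this sketch an n-gram that occurs in fewer documents receives a larger weight, and one that occurs in every document receives a weight of zero.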

Details

Title
Comparison of Feature Weighting in SVM Performance for Sentiment Analysis of Jakarta BRT
Author
Widyawan 1; Damayanti, Nourma Reizky 1; Adji, Teguh Bharata 1; Guntur, Dharma Putra 1

1 Department of Electrical Engineering and Information Technology, Faculty of Engineering, Universitas Gadjah Mada, Yogyakarta, Indonesia
Publication year
2019
Publication date
Mar 2019
Publisher
IOP Publishing
ISSN
17426588
e-ISSN
17426596
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2566088441
Copyright
© 2019. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.