Comparison of Feature Weighting in SVM Performance for Sentiment Analysis of Jakarta BRT

Abstract

Twitter has been used to convey opinions from netizens directly to public service providers, such as Jakarta Bus Rapid Transit (BRT), efficiently and effectively. In this context, opinions expressed as text can be analyzed through sentiment analysis to help BRT operators improve their facilities and services. Sentiment analysis consists of multiple steps: preprocessing, feature weighting, classification, and evaluation. Preprocessing and feature weighting are the key steps that may significantly affect the performance of the classification algorithm. Several studies have investigated these key processes, specifically to observe their effect on classification performance. However, none of those studies compares n-gram feature tokenization combined with feature weighting in Bahasa Indonesia. The present study examines how combinations of n-gram feature tokenization and feature weighting affect the performance of the Support Vector Machine (SVM) algorithm. It uses TF-IDF, TF-CHI, TF-RF, and TF-OR as the feature weighting schemes. The results show that TF-IDF achieves the highest performance: 79.3% accuracy, 83.2% precision, 83.6% recall, and 82.2% F1 score.
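As an illustration of the n-gram tokenization and feature weighting steps named in the abstract, the following is a minimal sketch in plain Python. It assumes whitespace tokenization and the common TF-IDF form tf × log(N / df); the exact preprocessing and formula used in the study are not given in this record, so both are assumptions. The function names `ngrams` and `tf_idf` and the sample tweets are hypothetical and are not data from the paper.

```python
import math
from collections import Counter


def ngrams(text, n):
    """Return the word n-grams of a lowercased, whitespace-tokenized string."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs, n=1):
    """Weight each n-gram by raw term frequency times log(N / df).

    Assumed variant: no smoothing and no length normalization.
    """
    tokenized = [ngrams(doc, n) for doc in docs]
    # Document frequency: number of documents containing each n-gram.
    df = Counter(term for toks in tokenized for term in set(toks))
    total = len(docs)
    return [
        {term: count * math.log(total / df[term])
         for term, count in Counter(toks).items()}
        for toks in tokenized
    ]


# Hypothetical example tweets (illustrative only).
tweets = [
    "bus datang tepat waktu",
    "bus datang terlambat lagi",
    "halte bersih dan nyaman",
]
unigram_weights = tf_idf(tweets, n=1)
bigram_weights = tf_idf(tweets, n=2)
```

In this sketch, a term that occurs in every document receives a weight of zero, while terms confined to a single document receive the largest weights. The resulting per-document weight dictionaries would then be converted into feature vectors for a classifier such as an SVM.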

Details

Title

Comparison of Feature Weighting in SVM Performance for Sentiment Analysis of Jakarta BRT

Author

Widyawan¹; Damayanti, Nourma Reizky¹; Adji, Teguh Bharata¹; Guntur, Dharma Putra¹

¹ Department of Electrical Engineering and Information Technology, Faculty of Engineering, Universitas Gadjah Mada, Yogyakarta, Indonesia

Publication year

2019

Publication date

Mar 2019

Publisher

IOP Publishing

ISSN

1742-6588

e-ISSN

1742-6596

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1088/1742-6596/1196/1/012066

ProQuest document ID

2566088441

© 2019. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
