Abstract

The domain of Natural Language Processing covers various tasks, such as classification, text generation, and language modeling. Text is first converted to a numeric representation using word embeddings or vectorizers, and Machine Learning and Deep Learning models are then trained on that representation. To observe the trade-off between these two types of algorithms with respect to available data, accuracy obtained, and other factors, a binary classification task is undertaken to distinguish between insincere and regular questions on Quora. The Quora Insincere Questions Classification dataset was used to train various machine learning and deep learning models. A Bidirectional Long Short-Term Memory (LSTM) network was trained, with the text processed using Global Vectors for Word Representation (GloVe). Machine Learning algorithms such as the Extreme Gradient Boosting classifier, Gaussian Naive Bayes, and Support Vector Classifier (SVC) were trained on text processed with the TF-IDF vectorizer. This paper also presents an evaluation of the above algorithms on the basis of the precision, recall, and F1 score metrics.
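
As an illustration of the evaluation metrics named above, the following is a minimal sketch of how precision, recall, and F1 score can be computed for a binary classifier. It is not taken from the paper: the function name `precision_recall_f1` and the label lists are hypothetical, and the encoding 1 = insincere, 0 = regular is an assumption made for the example.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels (assumption): 1 = insincere question, 0 = regular question.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Reporting all three metrics rather than accuracy alone is a common choice when one class is rare, since a classifier that labels every question as regular could still score high accuracy while having zero recall on the insincere class.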

Details

Title
A Trade-off between ML and DL Techniques in Natural Language Processing
Author
Singh, Bhavesh 1; Desai, Rahil 1; Ashar, Himanshu 1; Tank, Parth 1; Katre, Neha 1

1 Dwarkadas J. Sanghvi College of Engineering, Mumbai - 400056, India
Publication year
2021
Publication date
Mar 2021
Publisher
IOP Publishing
ISSN
17426588
e-ISSN
17426596
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2512915942
Copyright
© 2021. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.