Abstract

The Knearest neighbors (kNN) is a lazylearning method for classification and regression that has been successfully applied to several application domains. It is simple and directly applicable to multiclass problems however it suffers a high complexity in terms of both memory and computations. Several research studies try to scale the kNN method to very large datasets using crisp partitioning. In this paper, we propose to integrate the principles of rough sets and fuzzy sets while conducting a clustering algorithm to separate the whole dataset into several parts, each of which is then conducted kNN classification. The concept of crisp lower bound and fuzzy boundary of a cluster which is applied to the proposed algorithm allows accurate selection of the set of data points to be involved in classifying an unseen data point. The data points to be used are a mix of core and border data points of the clusters created in the training phase. The experimental results on standard datasets show that the proposed kNN classification is more effective than related recent work with a slight increase in classification time.

Details

Title
RFKNN: ROUGH-FUZZY KNN FOR BIG DATA CLASSIFICATION
Author
Mahfouz, Mohamed A
Pages
274-279
Publication year
2018
Publication date
Mar 2018
Publisher
International Journal of Advanced Research in Computer Science
e-ISSN
09765697
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2101236705
Copyright
© Mar 2018. This work is published under https://creativecommons.org/licenses/by-nc-sa/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.