Abstract

A dataset exhibits the class imbalance problem when a target class has a very small number of instances relative to other classes. A trivial classifier typically fails to detect a minority class due to its extremely low incidence rate. In this paper, a new over-sampling technique called DBSMOTE is proposed. Our technique relies on a density-based notion of clusters and is designed to over-sample an arbitrarily shaped cluster discovered by DBSCAN. DBSMOTE generates synthetic instances along a shortest path from each positive instance to a pseudo-centroid of a minority-class cluster. Consequently, these synthetic instances are dense near this centroid and sparse far from it. Our experimental results show that DBSMOTE improves precision, F-value, and AUC more effectively than SMOTE, Borderline-SMOTE, and Safe-Level-SMOTE for imbalanced datasets.
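
The generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single minority-class cluster has already been found (for example by DBSCAN), takes the pseudo-centroid to be the cluster member nearest the cluster mean, links instances that lie within a hypothetical radius `eps`, and places one synthetic instance on a randomly chosen edge of each instance's shortest path to the pseudo-centroid. The function name `dbsmote_sketch` and its parameters are illustrative only.

```python
import heapq
import math
import random


def dbsmote_sketch(cluster, eps, seed=0):
    """Return one synthetic point per cluster member that can reach the
    pseudo-centroid. Simplified illustration, not the published algorithm."""
    rng = random.Random(seed)
    n = len(cluster)
    dim = len(cluster[0])

    # Assumption: pseudo-centroid = the real instance closest to the mean.
    mean = [sum(p[d] for p in cluster) / n for d in range(dim)]
    centroid = min(range(n), key=lambda i: math.dist(cluster[i], mean))

    # Assumption: instances within eps of each other are linked,
    # with the Euclidean distance as the edge weight.
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            w = math.dist(cluster[i], cluster[j])
            if w <= eps:
                adj[i].append((j, w))
                adj[j].append((i, w))

    # Dijkstra from the pseudo-centroid; prev[v] is the next hop from v
    # toward the pseudo-centroid on a shortest path.
    dist = {centroid: 0.0}
    prev = {}
    heap = [(0.0, centroid)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    synthetic = []
    for i in range(n):
        if i == centroid or i not in prev:
            continue  # skip the centroid itself and unreachable points
        # Walk the shortest path from instance i to the pseudo-centroid.
        path = [i]
        while path[-1] != centroid:
            path.append(prev[path[-1]])
        # Pick a random edge on that path and interpolate along it.
        k = rng.randrange(len(path) - 1)
        a, b = cluster[path[k]], cluster[path[k + 1]]
        t = rng.random()
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
    for p in dbsmote_sketch(points, eps=1.5, seed=1):
        print(p)
```

Because every shortest path ends at the pseudo-centroid, edges close to it are shared by many paths and are sampled more often, which is one way to obtain the dense-near, sparse-far pattern the abstract describes. Synthetic points also stay on edges between existing minority instances, so they follow the shape of the cluster rather than cutting across empty regions.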

Details

Title
DBSMOTE: Density-Based Synthetic Minority Over-sampling TEchnique
Author
Bunkhumpornpat, Chumphol; Sinapiromsaran, Krung; Lursinsap, Chidchanok
Pages
664-684
Publication year
2012
Publication date
Apr 2012
Publisher
Springer Nature B.V.
ISSN
0924-669X
e-ISSN
1573-7497
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
940970304
Copyright
Springer Science+Business Media, LLC 2012