Abstract

This study examines how simple linear interpolation (SLI) and natural-neighbour interpolation (NNI) affect machine learning model performance on irregularly sampled commercial data. The Seoul bike-sharing rental dataset is pre-processed with SLI and NNI to handle missing values and inconsistencies. The two methods are then evaluated by building several machine learning models on the pre-processed data, including XGBoost, Random Forest, k-nearest neighbours (KNN) and a stacking model. Results show that SLI consistently improved accuracy, particularly for the stacking model, as measured by the area under the receiver operating characteristic curve (AUC) and the Kolmogorov-Smirnov (KS) statistic. Conversely, NNI produced more variable outcomes and occasionally reduced performance. The findings underscore the critical role of data pre-processing in machine learning, particularly in domains where data irregularities are prevalent, and provide empirical support for using interpolation methods to improve both model reliability and accuracy in business data modelling.
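As an illustration of two ideas named in the abstract, the following is a minimal sketch, not the study's own implementation. It assumes that missing observations are marked as None in an evenly indexed series and that the KS statistic is computed between model scores of a binary positive and negative class. The function names `linear_interpolate` and `ks_statistic` and the sample values are hypothetical.

```python
from typing import List, Optional


def linear_interpolate(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps (None) by straight-line interpolation between
    the nearest known neighbours; leading/trailing gaps are left as None."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        # Constant slope between two consecutive known points.
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


def ks_statistic(scores: List[float], labels: List[int]) -> float:
    """Two-sample KS statistic: the largest gap between the empirical CDFs
    of scores for the positive (1) and negative (0) classes.
    Assumes both classes are present."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Made-up hourly rental counts with gaps (illustrative only, not study data).
hourly_rentals = [120.0, None, None, 180.0, None, 150.0]
print(linear_interpolate(hourly_rentals))
```

A KS value near 1 indicates that the model's scores separate the two classes almost completely, while a value near 0 indicates little separation. Natural-neighbour interpolation is not sketched here because it relies on a Voronoi construction that does not fit in a short example.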

Copyright Institute for Local Self-Government and Public Procurement Maribor 2025