This study compares three popular machine learning (ML) algorithms, namely random forest (RF), boosted regression trees (BRT), and multinomial logistic regression (MnLR), for the spatial prediction of groundwater quality classes and the mapping of salinity hazard. Three hundred eighty-six groundwater samples were collected from an agriculturally intensive area in Fars Province, Iran, and nine hydro-chemical parameters were defined and interpreted. The variance inflation factor and Pearson’s correlation coefficients were used to check for collinearity among the variables. Thereafter, the performance of the ML models was evaluated using two statistical indices obtained from the confusion matrix: overall accuracy (OA) and the Kappa index. The results showed that the RF model was more accurate than the other models, although the difference was slight. Moreover, the analysis of relative importance indicated that sodium adsorption ratio (SAR) and pH, in that order, were the most influential parameters in explaining groundwater quality classes. Overall, ML algorithms applied together with the hydro-chemical parameters that affect groundwater quality can produce highly accurate spatial distribution maps for managing irrigation practice.
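As an illustration of the two evaluation indices named above, the following minimal sketch computes overall accuracy and Cohen's Kappa from a confusion matrix. The function names and the 3-class matrix are hypothetical and chosen only for demonstration; they are not the study's code or data, and the study's own software and class definitions are not stated in this record.

```python
# Illustrative sketch only: function names and the example matrix are
# assumptions for demonstration, not the study's code or results.

def overall_accuracy(cm):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Cohen's Kappa: agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(n)) / total
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical confusion matrix for three quality classes
# (rows: observed class, columns: predicted class).
cm = [
    [40, 5, 5],
    [4, 30, 6],
    [6, 4, 20],
]

print(f"OA    = {overall_accuracy(cm):.3f}")
print(f"Kappa = {cohen_kappa(cm):.3f}")
```

Kappa is generally lower than OA because it discounts the agreement that would be expected by chance given the class frequencies, which is why the two indices are often reported together for multi-class maps.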
Details
Groundwater quality;
Groundwater resources;
Water sampling;
Groundwater irrigation;
Sodium;
Data mining;
Machine learning;
Irrigation;
Salinity;
Water resources;
Spatial distribution;
Groundwater;
Accuracy;
Algorithms;
Mathematical models;
Regression analysis;
Collinearity;
Environmental science;
Maps;
Hydrochemicals;
Data analysis;
Water analysis;
Statistical analysis;
Water quality;
Environmental monitoring
1 University of Tehran, Tehran, Iran (GRID:grid.46072.37) (ISNI:0000 0004 0612 7950)
2 University of Zanjan, Water Engineering Department, Faculty of Agriculture, Zanjan, Iran (GRID:grid.412673.5) (ISNI:0000 0004 0382 4160)