Air quality significantly affects human health, productivity, and overall well-being. This study applies machine learning techniques to analyse and predict air quality in Hamilton, New Zealand, focusing on particulate matter (PM2.5 and PM10) and environmental factors such as temperature, humidity, wind speed, and wind direction. Data were collected from two monitoring sites (Claudelands and Rotokauri) to explore relationships between variables and evaluate the performance of different predictive models. First, the unsupervised k-means clustering algorithm was used to categorise air quality levels based on data from one or both locations. These cluster labels were then used as target variables in supervised learning models, including random forests, decision trees, support vector machines, and k-nearest neighbours. Model performance was assessed by comparing prediction accuracy for air quality at either Claudelands or Rotokauri. Results show that the random forest (93.6%) and decision tree (91.8%) models outperformed k-nearest neighbours (KNN, 83%) and support vector machine (SVM, 61%) in predicting air quality clusters derived from k-means analysis. The three clusters (very good, good, and moderate) reflected seasonal and urban–semi-urban gradients, while cross-location validation confirmed that models trained at Claudelands generalised effectively to Rotokauri, demonstrating scalability for regional air quality forecasting. These findings highlight the potential of combining clustering with supervised learning to improve air quality predictions. Such methods could support environmental monitoring and inform strategies for mitigating pollution-related health risks in New Zealand cities and beyond.
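The two-stage approach described in the abstract (unsupervised clustering first, then a supervised model trained on the cluster labels and checked at a second site) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the readings are synthetic, the helper names (`kmeans`, `knn_predict`, `synthetic_site`) are hypothetical, only k-nearest neighbours is shown, and the choice to label the second site by its nearest first-site centroid is an assumption made here for the sake of the example.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


def knn_predict(train_x, train_y, x, k=5):
    """Majority vote among the k nearest training points."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)


def synthetic_site(n, shift, seed):
    """Hypothetical (PM2.5, PM10) readings around three made-up pollution levels."""
    rng = random.Random(seed)
    levels = [(5.0, 10.0), (15.0, 25.0), (30.0, 45.0)]
    return [(rng.gauss(m25 + shift, 2.0), rng.gauss(m10 + shift, 3.0))
            for m25, m10 in (rng.choice(levels) for _ in range(n))]


# Stand-ins for the two monitoring sites (synthetic data, not real measurements).
site_a = synthetic_site(300, 0.0, seed=1)
site_b = synthetic_site(300, 1.0, seed=2)

# Stage 1: cluster site A into three air quality groups.
centroids, labels_a = kmeans(site_a, k=3)

# Assumption: site B's reference labels come from the nearest site-A centroid.
labels_b = [min(range(3), key=lambda c: math.dist(p, centroids[c]))
            for p in site_b]

# Stage 2: train on site A's cluster labels, evaluate on site B.
preds_b = [knn_predict(site_a, labels_a, p) for p in site_b]
accuracy = sum(p == t for p, t in zip(preds_b, labels_b)) / len(site_b)
print(f"cross-site accuracy: {accuracy:.3f}")
```

The accuracy printed here depends entirely on the synthetic data and says nothing about the figures reported in the study. In practice a library such as scikit-learn would likely replace the hand-written helpers, and the feature set would include the meteorological variables as well.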
Details
Indoor air quality; Environmental monitoring; Performance evaluation; Wind speed; Supervised learning; Aerosols; Air pollution; Machine learning; Prediction models; Decision trees; Clustering; Public health; Algorithms; Environmental factors; Health risks; Accuracy; Pollutants; Deep learning; Regression analysis; Trends; Forecasting; Air quality; Outdoor air quality; Learning algorithms; Particulate emissions; Particulate matter; Cluster analysis; Support vector machines; Urban areas; Cities; Vector quantization; Well being
; Chand, Praneel 2; Al-Rawi, Mohammad 3
1 School of Computing, Eastern Institute of Technology, Napier 4112, New Zealand
2 Sydney International School of Technology and Commerce, Sydney, NSW 2000, Australia
3 School of Computing, Mathematics and Engineering, Charles Sturt University, Bathurst, NSW 2795, Australia