
Abstract

Air quality significantly affects human health, productivity, and overall well-being. This study applies machine learning techniques to analyse and predict air quality in Hamilton, New Zealand, focusing on particulate matter (PM2.5 and PM10) and environmental factors such as temperature, humidity, wind speed, and wind direction. Data were collected from two monitoring sites (Claudelands and Rotokauri) to explore relationships between variables and evaluate the performance of different predictive models. First, the unsupervised k-means clustering algorithm was used to categorise air quality levels based on data from one or both locations. These cluster labels were then used as target variables in supervised learning models, including random forests, decision trees, support vector machines (SVM), and k-nearest neighbours (KNN). Model performance was assessed by comparing prediction accuracy for air quality at either Claudelands or Rotokauri. Results show that the random forest (93.6%) and decision tree (91.8%) models outperformed KNN (83%) and SVM (61%) in predicting the air quality clusters derived from k-means analysis. The three clusters (very good, good, and moderate) reflected seasonal and urban–semi-urban gradients, while cross-location validation confirmed that models trained at Claudelands generalised effectively to Rotokauri, demonstrating scalability for regional air quality forecasting. These findings highlight the potential of combining clustering with supervised learning to improve air quality predictions. Such methods could support environmental monitoring and inform strategies for mitigating pollution-related health risks in New Zealand cities and beyond.
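
The two-stage idea in the abstract (cluster first, then train a classifier on the cluster labels and check it at a second site) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the readings are synthetic, only two features (PM2.5 and PM10) are used, the k-means initialisation is a simple deterministic choice, KNN stands in for the full set of classifiers, the site B reference labels are taken from the nearest site A centroid, and naming clusters by ascending PM2.5 is a guess at how the "very good", "good", and "moderate" labels might be assigned.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))

def kmeans(points, k, iters=100):
    """Lloyd's k-means. Initialisation is deterministic (an assumption made
    for reproducibility): sort the points and take k evenly spaced ones."""
    ordered = sorted(points)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # mean of the cluster, feature by feature
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:        # keep an empty cluster's centroid where it was
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(p, centroids) for p in points]

def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training points."""
    ranked = sorted(zip(train_x, train_y), key=lambda t: sq_dist(t[0], x))
    return Counter(lab for _, lab in ranked[:k]).most_common(1)[0][0]

# Synthetic (PM2.5, PM10) readings in micrograms per cubic metre.
# All values are made up for illustration; they are not from the study.
CENTRES = [(3.0, 8.0), (8.0, 18.0), (15.0, 30.0)]
OFFSETS = [(dx, dy) for dx in (-1.0, 0.0, 1.0) for dy in (-1.0, 0.0, 1.0)]
site_a = [(cx + dx, cy + dy) for cx, cy in CENTRES for dx, dy in OFFSETS]
site_b = [(cx + dx / 2, cy + dy / 2) for cx, cy in CENTRES for dx, dy in OFFSETS]

# Stage 1: cluster site A and name clusters by ascending mean PM2.5.
centroids, labels_a = kmeans(site_a, k=3)
order = sorted(range(3), key=lambda c: centroids[c][0])
names = {c: n for c, n in zip(order, ("very good", "good", "moderate"))}

# Stage 2: train on site A's cluster labels, then predict for site B.
reference_b = [nearest(p, centroids) for p in site_b]
predicted_b = [knn_predict(site_a, labels_a, p) for p in site_b]
accuracy = sum(r == p for r, p in zip(reference_b, predicted_b)) / len(site_b)
print(f"cross-site accuracy: {accuracy:.3f}")
print(Counter(names[c] for c in predicted_b))
```

Because the synthetic clusters are widely separated, this toy run says nothing about the accuracies reported above; it only shows how cluster labels from the first stage become targets for the second.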

Details

1009240
Business indexing term
Title
A Two-Stage Machine Learning Framework for Air Quality Prediction in Hamilton, New Zealand
Author
Alani, Noor H. S. 1; Chand, Praneel 2; Al-Rawi, Mohammad 3

1 School of Computing, Eastern Institute of Technology, Napier 4112, New Zealand; [email protected]
2 Sydney International School of Technology and Commerce, Sydney, NSW 2000, Australia; [email protected]
3 School of Computing, Mathematics and Engineering, Charles Sturt University, Bathurst, NSW 2795, Australia
Publication title
Environments
Volume
12
Issue
9
First page
336
Number of pages
29
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
Publication subject
e-ISSN
2076-3298
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-09-20
Milestone dates
2025-08-17 (Received); 2025-09-16 (Accepted)
First posting date
2025-09-20
ProQuest document ID
3254506486
Document URL
https://www.proquest.com/scholarly-journals/two-stage-machine-learning-framework-air-quality/docview/3254506486/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-12-04
Database
2 databases
  • Coronavirus Research Database
  • ProQuest One Academic