© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Aviation safety analysis increasingly relies on extracting actionable insights from narrative incident reports to support risk identification and improve operational safety. Topic modeling techniques such as Probabilistic Latent Semantic Analysis (pLSA) and BERTopic offer automated methods to uncover latent themes in unstructured safety narratives. This study evaluates the effectiveness of each model in generating coherent, interpretable, and semantically meaningful topics for aviation safety practitioners and researchers. We assess model performance using both quantitative metrics (topic coherence scores) and qualitative evaluations of topic relevance. The findings show that while pLSA provides a solid probabilistic framework, BERTopic, which combines transformer-based embeddings with HDBSCAN clustering, produces more nuanced, context-aware topic groupings, albeit with increased computational demands and tuning complexity. These results highlight the respective strengths and trade-offs of traditional versus modern topic modeling approaches in aviation safety analysis. This work advances the application of natural language processing (NLP) in aviation by demonstrating how topic modeling can support risk assessment, inform policy, and enhance safety outcomes.
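
As an illustration of the probabilistic framework behind pLSA, the sketch below fits the model with expectation-maximization on a toy document-term count matrix. It is a minimal example under stated assumptions, not the authors' implementation: the function name `plsa`, the toy vocabulary, the counts, and the iteration budget are all hypothetical, and a real study would start from preprocessed report narratives.

```python
import math
import random


def normalize(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def plsa(counts, n_topics, n_iter=30, seed=0):
    """Fit pLSA by EM on counts[d][w], the count of word w in document d.

    Returns (p_w_z, p_z_d, history): P(word | topic), P(topic | document),
    and the log-likelihood measured at the start of each iteration.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialisation (an illustrative choice).
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    history = []
    for _ in range(n_iter):
        # Tiny smoothing constant keeps every row strictly positive.
        new_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        new_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        log_lik = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(joint)
                log_lik += n * math.log(total)
                for z in range(n_topics):
                    expected = n * joint[z] / total
                    new_w_z[z][w] += expected
                    new_z_d[d][z] += expected
        # M-step: renormalise the expected counts into distributions.
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
        history.append(log_lik)
    return p_w_z, p_z_d, history


# Hypothetical toy corpus: two landing-related and two engine-related reports.
vocab = ["runway", "landing", "gear", "engine", "fuel", "fire"]
counts = [
    [3, 2, 2, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [0, 0, 0, 3, 2, 2],
    [0, 0, 0, 2, 3, 1],
]
p_w_z, p_z_d, history = plsa(counts, n_topics=2)
for z, row in enumerate(p_w_z):
    top = sorted(range(len(vocab)), key=lambda w: -row[w])[:3]
    print(f"topic {z}:", [vocab[w] for w in top])
```

The top words printed per topic depend on the random initialisation, so they are not fixed here. BERTopic, by contrast, does not fit a generative model of this kind; it clusters document embeddings, which this sketch does not attempt to reproduce.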

Details

Title
Semantic Topic Modeling of Aviation Safety Reports: A Comparative Analysis Using BERTopic and PLSA
Author
Aziida, Nanyonga 1; Joiner, Keith 2; Turhan, Ugur 3; Wild, Graham 3

1 School of Engineering and Technology, University of New South Wales, Canberra, ACT 2600, Australia; [email protected]
2 Capability Systems Centre, University of New South Wales, Canberra, ACT 2610, Australia; [email protected]
3 School of Science, University of New South Wales, Canberra, ACT 2612, Australia; [email protected]
First page
551
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
2226-4310
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3223857794