© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

This study compares four topic modeling techniques applied to aviation safety reports from the ATSB dataset spanning 2013–2023: Latent Dirichlet Allocation (LDA), Bidirectional Encoder Representations from Transformers (BERT), Probabilistic Latent Semantic Analysis (pLSA), and Non-negative Matrix Factorization (NMF). The evaluation focuses on coherence, interpretability, generalization, computational efficiency, and scalability. The results indicate that NMF achieves the highest coherence score (0.7987), demonstrating its effectiveness in extracting well-defined topics from structured narratives. pLSA performs competitively (coherence: 0.7634) but lacks the scalability of NMF. LDA and BERTopic, while effective in generalization (perplexity: −6.471 and −4.638, respectively), struggle with coherence due to their probabilistic nature and reliance on contextual embeddings. A preliminary expert review by two aviation safety specialists found that the topics generated by the NMF model were interpretable and aligned well with domain knowledge, supporting its suitability for this kind of aviation safety analysis. Future research should explore new hybrid modeling approaches and real-time applications to further enhance aviation safety analysis. The study contributes to advancing automated safety monitoring in the aviation industry by identifying the most appropriate topic modeling techniques.

Details

Title
Does the Choice of Topic Modeling Technique Impact the Interpretation of Aviation Incident Reports? A Methodological Assessment
Author
Aziida, Nanyonga 1; Joiner, Keith 2; Turhan, Ugur 3; Wild, Graham 3

1 School of Engineering and Technology, University of New South Wales, Canberra, ACT 2600, Australia; [email protected]
2 Capability Systems Centre, University of New South Wales, Canberra, ACT 2610, Australia; [email protected]
3 School of Science, University of New South Wales, Canberra, ACT 2612, Australia; [email protected]
First page
209
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
22277080
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3212133880