© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

The growing demand for advanced tools to ensure safety in railway construction projects highlights the need for systems, such as multimodal learning algorithms, that can seamlessly integrate and analyze multiple data modalities. Multimodal learning, inspired by the human brain’s ability to integrate many sensory inputs, has emerged as a promising field in artificial intelligence. Accordingly, research on multimodal fusion approaches, which have the potential to outperform standard unimodal solutions, has grown. However, integrating multiple data sources presents significant challenges. This work applies multimodal learning to detect dangerous actions using RGB-D inputs. The key contributions are the evaluation of various fusion strategies and modality encoders and the identification of the most effective methods for capturing complex cross-modal interactions. The MultConcat multimodal fusion method performed best, achieving an accuracy of 89.3%. The results also underscore the critical need for robust modality encoders and advanced fusion techniques to outperform unimodal solutions.

Details

Title
Comparison Analysis of Multimodal Fusion for Dangerous Action Recognition in Railway Construction Sites
Author
Otmane Amel 1; Siebert, Xavier 2; Sidi Ahmed Mahmoudi 1

1 ILIA Lab, Faculty of Engineering, University of Mons, 7000 Mons, Belgium
2 Department of Mathematics and Operational Research, University of Mons, 7000 Mons, Belgium
First page
2294
Publication year
2024
Publication date
2024
Publisher
MDPI AG
e-ISSN
2079-9292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3072317368