Content area

Abstract

Gunshot sound classification plays a crucial role in public safety, forensic investigations, and intelligent surveillance systems. This study evaluates the performance of deep learning models in classifying firearm sounds by analyzing twelve time–frequency spectrogram representations, including Mel, Bark, MFCC, CQT, Cochleagram, STFT, FFT, Reassigned, Chroma, Spectral Contrast, and Wavelet. The dataset consists of 2148 gunshot recordings from four firearm types, collected in a semi-controlled outdoor environment under multi-orientation conditions. To leverage advanced computer vision techniques, all spectrograms were converted into RGB images using perceptually informed colormaps. This enabled the application of image processing approaches and fine-tuning of pre-trained Convolutional Neural Networks (CNNs) originally developed for natural image classification. Six CNN architectures—ResNet18, ResNet50, ResNet101, GoogLeNet, Inception-v3, and InceptionResNetV2—were trained on these spectrogram images. Experimental results indicate that CQT, Cochleagram, and Mel spectrograms consistently achieved high classification accuracy, exceeding 94% when paired with deep CNNs such as ResNet101 and InceptionResNetV2. These findings demonstrate that transforming time–frequency features into RGB images not only facilitates the use of image-based processing but also allows deep models to capture rich spectral–temporal patterns, providing a robust framework for accurate firearm sound classification.

Details

1009240
Business indexing term
Title
Deep Spectrogram Learning for Gunshot Classification: A Comparative Study of CNN Architectures and Time-Frequency Representations
Author
Pafan, Doungpaisan 1   VIAFID ORCID Logo  ; Peerapol, Khunarsa 2   VIAFID ORCID Logo 

 Faculty of Industrial Technology and Management, King Mongkut’s University of Technology North Bangkok, Bangkok 10800, Thailand; [email protected] 
 Faculty of Science and Technology, Uttaradit Rajabhat University, Uttaradit 53000, Thailand 
Publication title
Volume
11
Issue
8
First page
281
Number of pages
27
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
e-ISSN
2313433X
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
 
 
Online publication date
2025-08-21
Milestone dates
2025-07-05 (Received); 2025-08-02 (Accepted)
Publication history
 
 
   First posting date
21 Aug 2025
ProQuest document ID
3244042760
Document URL
https://www.proquest.com/scholarly-journals/deep-spectrogram-learning-gunshot-classification/docview/3244042760/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-08-29
Database
ProQuest One Academic