© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Real-time remote sensing segmentation technology is crucial for unmanned aerial vehicles (UAVs) in battlefield surveillance, land characterization, and earthquake disaster assessment, and can significantly enhance the value of UAVs in military and civilian applications. To realize this potential, it is essential to develop real-time semantic segmentation methods that can run on resource-limited platforms such as edge devices. Most mainstream real-time semantic segmentation methods rely on convolutional neural networks (CNNs) or transformers. However, CNNs cannot effectively capture long-range dependencies, while transformers have high computational complexity. This paper proposes a novel Mamba architecture for real-time segmentation of remote sensing images, named RTMamba. Specifically, the backbone uses a Visual State-Space (VSS) block to extract deep features, capturing long-range contextual information while maintaining linear computational complexity. Additionally, a novel Inverted Triangle Pyramid Pooling (ITP) module is incorporated into the decoder. The ITP module effectively filters redundant feature information and enhances the perception of objects and their boundaries in remote sensing images. Extensive experiments were conducted on three challenging aerial remote sensing segmentation benchmarks: Vaihingen, Potsdam, and LoveDA. The results show that RTMamba is competitive with state-of-the-art CNN and transformer methods in both segmentation accuracy and inference speed. To further validate the deployment potential of the model on resource-limited embedded devices, such as those carried by UAVs, we conducted tests on the Jetson AGX Orin edge device. The experimental results demonstrate that RTMamba achieves real-time segmentation performance on this device.
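
The abstract's contrast between the quadratic cost of transformers and the linear cost of a state-space backbone can be illustrated with a toy example. The sketch below is not the VSS block or Mamba's selective scan; it is a minimal scalar linear state-space recurrence with fixed, hypothetical parameters (`a`, `b`, `c`), and the function names `ssm_scan` and `pairwise_interactions` are assumptions introduced only for illustration. It shows that a single pass over the sequence is enough for an early input to influence a much later output.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar linear state-space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t,   y_t = c * h_t

    The loop visits each element once, so the cost grows linearly
    with len(xs). The parameters a, b, c are illustrative constants.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # update the hidden state with the new input
        ys.append(c * h)    # read the output from the hidden state
    return ys


def pairwise_interactions(n):
    """Number of token pairs a full self-attention layer scores for n tokens."""
    return n * n


# An impulse at position 0 still contributes to the output 50 steps later,
# carried forward by the hidden state rather than by pairwise comparison.
impulse = [1.0] + [0.0] * 50
ys = ssm_scan(impulse)
print(ys[0], ys[50])

# For comparison: recurrence steps versus attention score pairs at 1024 tokens.
print(1024, pairwise_interactions(1024))
```

In this toy setting the recurrence needs 1024 steps for 1024 tokens, while full self-attention scores 1024 × 1024 token pairs, which is the scaling gap the abstract refers to.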

Details

Title
A Novel Mamba Architecture with a Semantic Transformer for Efficient Real-Time Remote Sensing Semantic Segmentation
Author
Ding, Hao 1; Xia, Bo 1; Liu, Weilin 1; Zhang, Zekai 2; Zhang, Jinglin 3; Wang, Xing 1; Sen, Xu 4

1 School of Information Science and Engineering, Linyi University, Linyi 276000, China; [email protected] (H.D.); [email protected] (B.X.); [email protected] (W.L.); [email protected] (X.W.)
2 Department of Control Science and Engineering, Shandong University, Jinan 250061, China; [email protected]
3 School of Information Science and Engineering, Linyi University, Linyi 276000, China; [email protected] (H.D.); [email protected] (B.X.); [email protected] (W.L.); [email protected] (X.W.); Department of Control Science and Engineering, Shandong University, Jinan 250061, China; [email protected]; Department of Information Science and Engineering, Shandong Research Institute of Industrial Technology, Jinan 250100, China
4 School of Information Engineering, Yancheng Institute of Technology, Yancheng 224051, China; [email protected]
First page
2620
Publication year
2024
Publication date
2024
Publisher
MDPI AG
e-ISSN
20724292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3085010024