© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Rapid advancements in remote sensing (RS) imaging technology have heightened the demand for the precise and efficient interpretation of large-scale, high-resolution RS images. Although segmentation algorithms based on convolutional neural networks (CNNs) or Transformers have achieved significant performance improvements, the trade-off between segmentation precision and computational complexity remains a key limitation for practical applications. Therefore, this paper proposes CVMH-UNet—a hybrid semantic segmentation network that integrates the Vision Mamba (VMamba) framework with multi-scale feature fusion—to achieve high-precision and relatively efficient RS image segmentation. CVMH-UNet comprises the following two core modules: the hybrid visual state space block (HVSSBlock) and the multi-frequency multi-scale feature fusion block (MFMSBlock). The HVSSBlock integrates convolutional branches to enhance local feature extraction while employing a cross 2D scanning method (CS2D) to capture global information from multiple directions, enabling the synergistic modeling of global and local features. The MFMSBlock introduces multi-frequency information via 2D Discrete Cosine Transform (2D DCT) and extracts multi-scale local details through point-wise convolution, thereby optimizing refined feature fusion in skip connections between the encoder and decoder. Experimental results on benchmark RS datasets demonstrate that CVMH-UNet achieves state-of-the-art segmentation accuracy with optimal computational efficiency, surpassing existing advanced methods.
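As an illustration of the multi-frequency idea mentioned in the abstract, the following minimal sketch computes an orthonormal 2D DCT of a feature map with NumPy and reads a few frequency coefficients per channel. It is not the authors' MFMSBlock: the function names (`dct_matrix`, `dct2`, `idct2`, `frequency_descriptors`) and the choice of frequency indices are assumptions made only for this example.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of shape (n, n)."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    mat = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    mat[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale
    return mat


def dct2(x: np.ndarray) -> np.ndarray:
    """2D DCT over the last two axes of x (shape (..., H, W))."""
    h, w = x.shape[-2:]
    return dct_matrix(h) @ x @ dct_matrix(w).T


def idct2(c: np.ndarray) -> np.ndarray:
    """Inverse of dct2; the basis is orthonormal, so transpose inverts."""
    h, w = c.shape[-2:]
    return dct_matrix(h).T @ c @ dct_matrix(w)


def frequency_descriptors(feat: np.ndarray, freqs) -> np.ndarray:
    """Hypothetical helper: one scalar per channel and per chosen (u, v).

    feat has shape (C, H, W); the result has shape (C, len(freqs)).
    The pair (0, 0) is proportional to global average pooling; other
    pairs respond to higher-frequency content in the channel.
    """
    coeffs = dct2(feat)
    return np.stack([coeffs[:, u, v] for (u, v) in freqs], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 16, 16))  # toy feature map (C, H, W)
    desc = frequency_descriptors(feat, [(0, 0), (0, 1), (1, 0)])
    print(desc.shape)
```

With this normalization, the (0, 0) coefficient of each channel equals the channel mean multiplied by sqrt(H * W), which is why a DCT-based descriptor can be read as a generalization of global average pooling to several frequencies.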

Details

Title
Remote Sensing Image Segmentation Using Vision Mamba and Multi-Scale Multi-Frequency Feature Fusion
Author
Cao, Yice 1; Liu, Chenchen 1; Wu, Zhenhua 1; Zhang, Lei 2; Yang, Lixia 1

1 School of Electronics and Information Engineering, Anhui University, Hefei 230601, China; [email protected] (Y.C.); [email protected] (C.L.); [email protected] (L.Y.)
2 School of Electronics and Communication Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China; [email protected]
First page
1390
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
20724292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3194640128