Full Text

Turn on search term navigation

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Vision transformers (ViTs) are increasingly utilized for HSI classification due to their outstanding performance. However, ViTs encounter challenges in capturing global dependencies among objects of varying sizes, and fail to effectively exploit the spatial–spectral information inherent in HSI. In response to this limitation, we propose a novel solution: the multi-scale spatial–spectral transformer (MSST). Within the MSST framework, we introduce a spatial–spectral token generator (SSTG) and a token fusion self-attention (TFSA) module. Serving as the feature extractor for the MSST, the SSTG incorporates a dual-branch multi-dimensional convolutional structure, enabling the extraction of semantic characteristics that encompass spatial–spectral information from HSI and subsequently tokenizing them. TFSA is a multi-head attention module with the ability to encode attention to features across various scales. We integrated TFSA with cross-covariance attention (CCA) to construct the transformer encoder (TE) for the MSST. Utilizing this TE to perform attention modeling on tokens derived from the SSTG, the network effectively simulates global dependencies among multi-scale features in the data, concurrently making optimal use of spatial–spectral information in HSI. Finally, the output of the TE is fed into a linear mapping layer to obtain the classification results. Experiments conducted on three popular public datasets demonstrate that the MSST method achieved higher classification accuracy compared to state-of-the-art (SOTA) methods.

Details

Title
A Spatial–Spectral Transformer for Hyperspectral Image Classification Based on Global Dependencies of Multi-Scale Features
Author
Ma, Yunxuan 1 ; Yan, Lan 1 ; Xie, Yakun 2 ; Yu, Lanxin 3 ; Chen, Chen 1 ; Wu, Yusong 1 ; Dai, Xiaoai 1 

 College of Earth Sciences, Chengdu University of Technology, Chengdu 610059, China; [email protected] (Y.M.); [email protected] (C.C.); [email protected] (Y.W.); [email protected] (X.D.) 
 Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu 610097, China; [email protected] 
 School of Statistics, East China Normal University, Shanghai 200062, China; [email protected] 
First page
404
Publication year
2024
Publication date
2024
Publisher
MDPI AG
e-ISSN
20724292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2918797392
Copyright
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.