© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

In this study, we introduce open-vocabulary domain generalization (OVDG) and investigate it in the context of semantic segmentation. OVDG is more difficult than conventional domain generalization but also more practical. It jointly considers (1) recognizing both base and novel classes and (2) generalizing to unseen domains. In OVDG, only the labels of base classes and the images from source domains are available for training; the trained model is then applied directly to images that contain novel classes and come from target domains. We propose FreeMix, a dual-branch module that addresses the OVDG task in a universal framework and consists of a base segmentation branch (BSB) and an entity segmentation branch (ESB). First, the entity mask is introduced as a novel concept for segmentation generalization, and semantic logits are learned for both the base mask and the entity mask, which enhances the diversity and completeness of masks for base and novel classes. Second, FreeMix uses self-supervised pretraining on large-scale remote-sensing data (RS_SSL) to extract domain-agnostic visual features for decoding masks and semantic logits. Third, a training strategy called dataset-aware sampling (DAS) is introduced for multi-source domain learning to improve overall performance. Together, RS_SSL, ESB, and DAS significantly improve the model's generalization ability at both the class level and the domain level. Experiments demonstrate that our method achieves state-of-the-art OVDG results on several remote-sensing semantic-segmentation datasets, including Potsdam, GID5, DeepGlobe, and URUR.

Details

Title
FreeMix: Open-Vocabulary Domain Generalization of Remote-Sensing Images for Semantic Segmentation
Author
Wu, Jingyi 1; Shi Jingye 2; Zhao Zeyong 1; Liu, Ziyang 1; Ruicong, Zhi 1

1 School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; [email protected] (J.W.); [email protected] (Z.Z.); [email protected] (Z.L.); Beijing Key Laboratory of Knowledge Engineering for Material Science, Beijing 100083, China
2 Beijing Key Laboratory of Security and Privacy in Intelligent Transportation, Beijing Jiaotong University, Beijing 100044, China; [email protected]
First page
1357
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
20724292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3194640157