Abstract

What are the main findings?

Our proposed UniFusOD method integrates infrared-visible image fusion and object detection into a unified, end-to-end framework, achieving superior performance across multiple tasks.

The introduction of the Fine-Grained Region Attention (FRA) module and UnityGrad optimization significantly enhances the model’s ability to handle multi-scale features and resolves gradient conflicts, improving both fusion and detection outcomes.

What are the implications of the main findings?

The unified optimization approach not only improves image fusion quality but also enhances downstream task performance, particularly in detecting rotated and small objects.

This approach demonstrates significant robustness across various datasets, offering a promising solution for multimodal perception tasks in remote sensing and autonomous driving.

Infrared-visible image fusion and object detection are crucial components in remote sensing applications, each offering unique advantages. Recent research has increasingly sought to combine these tasks to enhance object detection performance. However, integrating them presents several challenges, primarily due to two overlooked issues: (i) existing infrared-visible image fusion methods often fail to adequately focus on fine-grained or dense information, and (ii) while joint optimization methods can improve fusion quality and downstream task performance, their multi-stage training processes often reduce efficiency and limit the network’s global optimization capability. To address these challenges, we propose UniFusOD, an efficient end-to-end framework that simultaneously optimizes infrared-visible image fusion and object detection. The method integrates Fine-Grained Region Attention (FRA), which applies region-specific attention at different granularities and enhances the model’s ability to capture complex information. Furthermore, UnityGrad is introduced to balance the gradient conflicts between the fusion and detection tasks, stabilizing the optimization process. Extensive experiments demonstrate the superiority and robustness of our approach. UniFusOD not only achieves excellent results in image fusion but also significantly improves object detection performance. The method is robust across tasks, improving mAP50 over state-of-the-art methods by 0.8 on the DroneVehicle dataset for rotated object detection and by 1.9 on the M3FD dataset for horizontal object detection.
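
The abstract does not specify how UnityGrad balances the two tasks' gradients. As a rough illustration of what a "gradient conflict" between a fusion loss and a detection loss means, the sketch below uses a PCGrad-style projection, which is a different, generic technique and not the authors' method. The function names and the toy two-dimensional gradients are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_out_conflict(g: Vector, other: Vector) -> Vector:
    """Return g with its component that opposes `other` removed.

    Two task gradients conflict when their inner product is negative.
    This is a PCGrad-style projection, used here only as a stand-in;
    it is NOT the UnityGrad rule, which the abstract does not describe.
    """
    d = dot(g, other)
    if d >= 0.0:
        # No conflict (also covers a zero `other`): leave g unchanged.
        return list(g)
    scale = d / dot(other, other)
    return [x - scale * y for x, y in zip(g, other)]


def combine_gradients(g_fusion: Vector, g_detection: Vector) -> Vector:
    """Sum the two task gradients after removing mutual conflicts."""
    f = project_out_conflict(g_fusion, g_detection)
    d = project_out_conflict(g_detection, g_fusion)
    return [x + y for x, y in zip(f, d)]


# Hypothetical toy gradients for a shared two-parameter layer.
g_fusion = [1.0, 0.0]
g_detection = [-1.0, 1.0]
combined = combine_gradients(g_fusion, g_detection)
```

In this toy case the raw gradients have a negative inner product, while the combined update has a non-negative inner product with each of them, so a step along it does not directly oppose either task.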

Details

Title
Infrared-Visible Image Fusion Meets Object Detection: Towards Unified Optimization for Multimodal Perception
Author
Xiantai, Xiang 1; Zhou Guangyao 1; Niu, Ben 2; Pan Zongxu 3; Huang, Lijia 1; Li Wenshuai 1; Wen Zixiao 1; Qi Jiamin 1; Gao Wanxin 4

Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China; [email protected] (X.X.); [email protected] (Z.W.); School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of Target Cognition and Application Technology (TCAT), Beijing 100190, China
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China; [email protected] (X.X.); [email protected] (Z.W.); Key Laboratory of Target Cognition and Application Technology (TCAT), Beijing 100190, China
School of Software Engineering, Xi’an Jiaotong University, Xi’an 710049, China; [email protected]
School of Automation, Beijing Institute of Technology, Beijing 100081, China
Publication title
Volume
17
Issue
21
First page
3637
Number of pages
27
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
Publication subject
e-ISSN
2072-4292
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-11-04
Milestone dates
2025-10-10 (Received); 2025-11-03 (Accepted)
First posting date
2025-11-04
ProQuest document ID
3271544926
Document URL
https://www.proquest.com/scholarly-journals/infrared-visible-image-fusion-meets-object/docview/3271544926/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-11-13
Database
ProQuest One Academic