© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Two challenges in computer vision (CV) for face detection are the difficulty of acquiring data in the target domain and the degradation of image quality. In low-light conditions in particular, poor visibility makes images difficult to label, so detectors trained under well-lit conditions perform worse in low-light environments. Conventional image enhancement and object detection techniques cannot resolve the inherent difficulty of collecting and labeling low-light images. The Dark-Illuminated Network with Contrastive Language–Image Pretraining (CLIP) and Self-Supervised Vision Transformer (Dino), abbreviated as DAl-CLIP-Dino, is proposed to address the degradation of object detection performance in low-light environments and to achieve zero-shot day–night domain adaptation. Specifically, two components are employed to achieve illumination invariance and enhance model performance. The first is an advanced reflectance representation learning module, which leverages Retinex decomposition to extract reflectance and illumination features from both low-light and well-lit images. The second is an interchange–redecomposition coherence process, which performs a second decomposition on the images reconstructed after the exchange, generating a second round of reflectance and illumination predictions whose consistency is validated using a redecomposition consistency loss. CLIP (the ViT-based image encoder) and Dino are integrated for feature extraction, improving performance under extreme lighting conditions and enhancing the model's generalization capability. Our model achieves a mean average precision (mAP) of 29.6% for face detection on the DARK FACE dataset, outperforming other models in zero-shot domain adaptation for face detection.
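
The interchange and redecomposition idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The abstract refers to a learned decomposition, whereas the `decompose` function below is a hypothetical hand-crafted stand-in (illumination taken as the per-pixel channel maximum), and the L1 form of the loss and all function names are assumptions made for illustration only. The sketch relies on the Retinex model, in which an image is the element-wise product of reflectance and illumination.

```python
import numpy as np

EPS = 1e-6  # avoids division by zero in dark pixels


def decompose(image):
    """Toy Retinex split (assumption: stands in for a learned decomposition).

    Illumination is the per-pixel maximum over colour channels;
    reflectance is the image divided by that illumination.
    """
    illumination = image.max(axis=-1, keepdims=True)
    reflectance = image / (illumination + EPS)
    return reflectance, illumination


def redecomposition_consistency_loss(low, well):
    """Hypothetical L1 consistency loss between first- and second-round predictions."""
    # First round: decompose both the low-light and the well-lit image.
    r_low, l_low = decompose(low)
    r_well, l_well = decompose(well)

    # Interchange: rebuild each image with the other image's illumination.
    low_relit = r_low * l_well    # low-light content under well-lit illumination
    well_dimmed = r_well * l_low  # well-lit content under low-light illumination

    # Second round: decompose the reconstructed images again.
    r_low2, l_well2 = decompose(low_relit)
    r_well2, l_low2 = decompose(well_dimmed)

    # Second-round predictions should agree with the first-round ones.
    return (
        np.abs(r_low2 - r_low).mean()
        + np.abs(r_well2 - r_well).mean()
        + np.abs(l_well2 - l_well).mean()
        + np.abs(l_low2 - l_low).mean()
    )


# Synthetic pair: the "low-light" image is a uniformly dimmed copy of the other.
rng = np.random.default_rng(0)
well = rng.uniform(0.2, 1.0, size=(8, 8, 3))
low = 0.1 * well
loss = redecomposition_consistency_loss(low, well)
print(f"consistency loss: {loss:.2e}")
```

Because the toy decomposition is nearly exact under the Retinex model, the loss on this synthetic pair is close to zero. With a learned decomposition, a term of this kind would act as a training signal that penalizes reflectance and illumination predictions that change after the swap.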

Details

Title
Zero-Shot Day–Night Domain Adaptation for Face Detection Based on DAl-CLIP-Dino
Author
Sun, Huadong; Liu, Yinghui; Chen, Ziyang; Zhang, Pengyi
First page
143
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
20799292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3153800247