Abstract

What are the main findings?

Color camera images can be realistically and semantically reconstructed from multimodal LiDAR data using a GAN-based model.

The fusion of multiple LiDAR modalities enhances reconstruction quality, and incorporating a segmentation-based loss further improves fidelity.

What is the implication of the main finding?

LiDAR can serve as a backup to cameras by reconstructing semantically meaningful visual information, enhancing system redundancy and safety in autonomous driving.

LiGenCam has the potential to perform data augmentation by generating virtual camera viewpoints using panoramic LiDAR data.

The automotive industry is advancing toward fully automated driving, where perception systems rely on complementary sensors such as LiDAR and cameras to interpret the vehicle’s surroundings. For Level 4 and higher vehicles, redundancy is vital to prevent safety-critical failures. One way to achieve this is by using data from one sensor type to support another. While much research has focused on reconstructing LiDAR point cloud data using camera images, limited work has been conducted on the reverse process—reconstructing image data from LiDAR. This paper proposes a deep learning model, named LiDAR Generative Camera (LiGenCam), to fill this gap. The model reconstructs camera images by utilizing multimodal LiDAR data, including reflectance, ambient light, and range information. LiGenCam is developed based on the Generative Adversarial Network framework, incorporating pixel-wise loss and semantic segmentation loss to guide reconstruction, ensuring both pixel-level similarity and semantic coherence. Experiments on the DurLAR dataset demonstrate that multimodal LiDAR data enhances the realism and semantic consistency of reconstructed images, and adding segmentation loss further improves semantic consistency. Ablation studies confirm these findings.
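To make the training objective concrete, the following is a minimal PyTorch-style sketch of how the three loss terms named in the abstract (adversarial, pixel-wise, and segmentation-based) could be combined for the generator. The module interfaces, the conditional discriminator that sees both the LiDAR input and an RGB image, the frozen stand-in segmenter, and the weighting factors lambda_pix and lambda_seg are illustrative assumptions, not the authors' implementation.

# Sketch only: combining adversarial, pixel-wise, and segmentation losses
# for a conditional GAN generator, in the spirit of LiGenCam. Not the
# authors' code; network definitions and weights are placeholders.
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()   # adversarial loss on discriminator logits
l1 = nn.L1Loss()               # pixel-wise reconstruction loss
ce = nn.CrossEntropyLoss()     # semantic consistency via per-pixel classes

def generator_loss(discriminator, segmenter, lidar, real_rgb, seg_labels,
                   fake_rgb, lambda_pix=100.0, lambda_seg=10.0):
    """Total generator loss: adversarial + weighted pixel + weighted segmentation.

    lidar      : multimodal LiDAR input (reflectance, ambient, range), (B, 3, H, W)
    real_rgb   : ground-truth camera image, (B, 3, H, W)
    seg_labels : per-pixel class indices for the ground-truth image, (B, H, W)
    fake_rgb   : generator output for `lidar`
    """
    # Adversarial term: the discriminator should label the fake image as real.
    d_logits_fake = discriminator(lidar, fake_rgb)
    loss_adv = bce(d_logits_fake, torch.ones_like(d_logits_fake))

    # Pixel-wise term: L1 distance to the real camera image.
    loss_pix = l1(fake_rgb, real_rgb)

    # Segmentation term: a (frozen) segmentation network should assign the same
    # per-pixel classes to the reconstruction as to the ground truth.
    seg_logits_fake = segmenter(fake_rgb)            # (B, num_classes, H, W)
    loss_seg = ce(seg_logits_fake, seg_labels)

    return loss_adv + lambda_pix * loss_pix + lambda_seg * loss_seg

if __name__ == "__main__":
    # Toy shapes and stand-in networks, just to show the data flow.
    B, C_cls, H, W = 2, 19, 64, 64
    generator = nn.Conv2d(3, 3, kernel_size=3, padding=1)      # stand-in generator
    segmenter = nn.Conv2d(3, C_cls, kernel_size=3, padding=1)  # stand-in segmenter

    class ToyDiscriminator(nn.Module):
        # Conditional discriminator: sees the LiDAR input and an RGB image.
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(6, 1, kernel_size=3, padding=1)
        def forward(self, lidar, rgb):
            return self.conv(torch.cat([lidar, rgb], dim=1))

    discriminator = ToyDiscriminator()
    lidar = torch.rand(B, 3, H, W)        # reflectance / ambient / range channels
    real_rgb = torch.rand(B, 3, H, W)
    seg_labels = torch.randint(0, C_cls, (B, H, W))
    fake_rgb = generator(lidar)

    loss = generator_loss(discriminator, segmenter, lidar, real_rgb, seg_labels, fake_rgb)
    print(float(loss))

In a full implementation of this kind, the generator would typically be an encoder-decoder (e.g., U-Net-style) network, the discriminator a patch-based conditional discriminator, and the segmenter a pretrained semantic segmentation model kept frozen during GAN training; the sketch above only shows how the three loss terms described in the abstract are summed.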

Details

Title
LiGenCam: Reconstruction of Color Camera Images from Multimodal LiDAR Data for Autonomous Driving
Author
Xu Minghao 1; Gu Yanlei 2; Goncharenko Igor 3; Kamijo Shunsuke 4

1 Graduate School of Interdisciplinary Information Studies, The University of Tokyo, Tokyo 113-0033, Japan; [email protected]
2 Graduate School of Advanced Science and Engineering, Hiroshima University, Hiroshima 739-8527, Japan
3 College of Information Science and Engineering, Ritsumeikan University, Osaka 567-8570, Japan; [email protected]
4 Interfaculty Initiative in Information Studies, The University of Tokyo, Tokyo 113-0033, Japan; [email protected]
First page
4295
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
1424-8220
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3233262011
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.