Abstract

While raw images possess distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels, they are not widely adopted by general users due to their substantial storage requirements. Very recent studies propose to compress raw images by designing sampling masks within the pixel space of the raw image. However, these approaches leave room for more effective image representations and more compact metadata. In this work, we propose a novel framework that learns a compact representation in the latent space, serving as metadata, in an end-to-end manner. We analyze how the raw image reconstruction task differs intrinsically from lossy image compression, a difference caused by the rich information available from the sRGB image. Based on this analysis, we propose a novel backbone design with asymmetric and hybrid spatial feature resolutions, which significantly improves the rate-distortion performance. In addition, we propose a novel sRGB-guided context model, which better predicts the order masks of encoding/decoding based on both the sRGB image and the masks of already processed features. Benefiting from the better modeling of the correlation between order masks, the already processed information can be better utilized. Moreover, a novel sRGB-guided adaptive quantization precision strategy, which dynamically assigns varying levels of quantization precision to different regions, further enhances the representation ability of the model. Finally, based on the iterative properties of the proposed context model, we propose a novel strategy to achieve variable bit rates using a single model. This strategy allows for the continuous coverage of a wide range of bit rates. We demonstrate how our raw image compression scheme effectively allocates more bits to image regions that hold greater global importance.
Extensive experimental results validate the superior performance of the proposed method, achieving high-quality raw image reconstruction with a smaller metadata size, compared with existing SOTA methods.

Details

Title
Beyond Learned Metadata-Based Raw Image Reconstruction
Author
Wang, Yufei 1; Yu, Yi 1; Yang, Wenhan 2; Guo, Lanqing 1; Chau, Lap-Pui 3; Kot, Alex C. 1; Wen, Bihan 1

1 Nanyang Technological University, ROSE Lab, Singapore, Singapore (GRID:grid.59025.3b) (ISNI:0000 0001 2224 0361)
2 PengCheng Laboratory, Shenzhen, China (GRID:grid.508161.b)
3 The Hong Kong Polytechnic University, Department of Electronic and Information Engineering, Hong Kong, China (GRID:grid.16890.36) (ISNI:0000 0004 1764 6123)
Publication title
International Journal of Computer Vision
Volume
132
Issue
12
Pages
5514-5533
Publication year
2024
Publication date
Dec 2024
Publisher
Springer Nature B.V.
Place of publication
New York
Country of publication
Netherlands
Publication subject
ISSN
09205691
e-ISSN
15731405
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2024-06-17
Milestone dates
2024-05-31 (Registration); 2023-06-03 (Received); 2024-05-31 (Accepted)
First posting date
17 Jun 2024
ProQuest document ID
3128897467
Document URL
https://www.proquest.com/scholarly-journals/beyond-learned-metadata-based-raw-image/docview/3128897467/se-2?accountid=208611
Copyright
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Last updated
2024-12-11
Database
ProQuest One Academic