Abstract
In this paper, we propose MSA-ESRGAN, a super-resolution model designed to enhance the perceptual quality of images. The key innovation of our approach is a multi-scale attention U-Net discriminator, which differentiates more accurately between subject and background regions in an image. With this architecture, MSA-ESRGAN surpasses traditional methods and several state-of-the-art super-resolution models in Natural Image Quality Evaluator (NIQE) score as well as Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) across benchmark datasets including BSD100, Set5, Set14, Urban100, and OST300. Subjective evaluations further confirm the improved visual quality delivered by MSA-ESRGAN, particularly in texture preservation and overall image realism. To ensure a fair comparison with Real-ESRGAN, we initialized our generator with a pre-trained Real-ESRNET model and followed the same training setup: the model was trained on the DIV2K dataset using high-resolution image patches and the Adam optimizer, with an exponential moving average (EMA) of the weights for stability and performance. Across these benchmarks, MSA-ESRGAN consistently delivers superior perceptual quality, reflected in better NIQE, PSNR, and SSIM scores than competing methods, with gains in both objective and subjective measures of image quality. An ablation study further highlights the critical role of the multi-scale attention U-Net discriminator in the model's performance. These results underscore the effectiveness of MSA-ESRGAN in maintaining image naturalness and perceptual quality, providing a robust benchmark for blind super-resolution tasks.
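To make the architectural idea concrete, the sketch below shows attention-gated skip connections at multiple scales of a small U-Net-style discriminator that outputs per-pixel real/fake logits. This is a minimal, hypothetical PyTorch sketch only: the channel widths, depth, additive gating formulation (in the spirit of Attention U-Net), and the names AttentionGate, TinyAttentionUNetDiscriminator, base, and inter_ch are illustrative assumptions, not the exact MSA-ESRGAN discriminator described in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGate(nn.Module):
    """Additive attention gate: re-weights encoder skip features using the
    coarser decoder features as a gating signal (illustrative formulation)."""
    def __init__(self, skip_ch, gate_ch, inter_ch):
        super().__init__()
        self.theta = nn.Conv2d(skip_ch, inter_ch, kernel_size=1)
        self.phi = nn.Conv2d(gate_ch, inter_ch, kernel_size=1)
        self.psi = nn.Conv2d(inter_ch, 1, kernel_size=1)

    def forward(self, skip, gate):
        # Bring the gating signal to the skip-connection resolution.
        gate = F.interpolate(gate, size=skip.shape[-2:], mode="bilinear",
                             align_corners=False)
        attn = torch.sigmoid(self.psi(F.relu(self.theta(skip) + self.phi(gate))))
        return skip * attn  # spatial mask emphasizes subject regions

class TinyAttentionUNetDiscriminator(nn.Module):
    """Toy 3-scale U-Net discriminator with attention-gated skips.
    Channel sizes and depth are illustrative assumptions."""
    def __init__(self, in_ch=3, base=64):
        super().__init__()
        self.enc1 = nn.Conv2d(in_ch, base, 3, 1, 1)          # full resolution
        self.enc2 = nn.Conv2d(base, base * 2, 3, 2, 1)        # 1/2 resolution
        self.enc3 = nn.Conv2d(base * 2, base * 4, 3, 2, 1)    # 1/4 resolution
        self.gate2 = AttentionGate(base * 2, base * 4, base)
        self.gate1 = AttentionGate(base, base * 2, base)
        self.dec2 = nn.Conv2d(base * 4 + base * 2, base * 2, 3, 1, 1)
        self.dec1 = nn.Conv2d(base * 2 + base, base, 3, 1, 1)
        self.out = nn.Conv2d(base, 1, 3, 1, 1)  # per-pixel real/fake logits

    def forward(self, x):
        e1 = F.leaky_relu(self.enc1(x), 0.2)
        e2 = F.leaky_relu(self.enc2(e1), 0.2)
        e3 = F.leaky_relu(self.enc3(e2), 0.2)
        d2 = F.interpolate(e3, scale_factor=2, mode="bilinear", align_corners=False)
        d2 = F.leaky_relu(self.dec2(torch.cat([d2, self.gate2(e2, e3)], dim=1)), 0.2)
        d1 = F.interpolate(d2, scale_factor=2, mode="bilinear", align_corners=False)
        d1 = F.leaky_relu(self.dec1(torch.cat([d1, self.gate1(e1, d2)], dim=1)), 0.2)
        return self.out(d1)
```

For example, TinyAttentionUNetDiscriminator()(torch.randn(1, 3, 128, 128)) returns a 1×1×128×128 logit map; the gated skips let the real/fake decision weight subject regions differently from background, which is the intuition behind the multi-scale attention discriminator summarized in the abstract.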
Details
1 Yunnan University, School of Information Science and Engineering, Kunming, China (GRID:grid.440773.3) (ISNI:0000 0000 9342 2456)