Abstract

Text-to-image generation is a challenging task. Although diffusion models can generate high-quality images of complex scenes, the results sometimes lack realism. Images generated from different texts with the same semantics also often vary widely, and fine details are sometimes rendered insufficiently. Generative adversarial networks, by contrast, can generate realistic, content-consistent images that agree with the text descriptions. In this paper, we argue that generating images that are more consistent with the text descriptions is more important than generating higher-quality images. We therefore propose the pretrained model-based generative adversarial network (PMGAN), which uses multiple pre-trained models in both the generator and the discriminator. Specifically, in the generator, the deep attentional multimodal similarity model (DAMSM) text encoder extracts word and sentence embeddings from the input text, and the contrastive language-image pre-training (CLIP) text encoder extracts initial image features from the input text. In the discriminator, a pre-trained CLIP image encoder extracts image features from the input image. Because the CLIP encoders map text and images into a common semantic space, they help the model generate high-quality images. Experimental results show that, compared with state-of-the-art methods, PMGAN achieves better scores on both inception score and Fréchet inception distance and produces higher-quality images while maintaining greater consistency with the text descriptions.

Details

Title
PMGAN: pretrained model-based generative adversarial network for text-to-image generation
Publication title
Volume
41
Issue
1
Pages
303-314
Publication year
2025
Publication date
Jan 2025
Publisher
Springer Nature B.V.
Place of publication
Heidelberg
Country of publication
Netherlands
Publication subject
ISSN
01782789
e-ISSN
14322315
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2024-03-28
Milestone dates
2024-02-22 (Registration); 2024-02-20 (Accepted)
First posting date
28 Mar 2024
ProQuest document ID
3159547532
Document URL
https://www.proquest.com/scholarly-journals/pmgan-pretrained-model-based-generative/docview/3159547532/se-2?accountid=208611
Copyright
Copyright Springer Nature B.V. Jan 2025
Last updated
2025-01-25
Database
ProQuest One Academic