© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Generating images from scene graphs is an important area of computer vision in which structured object relationships are used to create detailed visual representations. Although recent methods based on generative adversarial networks (GANs), transformers, and diffusion models have improved image quality, they still face challenges such as limited scalability, difficulty in generating complex scenes, and a lack of clear evaluation standards. Despite the variety of approaches proposed, there is still no unified way to compare their effectiveness, which makes it difficult to determine the best techniques for real-world applications. This review provides a detailed assessment of scene-graph-based image generation by organizing current methods into categories and examining their advantages and limitations. We also discuss the datasets used for training, the evaluation measures applied to assess model performance, and the key challenges that remain, such as ensuring consistency in scene structure, handling object interactions, and reducing computational costs. Finally, we outline future directions in this field, highlighting the need for more efficient, scalable, and semantically accurate models. This review serves as a reference for researchers and practitioners, helping them understand current trends and identify areas for further improvement in scene-graph-based image generation.

Details

Title
Advancements, Challenges, and Future Directions in Scene-Graph-Based Image Generation: A Comprehensive Review
Author
Chikwendu Ijeoma Amuche 1; Zhang, Xiaoling 1; Happy Nkanta Monday 2; Nneji, Grace Ugochi 2; Ukwuoma, Chiagoziem C 3; Chikwendu, Okechukwu Chinedum 4; Gu, Yeong Hyeon 5; Al-antari, Mugahed A 5

1 School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; [email protected]
2 Department of Computing, Oxford Brookes College, Chengdu University of Technology, Chengdu 610059, China; [email protected] (H.N.M.); [email protected] (G.U.N.); [email protected] (C.C.U.)
3 Department of Computing, Oxford Brookes College, Chengdu University of Technology, Chengdu 610059, China; [email protected] (H.N.M.); [email protected] (G.U.N.); [email protected] (C.C.U.); College of Nuclear Technology and Automation Engineering, Chengdu University of Technology, Chengdu 610059, China
4 Department of Biochemistry, Federal University of Technology Owerri, Ihiagwa, Owerri PMB 1526, Nigeria; [email protected]
5 Department of Artificial Intelligence and Data Science, College of AI Convergence, Daeyang AI Center, Sejong University, Seoul 05006, Republic of Korea
First page
1158
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
2079-9292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3181457755