Introduction
Large language models, equipped with powerful natural language processing capabilities, have demonstrated impressive applications across diverse fields, highlighting their potential as valuable assistive tools (Oermann and Kondziolka, 2023). By transforming text descriptions into visual outputs, these models mark a significant advance in AI’s capacity to interpret and visualize abstract concepts (Driessen et al. 2024, Jang et al. 2024, Riemer and Peter 2024, Vemprala et al. 2024). Since their inception, large language models have been applied extensively in areas such as education, healthcare, and software development (Xue et al. 2023, Hu et al. 2024, Vemprala et al. 2024). Research shows that large language models can generate illustrative images from text descriptions, aiding human creativity in art and design (Lu et al. 2023). For example, DALL-E, built on a transformer architecture, generates highly detailed images that demonstrate AI’s creative potential in design (Ali et al. 2024), while Midjourney enables users to explore imaginative visual scenes (Javan and Mostaghni, 2024). Building on these developments, ChatGPT-4o incorporates multimodal functionality that lets it generate images of futuristic urban landscapes from specific text prompts, an ability with particular potential in urban planning and design (Fu, 2024).
ChatGPT-4o’s image generation capability relies on an extensive database and robust computational capacity, enabling it to produce complex future city images from textual instructions. The model processes large volumes of textual and visual data, and its database covers a broad range of urban design elements (Peng et al. 2023, Caprotti et al. 2024). The quality of this data directly affects the model’s ability to generate detailed, accurate images (Driessen et al. 2024). The generation process, supported by deep learning algorithms, involves multi-layered language comprehension, image analysis, and cross-modal integration (Wang et al. 2024). Combined with substantial computational power, these algorithms let ChatGPT-4o rapidly process and integrate detailed data, converting textual instructions into concrete visualizations of future cities (Cugurullo et al. 2024). This functionality not only enhances creativity in urban planning but also provides visual representations of future cities that support human aesthetic evaluation and design feedback.
Although artificial intelligence has made significant strides in technical fields like data analysis and predictive modelling, its potential as a creative tool in urban planning remains underexplored. Current research primarily focuses on AI’s strengths in data processing and efficiency optimization (Oermann...