ChatGPT (also known as Chat Generative Pretrained Transformer) is a popular artificial intelligence (AI) tool developed by OpenAI [1]. It was first launched in November 2022 based on OpenAI’s GPT-3.5 [2], followed by a second release in March 2023 based on GPT-4 [3]. Two months after its first release, the number of monthly active users exceeded 100 million, making ChatGPT the fastest-growing consumer application at that time [4]. Technically, ChatGPT is a large language model–based chatbot that performs natural language processing tasks. Readers familiar with deep learning will recognize from the name that much of ChatGPT’s power is attributable to the attention-based Transformer model introduced by a group of Google researchers in 2017 [5]; however, even users who are new to AI readily accept ChatGPT because its user interface is straightforward and hides the complex technical details. More importantly, most of the time it can answer a wide variety of questions much as a knowledgeable human being would.
Training ChatGPT to achieve this versatility and power is not cheap. According to various sources, OpenAI originally used ~40 GB of text data to train the early GPT model with 8 NVIDIA V100 GPUs and 256 GB of RAM. To train GPT-3, which laid the foundation for ChatGPT, 45 TB of compressed plain text from the 2016-2019 Common Crawl data set [6] was used. The data set now used for training ChatGPT reportedly consists of more than 145 million dialogues scraped from various social media platforms and online knowledge bases (eg, Twitter, Reddit, and Wikipedia). Cleaning such text data is also expensive because spam, offensive language, low-quality content, and so on must be removed before the data can be fed to ChatGPT. The typical hardware configuration for training ChatGPT may include 64 or more NVLink-connected NVIDIA V100 GPUs with 32 GB of memory each, and each round of training may take 2...