Correspondence to Mr Qiwei Wilton Sun; [email protected]
Introduction
Issued in December 2020, Executive Order 13960 established the US guidelines for the Trustworthy Use of Artificial Intelligence (AI) within federal agencies. Notably, this order does not extend to AI used in commercial applications, including new generative AI chatbots such as ChatGPT that signal an imminent AI-driven paradigm shift across many professions. Since their inception, the prevailing focus of generative AI chatbots in healthcare has centred on their role as a clinical decision support tool. However, the potential applications of generative AI in creating clinical documentation, and the ethical considerations of using these tools in this context, have been underexplored. Generative AI-assisted clinical documentation herein refers to the process of summarising patient interactions into encounter notes or handoff reports, drafting discharge or after-visit summaries, and generating supporting documents for processes such as prior authorisations and related paperwork, but without independently initiating clinical decisions. Compared with generative AI-assisted clinical decision support, generative AI-assisted clinical documentation may be perceived as a more straightforward, less risky and readily actionable solution in practice. Yet the use of autonomous systems that lack ethical judgement to replicate human-like communication in sensitive healthcare tasks raises new ethical challenges and compounds old ones that may be overlooked. This article focuses on the use of generative AI chatbots for clinical documentation and offers a series of ethical considerations concerning health equity, clinician–patient relationships, and algorithmic transparency and integrity, with recommendations to mitigate these concerns.
Generative AI for clinical documentation and communication
Generative AI models are a subset of AI that can generate novel content, including text, images and other media not explicitly present in their training data. Text-based chatbots such as ChatGPT are user interfaces driven by language models—such as GPT-3, GPT-4 and later iterations—that employ algorithms to capture features and relationships in training data, enabling the models to generate a wide range of content across various domains based on user prompts. A meta-analysis of studies examining the performance of GPT-3, GPT-3.5 and GPT-4 published between January and May 2023 estimated an integrated accuracy rate of approximately 56% on medically focused multiple-choice questions, with higher performance in internal medicine than in surgical fields, but no differences found across model versions.1 However, its capabilities were notable for inconsistent accuracy,...





