Artificial intelligence (AI) has been used in various applications for decades. When the San Francisco-based company OpenAI introduced ChatGPT in November 2022, advanced AI became accessible to anyone with a computer or smartphone. This accessibility created a rush of new AI users across many professions and industries, including health professions education [1]. Adoption was rapid: ChatGPT became the fastest application to reach 1 million users, hitting that mark in just 5 days [2].
AI, such as ChatGPT, can impact health professions education, including simulation. However, some issues should come with big, flashing warning signs.
On the plus side, AI can streamline simulation development. It can write objectives, summarize a case flow, create medication dosing recommendations or generate a debriefing outline [3]. It can also take an existing scenario and quickly change it to a new setting or a different learner group, adapting the language to meet the needs of the new situation [4]. What once took hours may now be accomplished in minutes.
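To make this concrete, the sketch below shows how a simulation educator might ask a generative AI service to adapt an existing scenario for a new learner group and setting. It is a minimal illustration only, assuming the openai Python package (v1.x) and an API key in the environment; the model name, prompt wording and scenario text are placeholders, and any output would still need expert review before use.

```python
# Minimal sketch: asking a chat model to adapt a simulation scenario.
# Assumes the `openai` package (v1.x) and OPENAI_API_KEY in the environment.
# The model name, prompt and scenario text are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

existing_scenario = "Adult septic shock scenario written for senior medical students ..."

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any available chat model
    messages=[
        {"role": "system",
         "content": "You are helping a simulation educator adapt scenarios."},
        {"role": "user",
         "content": (
             "Rewrite the following simulation scenario for first-year nursing "
             "students in an outpatient clinic setting, and add three learning "
             "objectives and a short debriefing outline:\n\n" + existing_scenario
         )},
    ],
)

print(response.choices[0].message.content)  # draft only: verify before use
```

Even in this simple use, the output is a first draft; as discussed below, its content must be checked by someone with subject-matter expertise.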
Nevertheless, there are issues. Chief among these is that AI tools such as ChatGPT make mistakes. At their worst, those mistakes are not trivial, minor errors; they are fabrications passed off as truths by the program. Called ‘hallucinations’, these confabulations are convincing [5]. Of course, AI is not human; however, the wording of these incorrect responses conveys a confidence that is very believable and can deceive the unsuspecting reader. Other professions have already started seeing the impact of these hallucinations. This past June, a New York judge fined two lawyers for referencing cases that did not exist [6]. The lawyers had used ChatGPT to help build their legal citations in a personal injury case. After the filing was submitted, the judge discovered that some of the cited cases were not real. They were AI hallucinations.
Understanding how this happens is complicated. We expect computers to be factually driven machines: when we enter 2 plus 2 in a calculator, we do not expect to see a result other than 4. However, generative AI programs such as ChatGPT and Google’s Bard generate their results differently [7]. As a group, these programs are classified as large language models. They draw on the seemingly endless amount of information available on the Internet, whether true or not, and compile answers that blend all these sources into a single response. The key difference between generative AI and traditional AI is that traditional AI reports data within a rules-based structure, while generative AI takes this data and generates new viewpoints and predictions.
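A toy sketch may help illustrate the distinction (the data below are invented purely for illustration): a rules-based program retrieves a stored answer or reports nothing, whereas a language model predicts the next word from probabilities learned over mixed, uncurated text, so it produces fluent output whether or not a correct answer exists.

```python
# Toy illustration only; the table and probabilities are hypothetical.
import random

# Traditional, rules-based AI: the answer comes straight from a curated table.
dose_table = {"paracetamol adult oral": "500-1000 mg every 4-6 hours"}

def rules_based_answer(query: str) -> str:
    return dose_table.get(query, "No entry found")

# Generative AI (greatly simplified): the next word is sampled from a
# probability distribution learned from mixed internet text, true or not.
next_word_probs = {"every": 0.6, "twice": 0.3, "hourly": 0.1}

def generative_next_word() -> str:
    words, probs = zip(*next_word_probs.items())
    return random.choices(words, weights=probs)[0]

print(rules_based_answer("paracetamol adult oral"))  # fixed, verifiable output
print("500-1000 mg", generative_next_word(), "...")  # fluent, but not guaranteed correct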
There are limitations to what these programs can access. Items behind paywalls or subscription-based sites are not available to them. The programs then rely on secondary sources that reference the restricted material, making any conclusions the AI program draws dependent on the interpretations of those secondary sources. ChatGPT is a generative pretrained transformer (GPT) trained on source material; in ChatGPT’s case, that material predates September 2021, so more recent sources are not reflected in its responses. It is also important to note that the freely available version of ChatGPT (version 3.5) differs from the newer paid version (version 4.0). The newer version is less prone to errors, can access newer materials and can manage more complex prompts. However, because version 3.5 is free, many users opt for it.
While other AI programs use algorithms to refine results based on user input (such as Netflix recommendations), ChatGPT’s ‘learning’ quickly adapts to what you are looking for and may produce results slanted towards the view the program thinks you want, much like a confirmation bias. At the risk of attributing human traits to a computer program, it seems to be trying to give users what they want, even to the point of fabricating results. Reference and citation lists have been identified as a particularly troublesome area [1,3]. The implications of these errors are significant.
There are also ethical and legal issues. Biases in the underlying data may be difficult to detect, allowing newly generated content to perpetuate them. Copyright is a particularly troublesome area. First is the issue of ChatGPT potentially accessing and copying protected materials during its training. Second, there is debate about who owns ChatGPT’s output and whether it is considered a protected derivative work [8]. Privacy is also a concern, as there is the potential to access and report information that is deemed private and requires consent for release [9].
Another source of bias in AI is the training data set itself, which may already reflect biased decisions and social inequality. Several studies have documented human biases finding their way into AI programs, with harmful outcomes. For example, in 1988 the UK Commission for Racial Equality found that a computer program developed to match human admissions decisions at a medical school was biased against women and people with non-European names; similarly, in Florida, a criminal justice algorithm mislabeled African American defendants as high-risk at roughly twice the rate at which it mislabeled white defendants [10].
As the judge in the New York case stated, there is ‘nothing inherently improper about using a reliable artificial intelligence tool for assistance’ [6]. What is improper is failing to use it responsibly. Until the technology improves and errors of this magnitude are eliminated, all users of these programs should question and verify the results. Ethical and legal questions also surround topics such as copyright, plagiarism, and the potential for research fraud.
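As one concrete example of such verification, the sketch below checks whether a reference suggested by an AI tool matches anything in Crossref’s public bibliographic database before it is trusted. It assumes the requests package and Crossref’s REST API; the citation text is a placeholder, and a manual check of the full source would still be needed.

```python
# Minimal sketch: checking an AI-suggested citation against Crossref's public
# REST API (api.crossref.org). Assumes the `requests` package; the citation
# string is a placeholder, not a real reference.
import requests

suspect_citation = "Title of an article suggested by the AI tool"

resp = requests.get(
    "https://api.crossref.org/works",
    params={"query.bibliographic": suspect_citation, "rows": 3},
    timeout=10,
)
resp.raise_for_status()

items = resp.json()["message"]["items"]
if not items:
    print("No matching records found - treat this citation as suspect.")
for item in items:
    title = (item.get("title") or ["(no title)"])[0]
    print(f"Candidate match: {title} (DOI: {item.get('DOI')})")
```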
As with any new technology, there will be a learning curve before it can be used at its best. Knowing that it makes mistakes will help simulationists use it in the proper context. The programs themselves continue to evolve, with enhancements aimed at improving accuracy. Over time, AI will become a common part of the simulation-based education program development process. Until then, use AI as a tool to support simulation-based learning, but recognize its limitations.
DR, RAA, and AM participated in this paper’s conceptualization, planning and design. All authors contributed to the writing, followed the instructions for authors, and read and approved the manuscript.
The authors have received no funding for this editorial.
Not applicable.
Not applicable.
There are no conflicts of interest to disclose.
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.