Balancing diagnostic procedure alternatives can already feel like a delicate tightrope for practitioners. Many behavioral health professionals are wary of attaching potentially pejorative diagnostic labels, and that wariness, combined with the complexity of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), often leads practitioners to limit themselves to a small repertoire of Current Procedural Terminology (CPT) codes. This selective approach, while familiar, may inadvertently narrow the breadth and depth of the diagnoses they consider. A new performer, however, has entered this high-wire act: Artificial Intelligence (AI). The model currently most accessible for examining behavioral patterns and suggesting possible diagnostic categories is available through ChatGPT. The tool can craft text that mirrors human reasoning, seamlessly integrating factual context to yield a range of possible DSM-5 diagnoses. The spotlight now shines on a pivotal question: at this juncture, does relying on a ChatGPT diagnosis tip the scales of legality or ethics? In response to the ethical challenges of implementing AI in healthcare, several researchers are calling for explicit guidelines to prevent the potential misuse of ChatGPT diagnosis in behavioral health.
In the discussion that follows, I will summarize three recent scholarly articles on the application of ChatGPT diagnostics. I will then present a sample behavioral diagnosis generated by ChatGPT Pro in response to a short behavioral symptom pattern. Finally, I invite you, the reader, to share your thoughts on the legality and ethics of using the response I obtained.
Recent Studies about ChatGPT Diagnosis
A medical study led by Mass General Brigham examined ChatGPT’s clinical decision-making capabilities, reporting roughly 72% overall accuracy in clinical decision-making and 77% accuracy in final diagnoses (2023). Published in the Journal of Medical Internet Research last week, the research assessed ChatGPT’s support across the entire patient care process, from initial evaluation to diagnosis and management. Corresponding author Marc Succi, MD, likened ChatGPT’s level of accuracy to that of a recent medical school graduate. The study simulated patient scenarios, with ChatGPT suggesting diagnoses based on initial data and making care decisions. The chatbot performed best at delivering final diagnoses (77% accuracy) but fared worse at generating differential diagnoses (60%) and making clinical management decisions (68%). These results suggest the potential for AI augmentation in diagnostics, but the authors stressed that further benchmarking and regulatory guidance are essential.
A similar study, published two weeks ago (2023), examined the capabilities of GPT-4 (Generative Pre-trained Transformer 4) in improving diagnostic accuracy for older patients with delayed diagnoses. The small study, led by researchers in the Division of Geriatrics at Queen Mary Hospital, assessed GPT-4’s performance in suggesting likely diagnoses and differential diagnoses based on patient medical histories. The results indicated that GPT-4 reached 66.7% accuracy for primary diagnoses and 83.3% accuracy when differential diagnoses were included. The tool demonstrated its potential to surface diagnoses that clinicians might have overlooked, while also highlighting the need for comprehensive clinical information and human oversight.
Behavioral health, too, has seen significant progress. For instance, a 2022 behavioral study highlighted the limits of individual clinicians’ experience and the need