Assessment Of ChatGPT's Diagnostic Skills In Ophthalmology

Published 2024 - 42nd Congress of the ESCRS

Reference: PO1077 | Type: Free paper | DOI: 10.82333/17f2-nr25

Authors: Asaf Shemer*¹, Michal Cohen¹, Aya Altarescu¹, Maya Atar¹, Idan Hecht¹, Biana Dubinsky-Pertzov¹, Nadav Shoshany¹, Sigal Zmujack¹, Lior Or¹, Adi Einan-Lifshitz¹, Eran Pras¹

¹Department of Ophthalmology, Shamir Medical Center, Zriffin, Israel; Tel Aviv University, Tel Aviv, Israel

Purpose

This research aims to evaluate the accuracy of ChatGPT's diagnoses within the ophthalmology field.

Setting

This is a retrospective cohort study conducted in one academic tertiary medical center.

Methods

We reviewed the records of patients treated in the ophthalmology department between June 2022 and January 2023. For each patient, two clinical scenarios were developed: the first based on the medical history alone (Hx), and the second adding the clinical examination findings (Hx and Ex). For each case, we asked ChatGPT, residents, and attendings for the three most likely diagnoses. We then compared the accuracy rates of the groups, counting a response as accurate if at least one of the three diagnoses was correct. Additionally, we compared the total time each group took to complete the task.
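The accuracy measure described above (a response counts as accurate when at least one of its three diagnoses is correct) can be illustrated with a minimal sketch. The function names and the exact string matching are assumptions made for illustration only; the abstract does not state how correctness was adjudicated, and in practice it would presumably rest on clinical judgment rather than text comparison.

```python
from typing import Sequence


def top3_hit(candidates: Sequence[str], reference: str) -> bool:
    """Return True if the reference diagnosis is among the first three candidates.

    Assumption: diagnoses are compared as case-insensitive strings. This is a
    stand-in for whatever adjudication the study actually used.
    """
    normalized = {c.strip().lower() for c in candidates[:3]}
    return reference.strip().lower() in normalized


def accuracy_rate(
    all_candidates: Sequence[Sequence[str]], references: Sequence[str]
) -> float:
    """Fraction of cases with at least one correct diagnosis in the top three."""
    if len(all_candidates) != len(references):
        raise ValueError("each case needs exactly one reference diagnosis")
    if not references:
        raise ValueError("no cases supplied")
    hits = sum(top3_hit(c, r) for c, r in zip(all_candidates, references))
    return hits / len(references)
```

Applied to one group's answers for all cases of a given scenario (Hx, or Hx and Ex), `accuracy_rate` would yield the kind of percentage reported in the Results.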

Results

ChatGPT, residents, and attendings evaluated 126 cases from 63 patients (one history-only case and one history-plus-examination case per patient). In the Hx cases, ChatGPT achieved a significantly lower rate of accurate diagnosis (54%) than the residents (75%; p<0.01) and attendings (71%; p<0.01). After the clinical examination findings were added, ChatGPT's rate rose to 68%, whereas the rates for the residents and attendings increased to 94% (p<0.01) and 86% (p<0.01), respectively. ChatGPT completed the task 4 to 5 times faster than the attendings and residents.
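As a reading aid, the reported percentages can be turned into percentage-point differences. The sketch below uses only the rates stated above; the variable names are illustrative, and the calculation says nothing about statistical significance, which the abstract reports separately.

```python
# Reported accuracy rates in percent: (Hx only, Hx and Ex)
rates = {
    "ChatGPT": (54, 68),
    "Residents": (75, 94),
    "Attendings": (71, 86),
}

# Gain in percentage points from adding the clinical examination findings
gains = {group: hx_ex - hx for group, (hx, hx_ex) in rates.items()}

# Gap between each physician group and ChatGPT in each scenario
chatgpt_hx, chatgpt_hx_ex = rates["ChatGPT"]
gaps = {
    group: (hx - chatgpt_hx, hx_ex - chatgpt_hx_ex)
    for group, (hx, hx_ex) in rates.items()
    if group != "ChatGPT"
}

print(gains)  # {'ChatGPT': 14, 'Residents': 19, 'Attendings': 15}
print(gaps)   # {'Residents': (21, 26), 'Attendings': (17, 18)}
```

On these figures, every group improved once examination findings were available, and the gap between ChatGPT and each physician group did not narrow.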

Conclusions

Compared with residents and attendings, ChatGPT demonstrated lower diagnostic accuracy in ophthalmology, both when working from patient history alone and when clinical examination findings were included. Nonetheless, ChatGPT completed the diagnostic task more quickly than the physicians.