ChatGPT-4: its performance on a final exam of the medical specialist in ophthalmology program at the University of Buenos Aires
DOI: https://doi.org/10.70313/2718.7446.v17.n01.286
Keywords: artificial intelligence, ChatGPT-4, medical education, ophthalmology
Abstract
Objectives: To evaluate the performance of ChatGPT-4 on a final exam of the University Medical Specialist in Ophthalmology program at the University of Buenos Aires, and to compare it with the performance of the students and of ChatGPT-3.5.
Materials and methods: Observational, retrospective, analytical study. The answers of a group of students on a 50-question multiple-choice exam with 4 answer options, administered on September 8, 2023, were compared with the performance of ChatGPT versions 3.5 and 4 on the same exam.
Results: Students (n = 7) correctly answered a median of 39 questions (39/50), an accuracy of 78%, with the number of correct answers ranging from 33 to 45. The average time to complete the exam was 75 minutes. ChatGPT-3.5 correctly answered 31 questions (31/50), an accuracy of 62%. ChatGPT-4 correctly answered 40 questions (40/50), an accuracy of 80%, completing the exam in 73.49 seconds.
Conclusions: ChatGPT-4 outperformed the average student while taking 61 times less time, and achieved higher accuracy than ChatGPT-3.5. The grades obtained by both versions of ChatGPT would be sufficient to pass the final exam of the University Medical Specialist in Ophthalmology program at the University of Buenos Aires.
License
Copyright (c) 2024 Consejo Argentino de Oftalmología
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This license permits neither commercial use of the original work nor the creation of derivative works. Creative Commons licenses allow authors to share and release their works legally and securely.