Multiple large language models (LLMs) have been released since 2022, including OpenAI’s GPT-3.5 and GPT-4. The latest model, GPT-4o, introduced on May 13, 2024, is described as a significant improvement over GPT-4. Previous studies have shown the potential of LLMs as educational tools for medical and dental examinations. This study evaluates the accuracy of GPT-4 and GPT-4o responses to questions from the Japanese National Dental Examination (JNDE) to assess their potential as educational tools in dental education.
Materials and methods
We obtained the dataset of the 117th JNDE, administered in January 2024, which consists of 360 questions. After excluding questions containing images and questions classified as inappropriate, 202 questions were selected. GPT-4 and GPT-4o were used to generate responses, with standardized prompts to ensure consistent input. Data were analyzed using Qlik Sense® and GraphPad Prism, with Fisher’s exact test used for statistical comparison.
Results
GPT-4o showed a significantly higher correct response rate (73.8%) than GPT-4 (63.3%). In the compulsory section, GPT-4o achieved 88.6% accuracy, significantly higher than GPT-4’s 74.3%. In the general section, GPT-4o (66.4%) also scored higher than GPT-4 (58.0%), although this difference was not statistically significant.
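The comparison above can be illustrated with a minimal sketch of a two-sided Fisher’s exact test on 2×2 tables of correct and incorrect answers. This is not the authors’ analysis, which used GraphPad Prism. The correct-answer counts and the section sizes (70 compulsory and 119 general questions) are not stated in the text; they are assumptions reconstructed from the reported percentages, and the overall GPT-4 count of 128/202 gives 63.4% rather than exactly the reported 63.3%. The function name `fisher_exact_two_sided` is hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention that may differ from the software the authors used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


# ASSUMED counts, reconstructed from the reported percentages:
# section -> (GPT-4o correct, GPT-4 correct, number of questions)
assumed_counts = {
    "overall": (149, 128, 202),
    "compulsory": (62, 52, 70),
    "general": (79, 69, 119),
}

p_values = {}
for section, (correct_4o, correct_4, n) in assumed_counts.items():
    # Rows: model (GPT-4o, GPT-4); columns: correct, incorrect.
    p_values[section] = fisher_exact_two_sided(
        correct_4o, n - correct_4o, correct_4, n - correct_4
    )
    print(f"{section}: p = {p_values[section]:.4f}")
```

Because the counts are reconstructed rather than taken from the study data, the printed p-values only indicate how such a comparison is set up and should not be read as the study’s reported statistics.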
Conclusion
GPT-4o significantly outperformed GPT-4 in accuracy for JNDE questions, suggesting its improved potential as an educational tool in dental education. Further studies are needed to evaluate GPT-4o’s capabilities with visual materials and in diverse question sets to fully ascertain its utility in educational settings.
Keywords
GPT-4o
GPT-4
Japanese National Dental Examination
Education tool
© 2024 THE AUTHORS. Published by Elsevier B.V. on behalf of King Saud University.