Evaluation of Chatbots in the Emergency Management of Avulsion Injuries.

Journal: Dental Traumatology: Official Publication of the International Association for Dental Traumatology
Abstract

Background: This study assessed the accuracy and consistency of responses provided by six artificial intelligence (AI) chatbots, ChatGPT version 3.5 (OpenAI), ChatGPT version 4 (OpenAI), ChatGPT version 4.0 (OpenAI), Perplexity (Perplexity.AI), Gemini (Google), and Copilot (Bing), to questions on the emergency management of avulsed teeth.

Methods: Two pediatric dentists developed 18 true-or-false questions on dental avulsion and posed them to the publicly available chatbots over 3 days. The responses were recorded and compared with the correct answers. Accuracy rates and response consistency were calculated using SPSS.

Results: ChatGPT version 4.0 achieved the highest overall accuracy rate (95.6%), while Perplexity (Perplexity.AI) had the lowest (67.2%). ChatGPT version 4.0 (OpenAI) was the only chatbot to achieve perfect agreement with the correct answers at every time point except noon on day 1. ChatGPT version 3.5 (OpenAI) showed the weakest agreement (six instances).

Conclusions: With the exception of ChatGPT's paid version 4.0, AI chatbots do not yet appear ready to serve as a primary resource for managing avulsed teeth in emergencies. Incorporating the International Association of Dental Traumatology (IADT) guidelines into chatbot databases could enhance their accuracy and consistency.

Authors
Şeyma Mustuloğlu, Büşra Deniz
