Artificial intelligence chatbots versus traditional medical resources for patient education on "Labor Epidurals": an evaluation of accuracy, emotional tone, and readability.
Background: Labor epidural analgesia is a widely used method for pain relief in childbirth, yet information accessibility for expectant mothers remains a challenge. Artificial intelligence (AI) chatbots like Chat Generative Pre-Trained Transformer (ChatGPT) and Google Gemini offer potential solutions for improving patient education. This study evaluates the accuracy, readability, and emotional tone of AI chatbot responses compared to the American Society of Anesthesiologists (ASA) online materials on labor epidurals.
Methods: Eight common questions about labor epidurals were posed to ChatGPT and Gemini. Seven obstetric anesthesiologists evaluated the generated responses for accuracy and completeness on a 1-10 Likert scale, comparing them with ASA-sourced content. Statistical tests (one-way ANOVA with Tukey HSD), sentiment analysis, and readability metrics (Flesch Reading Ease) were used to assess differences.
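The Flesch Reading Ease metric used here is a standard formula based on sentence length and syllable density; a minimal sketch, with illustrative counts that are not taken from the study data:

```python
def flesch_reading_ease(total_words, total_sentences, total_syllables):
    """Standard Flesch Reading Ease formula; higher scores indicate easier text.

    Scores around 60-70 correspond roughly to plain English; below ~30 is
    very difficult (academic) prose.
    """
    return (206.835
            - 1.015 * (total_words / total_sentences)
            - 84.6 * (total_syllables / total_words))

# Illustrative counts only: 100 words, 5 sentences, 130 syllables
score = flesch_reading_ease(100, 5, 130)
print(round(score, 3))
```

In practice, word, sentence, and syllable counts are typically obtained with a text-analysis library rather than by hand; the formula itself is fixed.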
Results: ASA materials scored highest for accuracy (8.80 ± 0.40) and readability, followed by Gemini and ChatGPT. Completeness scores showed ASA and Gemini performing significantly better than ChatGPT (P < 0.001). ASA materials were the most accessible, while Gemini content was more complex. Sentiment analysis indicated a neutral tone for ASA and Gemini, with ChatGPT displaying a less consistent tone.
Conclusions: AI chatbots show promise for patient education on labor epidurals but require improvements in readability and tone consistency to enhance engagement. Further refinement of AI chatbots may support more accessible, patient-centered healthcare information.