Localizing and Tailoring the Debriefing Assessment for Simulation in Healthcare to Optimize Fit.
Introduction
Debriefing in healthcare simulation reinforces learning objectives, closes performance gaps, and improves future practice and patient care. The Debriefing Assessment for Simulation in Healthcare (DASH) is a validated debriefing assessment tool; however, localized rater training for the DASH has not been described. We sought to augment DASH anchors with localized notations, localize DASH rater training, assess the localized DASH's correlation with other debriefing practices/factors, and assess the reliability of the localized tool.

Methods
This study was conducted at SimTiki Simulation Center, John A. Burns School of Medicine, University of Hawai'i at Mānoa. Three simulation experts without prior DASH training developed a list of debriefing best practices/factors, reviewed the DASH handbook, and transcribed the DASH Rater Long Form version, with example behaviors, into a rating document. Research assistants recorded best practices/factor data from archived debriefing videos. The simulation experts independently scored debriefings, resolved discrepancies, and added localized criteria to the DASH. Rater calibration was completed with an intraclass correlation (ICC) of 0.884. Raters then independently scored 43 debriefings recorded during July-December 2022. DASH scores were compared to observed debriefing best practices/factors.

Results
The overall DASH behaviors ICC was 0.810 for agreement and 0.825 for consistency. Behavior scores ranged from 2.45 (SD 0.70) to 4.42 (SD 0.81). The three lowest-scoring DASH behaviors were 2A (2.45), 4A (3.41), and 3B (3.44). For behavior 2B, regarding realism concerns, there was significant inconsistency in the use of the not applicable (NA) designation. Construct correlation between DASH scores and best practices/factors supported convergent validity.

Conclusion
High interrater reliability followed localized rater training and the addition of localized notations to the DASH. Correlation with debriefing practices/factors strengthens DASH validity evidence. The generally lower DASH behavior scores reported here suggest that interpretation of DASH scores is best contextualized within a shared, localized DASH construct. Comparisons of numerical DASH scores across institutions, raters, and cultures may not reflect absolute debriefing quality.