Automated Identification of Stroke Thrombolysis Contraindications from Synthetic Clinical Notes - a Proof-of-Concept Study.

Journal: Cerebrovascular Diseases Extra
Abstract

Background: Timely thrombolytic therapy improves outcomes in acute ischemic stroke. Manual chart review to screen for thrombolysis contraindications may be time-consuming and prone to errors. We developed and tested a large language model (LLM)-based tool to identify thrombolysis contraindications from clinical notes using synthetic data in a proof-of-concept study.

Methods: We used LLMs to generate 150 synthetic clinical notes containing randomly assigned thrombolysis contraindications. We then used Llama 3.1 405B with a custom prompt to extract a list of thrombolysis contraindications from each note. Performance was evaluated using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and F1 score.
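The evaluation described in the Methods can be illustrated with a minimal sketch. It assumes that each note is scored against a fixed checklist of contraindications, so that every (note, checklist item) pair counts as a true positive, false positive, false negative, or true negative. The checklist entries, the `tally` and `metrics` helpers, and the three example notes are hypothetical and are not taken from the study; the abstract does not state how the counts were actually assembled.

```python
from dataclasses import dataclass

# Hypothetical checklist: the abstract does not list the contraindications used.
CHECKLIST = [
    "recent_major_surgery",
    "anticoagulant_use",
    "intracranial_hemorrhage",
    "severe_hypertension",
]


@dataclass
class Counts:
    tp: int = 0
    fp: int = 0
    fn: int = 0
    tn: int = 0


def tally(notes, checklist=CHECKLIST):
    """Count outcomes over every (note, checklist item) pair.

    `notes` is an iterable of (assigned, extracted) pairs of sets.
    Extracted items that are not on the checklist are ignored in this sketch.
    """
    counts = Counts()
    for assigned, extracted in notes:
        for item in checklist:
            in_truth, in_pred = item in assigned, item in extracted
            if in_truth and in_pred:
                counts.tp += 1
            elif in_pred:
                counts.fp += 1
            elif in_truth:
                counts.fn += 1
            else:
                counts.tn += 1
    return counts


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def metrics(c):
    """Compute the six metrics named in the Methods from outcome counts."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn),
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


# Three made-up notes: (assigned contraindications, extracted contraindications).
example_notes = [
    ({"anticoagulant_use"}, {"anticoagulant_use", "severe_hypertension"}),
    (set(), set()),
    ({"recent_major_surgery", "intracranial_hemorrhage"}, {"recent_major_surgery"}),
]

example_counts = tally(example_notes)
example_metrics = metrics(example_counts)
```

Because most checklist items are absent from any given note, true negatives dominate the counts in a design like this, which is one reason specificity, NPV, and accuracy tend to sit close to 100% while sensitivity and PPV are more informative.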

Results: A total of 150 synthetic notes were generated using five different models: ChatGPT-4o, Llama 3.1 405B, Llama 3.1 70B, ChatGPT-4o mini, and Gemini 1.5 Flash. On average, each note contained 241.6 words (SD 110.7; range 80-549) and included 1.5 contraindications (SD 1.1; range 0-5). Our tool achieved a sensitivity of 90.9% (95% CI: 86.3%-94.3%), specificity of 99.2% (95% CI: 98.8%-99.5%), PPV of 87.7% (95% CI: 82.7%-91.7%), NPV of 99.4% (95% CI: 99.1%-99.6%), accuracy of 98.7% (95% CI: 98.2%-99.0%), and an F1 score of 0.892. Of the 28 false positives, 24 (86%) were due to the inclusion of irrelevant contraindications, and 4 (14%) resulted from repetitive information. No hallucinations were observed.
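As a quick consistency check on the reported figures, the F1 score can be recomputed as the harmonic mean of sensitivity and PPV, and the false-positive percentages can be recomputed from the two reported counts. The sketch below uses only the rounded values stated in the Results; the function name `f1_from` is an assumption introduced for illustration.

```python
def f1_from(sensitivity, ppv):
    """F1 is the harmonic mean of sensitivity (recall) and PPV (precision)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Rounded values as reported in the Results.
reported_sensitivity = 0.909
reported_ppv = 0.877
f1 = f1_from(reported_sensitivity, reported_ppv)

# False-positive breakdown as reported: 24 irrelevant, 4 repetitive.
irrelevant, repetitive = 24, 4
total_false_positives = irrelevant + repetitive
share_irrelevant = 100 * irrelevant / total_false_positives
share_repetitive = 100 * repetitive / total_false_positives

print(f"F1 from rounded inputs: {f1:.4f}")
print(f"False positives: {total_false_positives} "
      f"({share_irrelevant:.0f}% irrelevant, {share_repetitive:.0f}% repetitive)")
```

From the rounded inputs the harmonic mean comes out at roughly 0.893, which agrees with the reported 0.892 to within rounding, and the two false-positive categories sum to 28 with shares of about 86% and 14%.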

Conclusions: Our LLM-based tool may identify stroke thrombolysis contraindications from synthetic clinical notes with high sensitivity and PPV. Future studies will validate its performance using real EMR data and integrate it into acute stroke workflows to facilitate faster and safer thrombolysis decision-making.

Relevant Conditions

Stroke