Real-Time Snoring Detection Using Deep Learning: A Home-Based Smartphone Approach for Sleep Monitoring.
Despite the prevalence of sleep-related disorders, few studies have developed deep learning models to detect snoring from home-recorded smartphone audio. This study proposes a real-time snoring detection method that applies a Vision Transformer-based deep learning model to smartphone recordings. Participants' sleep-breathing sounds were recorded using smartphones, with concurrent Level I or II polysomnography (PSG) conducted in home or hospital settings. For each participant, 200 minutes of smartphone audio were sampled, corresponding to 400 30-second sleep stage epochs on PSG. Each epoch was annotated independently by two trained labelers, and an epoch was labeled as snoring only when both labelers agreed. Model performance was evaluated by epoch-by-epoch prediction accuracy and by the correlation between observed and predicted snoring ratios. The study included 214 participants (85,600 epochs). Hospital audio data from 105 participants (42,000 epochs) were used for training, while home audio data from 109 participants were split into 54 participants (21,600 epochs) for training and 55 participants (22,000 epochs) for testing. On the test dataset, the model demonstrated a sensitivity of 89.8% and a specificity of 91.3%. Correlation analysis showed strong agreement between observed and predicted snoring ratios (r = 0.97, 95% CI: 0.95-0.99). This study demonstrates the feasibility of using deep learning for real-time snoring detection from home-recorded smartphone audio. With high accuracy and scalability, the approach offers a practical and accessible tool for monitoring sleep-related disorders, paving the way for home-based sleep health management solutions.
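The labeling rule and evaluation metrics described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 0/1 label encoding, and the definition of the snoring ratio as the fraction of a participant's epochs labeled as snoring are all assumptions made here for illustration, and the Vision Transformer model itself is not shown.

```python
from math import sqrt

EPOCH_SECONDS = 30  # epoch length stated in the abstract


def epochs_per_recording(minutes):
    """Number of 30-second epochs in a recording of the given length."""
    return minutes * 60 // EPOCH_SECONDS


def consensus_label(labeler_a, labeler_b):
    """Mark an epoch as snoring (1) only when both labelers marked it 1."""
    return [int(a == 1 and b == 1) for a, b in zip(labeler_a, labeler_b)]


def sensitivity_specificity(observed, predicted):
    """Epoch-by-epoch sensitivity and specificity for binary labels."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def snoring_ratio(labels):
    """Assumed definition: fraction of a participant's epochs that are snoring."""
    return sum(labels) / len(labels)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

Under these assumptions, `epochs_per_recording(200)` gives the 400 epochs per participant, so 214 participants yield 85,600 epochs, and the reported r would correspond to `pearson_r` applied to the per-participant observed and predicted values of `snoring_ratio` in the test set.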