A web-based tool for predicting gastric ulcers in Chinese elderly adults based on machine learning algorithms and noninvasive predictors: A national cross-sectional and cohort study.
As the Chinese population continues to age, the prevalence of gastric ulcers, a common nutrition- and diet-related disorder, is rising among the elderly. Gastric ulcers pose a significant public health challenge in China, yet research on predicting them accurately remains limited. Our study aims to use machine learning algorithms to predict the occurrence of gastric ulcers, to identify important predictors, and to develop an online tool that assesses both current and future gastric ulcer risk in elderly individuals. We used baseline data from the 2011 and 2014 waves of the Chinese Longitudinal Healthy Longevity Survey, with a follow-up endpoint of 2018. We employed nine machine learning algorithms to construct predictive models for gastric ulcers over the next seven years (2011-2018, with 1482 samples) and the next three years (2014-2018, with 2659 samples). Additionally, we used cross-sectional data from 2018 (with 13,775 samples) to construct a predictive model for current gastric ulcers. All models relied on noninvasive predictors, namely demographic, behavioral, nutritional, and physical examination factors. Support Vector Machine (SVM), Random Forest (RF), and Light Gradient Boosting Machine (LGBM) achieved an accuracy of 0.97 for predicting gastric ulcers over seven years; Logistic Regression, Adaptive Boosting, SVM, RF, Gradient Boosting Machine, LGBM, and K-Nearest Neighbors reached 0.98 for three-year predictions; and SVM, Extreme Gradient Boosting, RF, and LGBM attained 0.95 for current gastric ulcer prediction. We developed MyGutRisk, a web-based tool built on the best-performing machine learning models, which predicts gastric ulcer risk in elderly adults with relatively high accuracy using noninvasive factors such as diet and lifestyle. It supports self-assessment via a public link and clinical screening in community health settings to guide preventive measures.
However, as a prototype, it requires further validation to ensure accuracy and generalizability across diverse populations and real-world applications.
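The model comparison summarized above can be illustrated with a minimal sketch. It is not the study's pipeline: the data are synthetic, the two models (a majority-class baseline and a small k-nearest-neighbors classifier) are stand-ins for the nine algorithms, and every name (`make_synthetic`, `majority_baseline`, `knn`, `accuracy`) is hypothetical. It only shows how several candidate models might be scored on the same held-out set and the most accurate one selected.

```python
import random
from collections import Counter


def make_synthetic(n, seed=0):
    """Hypothetical stand-in data: two numeric features and a binary label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        center = 2.0 if label else 0.0
        features = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        rows.append((features, label))
    return rows


def majority_baseline(train):
    """Always predict the most frequent training label."""
    most_common = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: most_common


def knn(train, k=5):
    """Predict by majority vote among the k nearest training points."""
    def predict(x):
        nearest = sorted(
            train,
            key=lambda row: (row[0][0] - x[0]) ** 2 + (row[0][1] - x[1]) ** 2,
        )[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict


def accuracy(model, test):
    """Fraction of held-out samples classified correctly."""
    return sum(model(x) == y for x, y in test) / len(test)


# Assumed split: first 300 rows for training, last 100 held out.
data = make_synthetic(400)
train, test = data[:300], data[300:]

models = {
    "majority_baseline": majority_baseline(train),
    "knn_k5": knn(train, k=5),
}
scores = {name: accuracy(model, test) for name, model in models.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("selected:", best)
```

Including a majority-class baseline is a design choice of this sketch: when the outcome is uncommon, it shows how much of a reported accuracy a model could reach by always predicting the dominant class.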