Interpretable machine learning models for predicting cognitive impairment using NHANES neuropsychological tests: nutritional and sociodemographic associations.
Early identification of individuals at risk for cognitive impairment is essential for timely intervention and public health planning. While sociodemographic and clinical predictors are well recognized, the role of nutrition and its interactions in cognitive health remains less explored.
Using data from the 2011-2014 National Health and Nutrition Examination Survey (NHANES, n = 2,208), we developed ensemble machine learning models (LightGBM, XGBoost, Random Forest) to predict cognitive impairment across three neuropsychological assessments (CERAD-WL, DSST, AFT). SHapley Additive exPlanations (SHAP) were applied to quantify and interpret the contributions of demographic, clinical, and nutritional predictors, as well as their interactions. To examine the biological plausibility of the nutrient interactions identified by our models, we conducted exploratory in vitro experiments assessing oxidative stress and neuroprotective pathways in SH-SY5Y neuronal cells.
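As an illustration of the attribution idea behind SHAP, the sketch below computes exact Shapley values for a single prediction by enumerating feature coalitions. It is not the study's pipeline: the risk function `toy_risk`, its coefficients, the feature names, and the `baseline` and `person` values are all hypothetical stand-ins chosen only to show how a main effect and a nutrient interaction term are split among features. Libraries such as SHAP use faster, model-specific algorithms for tree ensembles; this brute-force version is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def toy_risk(x):
    """Hypothetical risk score (NOT the fitted model from the study)."""
    # Additive terms for age, education, and vitamin B2, plus one
    # vitamin B2 x copper interaction term. All coefficients are made up.
    return (0.04 * x["age"]
            - 0.05 * x["education"]
            - 0.10 * x["vitamin_b2"]
            - 0.02 * x["vitamin_b2"] * x["copper"])


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `instance` relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {k: (instance[k] if k in coalition else baseline[k])
                 for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of `name` when added to coalition s
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


# Hypothetical reference profile and individual (illustrative units only)
baseline = {"age": 70, "education": 12, "vitamin_b2": 1.5, "copper": 1.0}
person = {"age": 78, "education": 16, "vitamin_b2": 2.5, "copper": 1.5}

phi = shapley_values(toy_risk, person, baseline)
for feature, contribution in phi.items():
    print(f"{feature:>11}: {contribution:+.4f}")
```

The contributions sum to the difference between the individual's score and the baseline score, which is the additivity property that makes such attributions interpretable. In this toy model copper receives a nonzero contribution only through its interaction with vitamin B2.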
Ensemble models demonstrated excellent predictive performance, consistently outperforming traditional classifiers. Key predictors included education, age, socioeconomic status, and chronic disease conditions. Among nutritional factors, vitamin B2 emerged as consistently associated with lower predicted cognitive impairment risk across all three models, with notable interactions observed with copper and vitamin E. Exploratory in vitro experiments supported these associations, showing reduced oxidative stress and increased expression of neuroprotective genes (SIRT1, BDNF) under vitamin B2 treatment, particularly when combined with copper or vitamin E.
Interpretable machine learning models integrating cognitive tests with demographic, clinical, and nutritional variables can accurately predict cognitive impairment. Nutritional predictors, particularly vitamin B2 and its interactions, may contribute to model performance and are biologically plausible, suggesting potential avenues for stratified monitoring strategies.