UHPLC-Q-Orbitrap HRMS-Based Machine Learning Constructs the Integrated Biomarker Profiling of Type 2 Diabetes and Diabetic Heart Disease.
Over 70% of diabetic patients die from cardiovascular disease, and diabetic heart disease (DHD) is an important cause of death in individuals with type 2 diabetes (T2D). A simple, rapid, and economical method for identifying DHD among patients with T2D is therefore needed.
T2D and DHD patients were recruited, and their serum samples were used for metabolomic analysis to identify differential metabolites. Logistic regression analysis and receiver operating characteristic curve analysis were performed to identify candidate biomarkers. Moreover, four machine learning methods were used to construct the integrated biomarker profiling (IBP) models with the candidate biomarkers. Gini impurity was employed to select characteristic candidate biomarkers.
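To make the two selection criteria above concrete, the following is a minimal sketch of how a per-metabolite ROC AUC and a Gini impurity decrease can be computed. It uses textbook definitions only; the abstract does not state how the study computed either quantity, so the function names (`roc_auc`, `gini_impurity`, `best_split_gain`), the single-threshold split, and the toy intensities are all illustrative assumptions rather than the study's pipeline or data.

```python
from collections import Counter


def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest Gini impurity decrease from a single threshold on one feature."""
    n = len(labels)
    parent = gini_impurity(labels)
    best = 0.0
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        child = (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n
        best = max(best, parent - child)
    return best


# Hypothetical toy data (NOT from the study): 0 = T2D, 1 = DHD.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
metabolite_a = [1.0, 1.2, 1.1, 1.3, 2.0, 2.2, 2.1, 1.9]  # classes do not overlap
metabolite_b = [1.0, 2.0, 1.5, 2.5, 1.2, 2.2, 1.7, 2.4]  # classes overlap heavily

auc_a = roc_auc(metabolite_a, labels)
auc_b = roc_auc(metabolite_b, labels)
gain_a = best_split_gain(metabolite_a, labels)
gain_b = best_split_gain(metabolite_b, labels)

# Rank the toy features by their Gini impurity decrease, largest first.
ranking = sorted({"metabolite_a": gain_a, "metabolite_b": gain_b}.items(),
                 key=lambda item: item[1], reverse=True)
```

In this toy setting the non-overlapping feature reaches the maximum AUC and the largest impurity decrease, so it ranks first. Tree-ensemble libraries typically aggregate such decreases over many splits and trees, which this single-split sketch does not attempt.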
Eighty-four differential metabolites were identified in the serum of 58 T2D and 62 DHD patients. Logistic regression analysis indicated that 17 differential metabolites were protective factors, whereas 39 were risk factors for DHD. Receiver operating characteristic curve analysis then identified 29 differential metabolites as candidate biomarkers of DHD. After the predictive performance of the four machine learning models was compared, the IBP was constructed with eXtreme Gradient Boosting (XGBoost) using six candidate biomarkers selected from the Gini impurity ranking: sphingomyelin (d18:0/16:1), deoxycholic acid, hexadecanedioic acid, phosphatidylcholine (20:5/18:3), L-tryptophan, and N-undecanoylglycine. The accuracy of the IBP for distinguishing T2D and DHD reached 88.89%, with 100% accuracy in predicting DHD from T2D patients.
The IBP, composed of six metabolites, can effectively predict DHD from T2D, and it is expected to become a screening indicator for early-stage DHD.