PRIME: an interpretable artificial intelligence model based on liquid biopsy improves prediction of progression risk in non-small cell lung cancer.

Although circulating tumor DNA (ctDNA)-based minimal residual disease (MRD) has predictive value, accurately predicting the risk of treatment failure after curative-intent therapy in patients with early-stage or localized non-small cell lung cancer (NSCLC), and thereby guiding personalized therapy, remains challenging. This study aimed to develop and validate an interpretable artificial intelligence-assisted model using global data resources.

Liquid biopsy data, blood-based genomic alterations, clinicopathological features, and survival outcomes of stage I-III NSCLC patients who underwent surgery or definitive chemoradiotherapy were collected from 6 cohorts. PRIME (Progression Risk prediction by Interpretable Machine learning on ctDNA-MRD, Mutations, and clinical-therapeutic features) was trained using 6 machine learning algorithms on 4 cohorts and validated in 2 independent cohorts. Model performance was evaluated by the area under the curve (AUC), and model predictions were interpreted using SHapley Additive exPlanations (SHAP). Whole-exome sequencing (WES) or whole-genome sequencing (WGS) data of tumor tissue from 430 stage II-III NSCLC patients and RNA-sequencing (RNA-seq) data from 1149 subjects, sourced from The Cancer Genome Atlas, were used to validate the prognostic effect of mutations identified in peripheral blood and to investigate the underlying mechanisms.
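
To make the evaluation metric concrete, the following is a minimal sketch of how an AUC and a 95% confidence interval can be computed for a binary progression outcome. The abstract does not state how the study's confidence intervals were derived; the percentile bootstrap used here is an assumption, and the labels and risk scores are invented toy values, not study data.

```python
import random

def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    case receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with a single class has no defined AUC
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy example: 1 = progression, 0 = no progression (invented values).
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.50, 0.90]
toy_auc = auc(toy_labels, toy_scores)
toy_ci = bootstrap_ci(toy_labels, toy_scores)
```

With so few toy samples the interval is very wide; in a cohort-sized dataset the same procedure would give a much tighter range.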

A global dataset encompassing 781 blood samples from 493 patients was analyzed. Clinical stage, pre-treatment ctDNA, post-treatment MRD, blood-based Kelch-like ECH-associated protein 1 (KEAP1), serine/threonine kinase 11 (STK11), and cyclin-dependent kinase inhibitor 2A (CDKN2A) mutations, and treatment modality were significantly associated with the risk of disease progression and were therefore included in model training. WES/WGS and RNA-seq confirmed the poor prognostic effect of KEAP1, STK11, and CDKN2A mutations, which were characterized by a suppressive tumor microenvironment and attenuated humoral immunity. The neural network (NN) model achieved the best prediction of treatment failure risk in both the training set (AUC = 0.85, 95% CI 0.81-0.89) and the validation set (AUC = 0.82, 95% CI 0.74-0.89). SHAP analysis indicated that MRD (+0.306), treatment modality (+0.128), and pre-treatment ctDNA (+0.043) were the top 3 contributors. NN-PRIME outperformed single liquid biopsy biomarkers and clinical-therapeutic signatures, and was consistently robust across different clinical scenarios. High-risk patients identified by NN-PRIME had poorer prognoses but derived significant benefit from adjuvant therapy after surgery.
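
For readers unfamiliar with SHAP contributions, the sketch below computes exact Shapley values for a tiny hypothetical risk function by enumerating feature subsets. It illustrates the additive attribution idea behind SHAP only; it is not the study's model, and the function `toy_risk`, its coefficients, and the three binary inputs are invented for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance: features outside a
    coalition are replaced by their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                with_i = value(set(s) | {i})
                without_i = value(set(s))
                total += weight * (with_i - without_i)
        phi.append(total)
    return phi

def toy_risk(z):
    # Hypothetical risk score with one interaction term (invented).
    return 0.1 + 0.3 * z[0] + 0.1 * z[1] + 0.2 * z[0] * z[2]

instance = [1, 1, 1]   # all three toy features present
reference = [0, 0, 0]  # baseline: all three absent
phi = shapley_values(toy_risk, instance, reference)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the property that lets per-feature values such as those reported above be read as additive shares of a prediction. Exact enumeration is only feasible for a handful of features; SHAP implementations rely on approximations or model-specific algorithms.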

As an interpretable model integrating readily accessible and crucial clinical-genomic predictors, PRIME achieves enhanced performance, allowing for early outcome prediction, refined risk stratification, and personalized clinical decision-making.

Authors

Wang Wang, Xiang Xiang, Chen Chen, Zhang Zhang, Wang Wang, Liu Liu, Deng Deng, Wang Wang, Gao Gao, Bi Bi