PepTCR-Net: prediction of multi-class antigen peptides by T-cell receptor sequences with deep learning.

Predicting which antigen peptides a T-cell receptor (TCR) recognizes is crucial for understanding the immune system and for developing new treatments for cancer, infectious diseases, and autoimmune diseases. Because experimental methods for identifying TCR-antigen recognition are expensive and time-consuming, machine-learning approaches are increasingly used. However, existing computational tools often generalize poorly, owing to limited data and the difficulty of acquiring true non-recognition pairs, and they rarely integrate multiple biological features into a unified framework. To address these challenges, we propose a two-step framework for predicting TCR-antigen recognition. The first step focuses on feature engineering: neural network-based embeddings of letter-based TCR and peptide sequences, inspired by language models, and categorical encoding of Human Leukocyte Antigen types and Variable/Joining genes. In the second step, we built a prediction model that assesses the likelihood of TCR-antigen recognition using a Bayesian Feedforward Neural Network. We trained and validated the framework using large public databases. Our results demonstrate that this feature engineering delivers strong predictive performance in both internal and external validation. We applied the framework to a real-world case, predicting whether specific TCRs can recognize SARS-CoV-2 epitope peptides, demonstrating that it can function as a de novo TCR-antigen prediction tool applicable to infectious diseases.
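To make the feature-engineering step more concrete, the following is a minimal Python sketch of how sequence and categorical inputs might be combined into a single feature vector. It is not the authors' implementation: the learned, language-model-inspired embeddings are replaced here by plain integer tokenization, and all function names, padding lengths, vocabularies, and example values are hypothetical placeholders chosen for illustration.

```python
# Illustrative sketch only: integer tokenization stands in for the learned
# sequence embeddings described in the abstract. All names are hypothetical.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}  # 0 is reserved for padding
UNKNOWN_ID = len(AMINO_ACIDS) + 1  # id for any non-standard residue


def tokenize(seq, max_len):
    """Map an amino-acid sequence to integer ids, truncated or right-padded with 0."""
    ids = [AA_INDEX.get(aa, UNKNOWN_ID) for aa in seq[:max_len]]
    return ids + [0] * (max_len - len(ids))


def one_hot(value, vocabulary):
    """One-hot encode a categorical value; unseen values give an all-zero vector."""
    return [1 if value == v else 0 for v in vocabulary]


def build_features(cdr3, peptide, hla, v_gene, j_gene,
                   hla_vocab, v_vocab, j_vocab,
                   cdr3_len=20, pep_len=12):
    """Concatenate tokenized sequences with one-hot HLA, V-gene, and J-gene encodings."""
    return (
        tokenize(cdr3, cdr3_len)
        + tokenize(peptide, pep_len)
        + one_hot(hla, hla_vocab)
        + one_hot(v_gene, v_vocab)
        + one_hot(j_gene, j_vocab)
    )


# Placeholder vocabularies and inputs, chosen only to exercise the functions.
hla_vocab = ["HLA-A*02:01", "HLA-B*07:02"]
v_vocab = ["TRBV19", "TRBV7-9"]
j_vocab = ["TRBJ2-7", "TRBJ1-1"]

features = build_features(
    "CASSIRSSYEQYF", "GILGFVFTL", "HLA-A*02:01", "TRBV19", "TRBJ2-7",
    hla_vocab, v_vocab, j_vocab,
)
print(len(features))
```

In a pipeline of the kind the abstract describes, a vector like this (with the token ids replaced by learned embeddings) would be the input to the second-step classifier that outputs a recognition likelihood.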

Authors

Le Le, Ung Ung, Yang Yang, Huang Huang, He He, Bruno Bruno, Oh Oh, Keenan Keenan, Zhang Zhang