UCL Discovery

Contrastive learning of T cell receptor representations

Nagano, Yuta; Pyo, Andrew GT; Milighetti, Martina; Henderson, James; Shawe-Taylor, John; Chain, Benny; Tiffeau-Mayer, Andreas (2025) Contrastive learning of T cell receptor representations. Cell Systems, 16 (1), Article 101165. 10.1016/j.cels.2024.12.006. Green open access

Nagano2025.pdf - Published Version


Abstract

Computational prediction of the interaction of T cell receptors (TCRs) and their ligands is a grand challenge in immunology. Despite advances in high-throughput assays, specificity-labeled TCR data remain sparse. In other domains, the pre-training of language models on unlabeled data has been successfully used to address data bottlenecks. However, it is unclear how to best pre-train protein language models for TCR specificity prediction. Here, we introduce a TCR language model called SCEPTR (simple contrastive embedding of the primary sequence of T cell receptors), which is capable of data-efficient transfer learning. Through our model, we introduce a pre-training strategy combining autocontrastive learning and masked-language modeling, which enables SCEPTR to achieve its state-of-the-art performance. In contrast, existing protein language models and a variant of SCEPTR pre-trained without autocontrastive learning are outperformed by sequence alignment-based methods. We anticipate that contrastive learning will be a useful paradigm to decode the rules of TCR specificity. A record of this paper’s transparent peer review process is included in the supplemental information.

Type: Article
Title: Contrastive learning of T cell receptor representations
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.1016/j.cels.2024.12.006
Publisher version: https://doi.org/10.1016/j.cels.2024.12.006
Language: English
Additional information: Copyright © 2024 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Keywords: Protein language models; contrastive learning; TCR repertoire; T cell specificity; TCR; T cell receptor; representation learning
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences > Div of Infection and Immunity
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10205427