Clonal analysis of immunodominance and cross-reactivity of the CD4 T cell response to SARS-CoV-2

The identification of CD4+ T cell epitopes is instrumental for the design of subunit vaccines for broad protection against coronaviruses. Here we demonstrate in COVID-19-recovered individuals a robust CD4+ T cell response to naturally processed SARS-CoV-2 spike (S) and nucleoprotein (N), including effector, helper, and memory T cells. By characterizing 2943 S-reactive T cell clones from 34 individuals, we found that 34% of clones and 93% of individuals recognized a conserved immunodominant S346-365 region within the RBD comprising nested HLA-DR- and HLA-DP-restricted epitopes. Using pre- and post-COVID-19 samples and S proteins from endemic coronaviruses, we identify cross-reactive T cells targeting multiple S protein sites. The immunodominant and cross-reactive epitopes identified can inform vaccination strategies to counteract emerging SARS-CoV-2 variants.

The identification of T cell epitopes in disease-causing organisms is challenging in view of the polymorphism of human leukocyte antigen (HLA) molecules and the variability of rapidly mutating pathogens. In the context of the COVID-19 pandemic, bioinformatic analysis (1) has been used to predict T cell epitopes in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) proteins (2, 3) and to produce peptide pools to stimulate peripheral blood mononuclear cells (PBMCs) and enumerate antigen-specific T cells. These studies revealed a robust CD4+ and CD8+ T cell response against SARS-CoV-2 proteins in recovered patients (2-6) and a level of cross-reactivity with endemic coronaviruses in pre-pandemic samples (7-9).
A limitation of bioinformatics predictions is the difficulty in identifying immunodominant epitopes, because immunodominance is determined by multiple factors such as antigen processing, T cell repertoire, HLA alleles, and preexisting cross-reactive immunity (10-12). To identify naturally processed immunodominant CD4+ T cell epitopes, we took the unbiased approach of stimulating T memory (TM) cells with protein-pulsed antigen-presenting cells (APCs), followed by the isolation of T cell clones to precisely map the epitope recognized (13).
PBMCs from a first cohort of 14 patients who had recovered from mild to severe COVID-19 (table S1) were used to isolate total CD4+ TM cells or T central memory (TCM), T effector memory (TEM), and circulating T follicular helper (cTFH) cells (fig. S1A). The cells were labeled with carboxyfluorescein diacetate succinimidyl ester (CFSE) and stimulated with autologous monocytes in the presence of recombinant SARS-CoV-2 spike (S) protein or nucleoprotein (N). In all individuals, we observed a strong response to both antigens in terms of proliferation and interferon-γ (IFN-γ) production (Fig. 1, A and B, and fig. S1, B and C). Proliferating cells were detected at different levels in TCM, TEM, and cTFH cells, consistent with a recent report (14), and over a 1-year period (fig. S1D). By contrast, the CD4+ TM cell response to SARS-CoV-2 proteins in unexposed individuals was low or undetectable (Fig. 1B and fig. S1C), consistent with the presence of a few cross-reactive T cells primed by endemic coronaviruses (4, 5, 9).
The clonal composition of SARS-CoV-2-reactive T cells and the relationship between different memory subsets were studied in three individuals (P28, P31, and P33) by T cell receptor (TCR) Vβ sequencing. The TCM, TEM, and cTFH cell lines comprised, on average, 908, 480, and 697 S-reactive clonotypes and 1452, 623, and 908 N-reactive clonotypes, respectively (Fig. 1C and fig. S2). Unexpectedly, several of the most expanded clonotypes were shared between two subsets, and even among all three subsets (Fig. 1, C and D), indicating a polyfunctional response consistent with previous studies on intraclonal diversification of antigen-primed CD4+ T cells (15, 16).
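The sharing of clonotypes between memory subsets described above amounts to intersecting the sets of TCR Vβ clonotypes found in each subset. The following is a minimal sketch of that comparison, not the analysis pipeline used in the study; the CDR3 strings and the function name `shared_clonotypes` are hypothetical and serve only as illustration.

```python
from itertools import combinations

# Hypothetical clonotype sets per memory subset. The CDR3 strings are
# invented placeholders, not sequences reported in the study.
subsets = {
    "TCM": {"CASSLGQGAYEQYF", "CASSPDRGNTEAFF", "CASSQETQYF"},
    "TEM": {"CASSLGQGAYEQYF", "CASSQETQYF", "CASRTGGNQPQHF"},
    "cTFH": {"CASSLGQGAYEQYF", "CASSFSGANVLTF"},
}


def shared_clonotypes(subsets):
    """Return clonotypes shared by each pair of subsets and by all subsets."""
    shared = {}
    for a, b in combinations(sorted(subsets), 2):
        shared[(a, b)] = subsets[a] & subsets[b]
    shared["all"] = set.intersection(*subsets.values())
    return shared


result = shared_clonotypes(subsets)
for key, clonotypes in result.items():
    print(key, sorted(clonotypes))
```

In this toy input, one clonotype is present in all three subsets and a second is shared only by TCM and TEM, mirroring the two kinds of sharing reported in the text.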
In view of the interest in the design of a subunit vaccine, we analyzed in depth the CD4+ T cell response to the S protein, in particular to the receptor-binding domain (RBD), which is the main target of neutralizing antibodies (17, 18). CD4+ T cells from a larger cohort of 34 COVID-19-recovered individuals (table S1) were stimulated with S protein-pulsed monocytes, and proliferating T cells were cloned by limiting dilution. We obtained 2943 T cell clones and mapped their specificity using three pools of peptides spanning S1ΔRBD, RBD, and S2 (Fig. 2, A and B). RBD-specific T cell clones were found in 32 out of 34 donors, accounting for, on average, 20% of the response to the S protein (Fig. 2B). Using a matrix-based approach, we mapped the epitope specificity of 1254 RBD-reactive CD4+ T cell clones (Fig. 2C) and found that, in each individual, the clones recognized multiple sites that collectively spanned almost all of the RBD sequence. However, certain regions emerged as immunodominant, such as those spanning residues S346-S385 and S446-S485. A 20-amino acid region (S346-S365) was recognized by 94% of the individuals (30 out of 32) and by 33% of the clones (408 out of 1254) (Fig. 2D). This region is highly conserved among human sarbecoviruses, including the recently emerged variants of concern and zoonotic sarbecoviruses (Fig. 2E) (19). RBD- and S346-S365-specific T cell clones were found in different memory subsets of COVID-19-recovered individuals and were also isolated from individuals after SARS-CoV-2 mRNA vaccination (fig. S3). Thus, RBD is highly immunogenic in vivo and contains a large number of naturally processed T cell epitopes, including a conserved immunodominant region.
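The text does not describe the layout of the matrix used for epitope mapping. As an illustration of the general idea only, the sketch below assumes a common design in which peptides are arranged on a grid, each row and each column is pooled, and a clone's target is inferred from the intersection of its positive row and column pools. The peptide names, grid dimensions, and function names are hypothetical.

```python
def build_pools(peptides, n_cols):
    """Arrange peptides row-major on a grid; return row pools and column pools."""
    rows, cols = {}, {}
    for idx, pep in enumerate(peptides):
        r, c = divmod(idx, n_cols)
        rows.setdefault(r, []).append(pep)
        cols.setdefault(c, []).append(pep)
    return rows, cols


def deconvolute(rows, cols, positive_rows, positive_cols):
    """Return peptides present in both a positive row pool and a positive column pool."""
    candidates = []
    for r in positive_rows:
        for c in positive_cols:
            candidates.extend(sorted(set(rows[r]) & set(cols[c])))
    return candidates


# Twelve placeholder peptides on an assumed 3 x 4 grid (3 row pools, 4 column pools).
peptides = [f"pep{i:02d}" for i in range(1, 13)]
rows, cols = build_pools(peptides, n_cols=4)

# A clone that proliferates only with row pool 1 and column pool 2.
hits = deconvolute(rows, cols, positive_rows=[1], positive_cols=[2])
print(hits)
```

With this design, 12 peptides are screened with 7 pools instead of 12 single-peptide cultures. A clone that responds to two adjacent column pools would yield two candidate peptides, as expected when neighbouring peptides overlap in sequence.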
To study the CD4+ T cell response to the immunodominant S346-S365 region, we sequenced TCR Vβ chains of 329 specific T cell clones. The 206 clonotypes identified used a broad spectrum of TCR Vβ genes and, even in the same individual, carried different CDR3 sequences (Fig. 3, A and B, and table S2). In P31 and P33, certain S346-S365 clonotypes were detected among the top 5% expanded TM cells ex vivo (Fig. 3C). Using blocking antibodies, we determined that most of the T cell clones analyzed (n = 247 from 22 individuals) were HLA-DR restricted, whereas the remaining clones (n = 50 from five individuals) were HLA-DP restricted and one was HLA-DQ restricted (Fig. 3, D and E). Using truncated peptides and T cell clones from individuals with different HLA types (table S3), we defined two HLA-DR-restricted epitopes (VYAWNRKRIS and RFASVYAWNRKR) and one HLA-DP-restricted epitope (NRKRISNCVAD) (Fig. 3F). Thus, the S346-S365 region comprises at least three nested epitopes recognized in association with different allelic forms of HLA-DR or HLA-DP by T cell clones that use a large set of TCR Vβ genes and CDR3 of different sequence and length.
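The nesting of the three epitopes can be checked directly from the sequences given above. The sketch below is an illustration rather than part of the study's methods: it merges the peptides by their suffix-prefix overlaps and finds the longest stretch common to all three. The function names are hypothetical.

```python
def merge_overlapping(left, right):
    """Merge two peptides on the longest suffix of `left` that is a prefix of `right`.

    Returns the merged sequence and the overlapping segment.
    """
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return left + right[k:], right[:k]
    raise ValueError("peptides do not overlap")


def longest_common_substring(seqs):
    """Return the longest substring of the first sequence found in every sequence."""
    first = seqs[0]
    for length in range(len(first), 0, -1):
        for start in range(len(first) - length + 1):
            candidate = first[start:start + length]
            if all(candidate in s for s in seqs):
                return candidate
    return ""


# Epitope sequences as reported in the text, ordered from N- to C-terminal.
epitopes = ["RFASVYAWNRKR", "VYAWNRKRIS", "NRKRISNCVAD"]

span, overlap_12 = merge_overlapping(epitopes[0], epitopes[1])
span, overlap_23 = merge_overlapping(span, epitopes[2])
core = longest_common_substring(epitopes)

print(span, len(span))         # RFASVYAWNRKRISNCVAD 19
print(overlap_12, overlap_23)  # VYAWNRKR NRKRIS
print(core)                    # NRKR
```

The three epitopes together cover 19 contiguous residues, which fits within the 20-amino acid S346-S365 region, and all three share the four-residue stretch NRKR.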
To address the extent of T cell cross-reactivity between different S proteins, SARS-CoV-2 S protein-specific T cell lines from P28 and P33 were relabeled with CFSE and stimulated with S proteins from endemic human coronaviruses. In these secondary cultures, a robust proliferation was observed in response to SARS-CoV and HKU1 (Fig. 4A). Unexpectedly, a sizeable fraction of clonotypes in SARS-CoV-2 primary cultures (ranging from 7 to 25%) were found in SARS-CoV and/or HKU1 secondary cultures, consistent with a substantial degree of T cell cross-reactivity (fig. S4). To corroborate this finding, we isolated from secondary cultures several T cell clones that proliferated in response to two or even three different naturally processed S proteins (Fig. 4B and table S4).
Cross-reactive T cells may derive from preexisting memory T cells or from the priming of naïve T cells. We therefore analyzed a COVID-19-recovered individual from whom we had previously cryopreserved PBMCs. A robust CD4+ TM cell proliferation in the pre-COVID-19 sample was detected against NL63 and 229E S proteins, whereas the response to HKU1 and OC43 was limited and the response to SARS-CoV and SARS-CoV-2 undetectable (Fig. 4C). Conversely, in the post-COVID-19 sample, strong T cell proliferation was observed not only in response to SARS-CoV-2, but also in response to all other alpha and beta coronavirus S proteins (Fig. 4, C and D), and shared clonotypes were detected between SARS-CoV-2 and endemic coronavirus S protein-stimulated cultures (fig. S5A). Furthermore, T cell clones isolated from cultures stimulated by SARS-CoV, OC43, or NL63 proliferated in response to the SARS-CoV-2 S peptide pool, and their specificity was mapped primarily to the S2 region (Fig. 4, E and F), consistent with its high degree of sequence conservation (20-22). T cell clones that fully cross-reacted with all S proteins were mapped to the highly conserved fusion peptide (Fig. 4G).
To determine whether S-reactive T cells in the post-COVID-19 sample could be detected in pre-pandemic samples, we performed clonotypic analysis of total TM cells on the post-COVID-19 sample and on samples collected in 2014 and 2017. Most of the SARS-CoV-2-specific clonotypes identified above were found only in the post-COVID-19 sample, consistent with priming of naïve T cells (fig. S5B). By contrast, clonotypes specific to endemic coronaviruses were found at a comparable number at all time points. Some T cell clonotypes against the highly conserved fusion peptide could be tracked back to the 2014 sample and were found to be expanded in the post-COVID-19 sample (fig. S5C). These findings demonstrate that preexisting cross-reactive TM cells are recalled and expanded upon SARS-CoV-2 infection.
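The logic of this longitudinal comparison can be sketched as a simple classification of each clonotype in the post-COVID-19 sample by whether it was detected in an earlier sample and whether its frequency increased. This is an illustrative sketch under assumed inputs, not the study's analysis: the clonotype identifiers, frequencies, and the rule "higher than any earlier frequency" as the definition of expansion are all hypothetical.

```python
def classify(pre_samples, post_sample):
    """Label each post-infection clonotype by prior detection and expansion.

    pre_samples: dict mapping sample label -> {clonotype: frequency}
    post_sample: {clonotype: frequency} for the post-infection sample
    """
    labels = {}
    for clonotype, post_freq in post_sample.items():
        pre_freqs = [s[clonotype] for s in pre_samples.values() if clonotype in s]
        if not pre_freqs:
            labels[clonotype] = "post-infection only"
        elif post_freq > max(pre_freqs):
            labels[clonotype] = "preexisting, expanded"
        else:
            labels[clonotype] = "preexisting, not expanded"
    return labels


# Invented frequencies (fraction of total TM cell reads) for three placeholder clonotypes.
pre_samples = {
    "2014": {"clone_A": 1e-5, "clone_B": 2e-5},
    "2017": {"clone_B": 2e-5},
}
post_sample = {"clone_A": 4e-4, "clone_B": 2e-5, "clone_C": 1e-4}

labels = classify(pre_samples, post_sample)
for clonotype, label in sorted(labels.items()):
    print(clonotype, label)
```

In the terms of the text, `clone_C` behaves like a SARS-CoV-2-specific clonotype primed from the naïve pool, `clone_B` like an endemic coronavirus-specific clonotype present at comparable levels over time, and `clone_A` like a recalled cross-reactive clonotype.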
The robust CD4+ T cell response to the RBD and the identification of the S346-S365 immunodominant region conserved in the emerging SARS-CoV-2 variants of concern provide the rationale for the development of a subunit vaccine based on the RBD because it is the target of most neutralizing antibodies (17, 18). These findings were not anticipated in previous studies based on bioinformatics predictions (2, 3) and short-term peptide stimulation of PBMCs, highlighting the value of combining T cell stimulation with protein antigens with cloning and TCR sequencing for the analysis of antigen-specific T cell repertoires.
The immunodominance of RBD S346-S365 at the individual level and at the population level may be due to the presence of three nested T cell epitopes presented by HLA-DR and HLA-DP and to the relative abundance of naturally processed peptides, as recently reported in an immunopeptidomics study (23). The S346-S365 region is also a contact site for the broadly reactive neutralizing antibody S309 (24), providing a good example of convergence of B and T cells around a conserved epitope.
Our study also provides evidence for the recall of preexisting cross-reactive TM cells upon SARS-CoV-2 infection. However, this phenomenon, reminiscent of the "original antigenic sin" (25), does not prevent a robust and persistent primary response to new epitopes of SARS-CoV-2 that is characterized by extensive intraclonal diversification into TEM, cTFH, and TCM cells, which represent inflammatory, helper, and long-lived TM cells, respectively (26, 27). The availability of a large number of cross-reactive T cell clones is instrumental not only for defining target sites in relevant pathogens but also for understanding whether cross-reactivity is due to epitope structural similarities or to TCR-binding degeneracy (11, 28).
The possibility of leveraging a robust, cross-reactive T helper cell function against conserved sites will be instrumental in driving neutralizing antibody responses to adaptive vaccines that incorporate escape mutations found in emerging SARS-CoV-2 variants.
The authors declare no competing interests. Data and materials availability: TCR Vβ sequences have been deposited in the ImmuneACCESS database (29). All other data are available in the main text or the supplementary materials. This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. This license does not apply to figures/photos/artwork or other content included in the article that is credited to a third party; obtain authorization from the rights holder before using such material.

SUPPLEMENTARY MATERIALS
science.sciencemag.org/content/372/6548/1336/suppl/DC1
Materials and Methods
Tables S1 to S5
Figs. S1 to S5
References (30, 31)
MDAR Reproducibility Checklist
with pools of peptides spanning the S1ΔRBD, RBD, and S2 regions of the SARS-CoV-2 S protein. Histograms show the percentage of clones specific for each region. The total number of clones tested is indicated at the top.
(G) Characterization of representative cross-reactive T cell clones isolated from P34 post-COVID-19 sample. Left panels report the proliferative response (day 3 cpm) of T cell clones stimulated with titrated doses of recombinant S proteins in the presence of autologous monocytes. The peptides recognized are indicated on the right panels. Shown are sequence alignments of the recognized SARS-CoV-2 epitopes (S816-S830 and S981-S1000) with homologous sequences of endemic alpha and beta coronaviruses. Dots indicate amino acid residues identical to the SARS-CoV-2 reference strain.