Selective and cross-reactive SARS-CoV-2 T cell epitopes in unexposed humans

Preexisting immune response to SARS-CoV-2
Robust T cell responses to the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus occur in most individuals with coronavirus disease 2019 (COVID-19). Several studies have reported that some people who have not been exposed to SARS-CoV-2 have preexisting reactivity to SARS-CoV-2 sequences. The immunological mechanisms underlying this preexisting reactivity are not clear, but previous exposure to widely circulating common cold coronaviruses might be involved. Mateus et al. found that the preexisting reactivity against SARS-CoV-2 comes from memory T cells and that cross-reactive T cells can specifically recognize a SARS-CoV-2 epitope as well as the homologous epitope from a common cold coronavirus. These findings underline the importance of determining the impacts of preexisting immune memory in COVID-19 disease severity.
Science, this issue p. 89

The emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in late 2019 and its subsequent global spread have led to millions of infections and substantial morbidity and mortality (1). Coronavirus disease 2019 (COVID-19), the clinical disease caused by SARS-CoV-2 infection, can range from mild, self-limiting disease to acute respiratory distress syndrome and death (2). The mechanisms underlying the spectrum of COVID-19 disease severity states and the nature of protective immunity against COVID-19 remain unclear.
Studies investigating the human immune response against SARS-CoV-2 have begun to characterize SARS-CoV-2 antigen-specific T cell responses (3-8), and multiple studies have described marked activation of T cell subsets in acute COVID-19 patients (9-13). Unexpectedly, antigen-specific T cell studies performed with five different cohorts reported that 20 to 50% of people who had not been exposed to SARS-CoV-2 had significant T cell reactivity directed against peptides corresponding to SARS-CoV-2 sequences (3-7). The studies were from geographically diverse cohorts (the United States, the Netherlands, Germany, Singapore, and the United Kingdom), and the general pattern observed was that the T cell reactivity found in unexposed individuals was predominantly mediated by CD4+ T cells. It was speculated that this phenomenon might be due to preexisting memory responses against human "common cold" coronaviruses (HCoVs) such as HCoV-OC43, HCoV-HKU1, HCoV-NL63, and HCoV-229E. These HCoVs share partial sequence homology with SARS-CoV-2, are widely circulating in the general population, and are typically responsible for mild respiratory symptoms (14-16). However, the hypothesis of cross-reactive immunity between SARS-CoV-2 and common cold HCoVs still awaits experimental trials. This potential preexisting cross-reactive T cell immunity to SARS-CoV-2 has broad implications because it could explain aspects of differential COVID-19 clinical outcomes, influence epidemiological models of herd immunity (17,18), or affect the performance of COVID-19 candidate vaccines.

Epitope repertoire in SARS-CoV-2-unexposed individuals
To define the repertoire of CD4+ T cells recognizing SARS-CoV-2 epitopes in previously unexposed individuals, we used in vitro stimulation of peripheral blood mononuclear cells (PBMCs) for 2 weeks with pools of 15-mer peptides. This method is known to be robust for detecting low-frequency T cell responses to allergens and bacterial or viral antigens (19,20), including naive T cells (21). For screening SARS-CoV-2 epitopes, we used PBMC samples from unexposed subjects collected between March 2015 and March 2018, well before the global circulation of SARS-CoV-2 occurred. The unexposed subjects were confirmed to be seronegative for SARS-CoV-2 (fig. S1A).
SARS-CoV-2-reactive T cells were expanded, with one pool of peptides spanning the entire sequence of the spike protein (CD4-S) and the other a nonspike "megapool" (CD4-R) of predicted epitopes from the nonspike regions (i.e., "remainder") of the viral genome (4). In total, 474 15-mer SARS-CoV-2 peptides were screened. After 14 days of stimulation, T cell reactivity against intermediate "mesopools," each encompassing ~10 peptides, was assayed using a FluoroSPOT assay (e.g., 22 CD4-R mesopools; fig. S2A). Positive mesopools were further deconvoluted to identify specific individual SARS-CoV-2 epitopes. Representative results from one donor show the deconvolution of mesopools P6 and P18 to identify seven different SARS-CoV-2 epitopes (fig. S2B). Intracellular cytokine-staining assays specific for interferon-γ (IFN-γ) determined whether antigen-specific T cells responding to the SARS-CoV-2 mesopools were CD4+ or CD8+ T cells (fig. S2C). Results from the 44 donors/CD4-R mesopool and 40 donors/CD4-S mesopool combinations yielding a positive response are shown in fig. S2, D and E, respectively. In 82/88 cases (93.2%), the cells responding to SARS-CoV-2 mesopool stimulation were clearly CD4+ T cells, as judged by the ratio of CD4/CD8-responding cells; in four cases (4.5%), the responding cells were CD8+ T cells; and in two cases (2.3%), the responses were mediated by both CD4+ and CD8+ T cells. The fact that CD8+ T cells were rarely detected was not surprising because the peptides used in CD4-R encompassed predicted class II epitopes and the CD4-S is composed of 15-mer peptides (9- to 10-mer peptides are optimal for CD8+ T cells). Furthermore, the 2-week restimulation protocol was originally designed to expand CD4+ T cells (20). Overall, these results indicated that the peptide-screening strategy used mapped SARS-CoV-2 epitopes recognized by CD4+ T cells in unexposed individuals.
A total of 142 SARS-CoV-2 epitopes were identified, 66 from the spike protein (CD4-S) and 76 from the remainder of the genome (CD4-R) (table S1). For each combination of epitope and responding donor, potential human leukocyte antigen (HLA) restrictions were inferred on the basis of the predicted HLA-binding capacity of the particular epitope for the specific HLA alleles present in the responding donor (22). Each donor recognized an average of 11.4 epitopes (range 1 to 33, median 6.5; fig. S3A). Forty of the 142 epitopes were recognized by two or more donors (fig. S3B), accounting for 55% of the total response (fig. S3C). These 142 mapped SARS-CoV-2 epitopes may prove useful in future studies as reagents for tracking CD4+ T cells in SARS-CoV-2-infected individuals and in COVID-19 vaccine trials.
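The pooling logic described above (15-mer peptides grouped into mesopools of about 10, with positive mesopools then deconvoluted into individual peptides) can be sketched in a few lines. This is only an illustrative sketch, not the authors' pipeline: the function names, the 5-residue tiling step, and the toy sequence are assumptions introduced here, and the text does not state how much adjacent peptides overlap.

```python
from typing import Dict, List


def tile_peptides(sequence: str, length: int = 15, step: int = 5) -> List[str]:
    """Cut a protein sequence into overlapping peptides of fixed length.

    The 5-residue step is an assumption for illustration; the text only
    says that 15-mer peptides span the spike sequence.
    """
    if len(sequence) < length:
        return [sequence]
    starts = list(range(0, len(sequence) - length + 1, step))
    # Add a final peptide if the C-terminal residues are not yet covered.
    if starts[-1] != len(sequence) - length:
        starts.append(len(sequence) - length)
    return [sequence[s:s + length] for s in starts]


def make_mesopools(peptides: List[str], pool_size: int = 10) -> Dict[str, List[str]]:
    """Group consecutive peptides into mesopools named P1, P2, ..."""
    return {
        f"P{i // pool_size + 1}": peptides[i:i + pool_size]
        for i in range(0, len(peptides), pool_size)
    }


def peptides_to_deconvolute(pools: Dict[str, List[str]],
                            positive_pools: List[str]) -> List[str]:
    """List the individual peptides to retest for each positive mesopool."""
    return [pep for name in positive_pools for pep in pools[name]]


# Toy 40-residue sequence (hypothetical, not a viral protein).
toy_sequence = "ACDEFGHIKLMNPQRSTVWY" * 2
toy_pools = make_mesopools(tile_peptides(toy_sequence), pool_size=4)
retest = peptides_to_deconvolute(toy_pools, ["P2"])
```

The two-stage design keeps the number of assays low: only peptides from mesopools that score positive need to be tested individually.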

Epitope distribution by ORF of origin
Although a broad range of different SARS-CoV-2 antigens were recognized, several of the epitopes yielding the most frequent (i.e., recognized in multiple donors) or most vigorous [i.e., the most spot-forming cells (SFCs)/10^6 cells] responses were derived from the SARS-CoV-2 spike antigen (table S1). We therefore assessed the overall distribution of the 142 T cell epitopes mapped among all SARS-CoV-2 proteins compared with the relative size of each SARS-CoV-2 antigen (Fig. 1, A and B).
Fifty-four percent of the total positive response was associated with spike-derived epitopes [Fig. 1A; 11% for receptor-binding domain (RBD), and 44% for the non-RBD portion of spike]. Of relevance for COVID-19 vaccine development, only 20% of the spike responses were derived from the RBD region (Fig. 1A; comparing 11 versus 44%, as described above), and the RBD region accounted for only 11% of the overall CD4+ T cell reactivity (Fig. 1A). Mapped epitopes were fairly evenly distributed across the SARS-CoV-2 genome in proportion to the size of each protein (Fig. 1B; P = 0.038, r = 0.42). In addition to the strong responses directed to spike, responses were also seen for open reading frame 6 (ORF6), ORF3a, N, ORF8, and within ORF1a/b, where nsp3, nsp12, nsp4, nsp6, nsp2, and nsp14 were more prominently recognized. These mapped epitope results at the ORFeome level partially overlap with the ORFs targeted by CD4+ T cells in COVID-19 cases (4). No epitopes derived from the membrane protein (M) were identified in unexposed individuals (Fig. 1B), but M is robustly recognized by SARS-CoV-2-specific CD4+ T cell responses in COVID-19 cases (4). The lack of quality class II epitopes in M was unsurprising based on M molecular biology: M is a small protein with three transmembrane domains. Combined, the data indicate that class II epitopes are relatively broadly available across the SARS-CoV-2 genome but that SARS-CoV-2 memory CD4+ T cells preferentially target proteins highly expressed during infection, as exemplified by M and S (spike) epitope-mapping results.
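The comparison of epitope counts with protein size (Fig. 1B) is a correlation between two per-protein quantities. The sketch below shows one way such a coefficient could be computed. It assumes a rank-based (Spearman) correlation, which this passage does not specify, and the per-protein values are hypothetical placeholders rather than the study's data.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical placeholder values: protein length (residues) and
# number of mapped epitopes for five made-up proteins.
protein_lengths = [100, 250, 400, 900, 1300]
epitope_counts = [1, 0, 5, 4, 12]
r = spearman(protein_lengths, epitope_counts)
```

A rank-based coefficient is less sensitive than a linear one to a single very large protein dominating the fit, which is one reason it might be preferred for data of this shape.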

Sequence homology of the identified SARS-CoV-2 epitopes to other common HCoVs
When this epitope-mapping study was initiated, an assumption was that the in vitro T cell culture epitope mapping would reveal an epitope repertoire associated with de novo generation of responses from naïve T cells. However, while these epitope-mapping studies were in progress, we and others detected significant ex vivo reactivity against bulk pools of SARS-CoV-2 peptides (3-7) and speculated that this might reflect the presence of memory T cells cross-reactive between HCoVs and SARS-CoV-2. These other HCoVs circulate widely in human populations and are typically responsible for mild, usually undiagnosed, respiratory illnesses such as the common cold (14-16). However, there is currently a lack of experimental data addressing whether memory CD4+ T cells that are cross-reactive between SARS-CoV-2 and other HCoVs do indeed exist.
We therefore next determined the degree of homology to all four widely circulating HCoVs for all 142 SARS-CoV-2 epitopes identified herein. For the analysis, we split the peptides into three groups based on immunogenicity as follows: (i) never immunogenic, (ii) immunogenic in one individual, or (iii) immunogenic in two or more individuals (Fig. 1C). There was significantly higher sequence similarity in peptides recognized by more than one individual compared with peptides recognized by a single individual or not recognized at all (P < 0.0001, two-tailed Mann-Whitney test). Additionally, almost all donors from the unexposed cohort used for the epitope screen were seropositive for three widely circulating HCoVs. These results suggest that T cell cross-reactivity is plausible between SARS-CoV-2 and HCoVs already established in the human population.
Fig. 1 (caption fragment). Red indicates donor-epitope combinations with sequence identity >67% with common cold coronaviruses, and blue indicates highly reactive donor-epitope combinations (>1000 SFCs per 10^6 cells) with sequence identity ≤67%. In (C) and (D), statistical comparisons were performed with a two-tailed Mann-Whitney test. ***P < 0.001, ****P < 0.0001.
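The homology comparison described above rests on a per-peptide measure of sequence identity between a SARS-CoV-2 15-mer and its counterpart in each HCoV. A minimal sketch of that measure follows, assuming the peptides are already aligned and of equal length (no gaps). The function names and the toy peptides are hypothetical, and the authors' actual alignment procedure is not described in this section.

```python
from typing import Dict, Tuple


def percent_identity(peptide_a: str, peptide_b: str) -> float:
    """Percentage of positions with identical residues (ungapped)."""
    if not peptide_a or len(peptide_a) != len(peptide_b):
        raise ValueError("peptides must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(peptide_a, peptide_b))
    return 100.0 * matches / len(peptide_a)


def best_homolog(epitope: str, homologs: Dict[str, str]) -> Tuple[str, float]:
    """Return the HCoV whose aligned peptide is most similar to the epitope."""
    scored = {virus: percent_identity(epitope, pep)
              for virus, pep in homologs.items()}
    best_virus = max(scored, key=scored.get)
    return best_virus, scored[best_virus]


# Toy sequences for illustration only (not real epitopes).
toy_epitope = "AAAAAAAAAAAAAAA"
toy_homologs = {
    "HCoV-OC43": "AAAAAAAAAACCCCC",   # 10 of 15 positions identical
    "HCoV-229E": "AAAAACCCCCCCCCC",   # 5 of 15 positions identical
}
virus, identity = best_homolog(toy_epitope, toy_homologs)
```

For a 15-mer, each matching residue contributes about 6.7 percentage points, so 10 identical positions correspond to roughly 67% identity.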
To select the epitope subsets to be analyzed in more detail, we plotted the T cell response magnitude of each positive epitope per donor (Fig. 1D). This analysis confirmed the dominance of the spike antigen over the epitopes derived from the remainder of the genome (P < 0.001, two-tailed Mann-Whitney test).
Next, we selected two categories of SARS-CoV-2 epitopes of interest. The first category was epitopes with potential cross-reactivity from HCoVs. We initially selected the 67% arbitrary cutoff (equivalent to 10 of 15 identical residues in a 15-mer) because we reasoned that a 9-mer is the epitope region involved in binding to class II (23) and that one or two residues in addition to the 9-mer core region are often required for optimal recognition (24) (Fig. 1D, red). Second, we independently filtered for any epitopes associated with high responses (top ~30%; Fig. 1D, blue). This resulted in the selection of 31 epitopes from spike (six with high homology and 25 for dominant responses) organized in a new CD4-[S31] pool. Similarly, we generated a new CD4-[R30] pool composed of 30 epitopes from the remainder of the genome (nine with high homology and 21 associated with strong responses; Fig. 1D). These epitope pools were then used for further CD4+ T cell studies.
Next, we used an activation-induced marker assay (25-27) to detect virus-specific T cells in a new set of unexposed donors not used for the epitope identification studies (Fig. 2A and table S4) and a set of convalescent COVID-19 patients (table S5). We detected significant ex vivo CD4+ T cell responses against the SARS-CoV-2 nonspike (CD4-R) and spike (CD4-S) peptides compared with the negative control [dimethyl sulfoxide (DMSO)] (Fig. 2, B and C; P < 0.0001 and P < 0.0001, respectively, two-tailed Mann-Whitney test). These responses were increased in COVID-19 cases compared with unexposed subjects (Fig. 2D; P = 0.0015 and P = 0.0022, respectively, two-tailed Mann-Whitney test), as previously reported (4). In the unexposed subjects, significant frequencies of CD4+ T cells were detected against the CD4-R30 and CD4-S31 SARS-CoV-2 epitope pools compared with the negative control (P = 0.0063 and P = 0.0012, respectively, two-tailed Mann-Whitney test). Significant CD4+ T cell reactivity was also seen against the corresponding HCoV-R129 and HCoV-S124 pools of matching homologous peptides from other HCoVs (Fig. 2D; P < 0.0001 and P < 0.0001, two-tailed Mann-Whitney test). Detection of CD4+ T cells with peptide pools selected on the basis of homology was consistent with the hypothesis that cross-reactive CD4+ T cells between SARS-CoV-2 and other HCoVs exist in many individuals.
Next, we examined the ex vivo memory phenotype of the T cells responding to the various epitope megapools. Results from one representative unexposed donor are shown in Fig. 3A. Responding cells in unexposed donors were predominantly found in the effector memory CD4+ T cell population (CD45RAneg CCR7neg), followed by the central memory T cells (CD45RAneg CCR7pos) (30) (Fig. 3, A, B, and D). Comparable patterns of effector and central memory cells were observed among the antigen-specific CD4+ T cells detected in the COVID-19 cases (Fig. 3, C and D). The CD4+ T cells in unexposed donors that recognize SARS-CoV-2 epitopes and epitopes from other HCoVs have a memory phenotype. Overall, these data are consistent with the SARS-CoV-2-reactive CD4+ T cells in unexposed subjects being HCoV-specific memory CD4+ T cells with cross-reactivity to SARS-CoV-2.
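The two selection criteria above (high homology to an HCoV, or a dominant response regardless of homology) amount to a simple two-step filter. The sketch below illustrates that logic under stated assumptions: the record type, field names, and example values are hypothetical; the response cutoff of 1000 SFCs per 10^6 cells is taken from the Fig. 1 caption; and rounding identity to the nearest whole percent (so that 10 of 15 residues counts as 67%) is an assumption about how the cutoff was applied.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EpitopeHit:
    peptide_id: str            # hypothetical identifier
    max_hcov_identity: float   # best percent identity to any HCoV homolog
    sfc_per_million: float     # response magnitude, SFCs per 10^6 cells


def classify(hit: EpitopeHit,
             identity_cutoff: float = 67.0,
             response_cutoff: float = 1000.0) -> str:
    """Assign a hit to the homology group, the dominant group, or neither."""
    # Homology is checked first, so a strong and homologous epitope is
    # counted in the homology category (assumed reading of Fig. 1D).
    if round(hit.max_hcov_identity) >= identity_cutoff:
        return "homology"
    if hit.sfc_per_million > response_cutoff:
        return "dominant"
    return "not_selected"


def select_pool(hits: List[EpitopeHit]) -> List[EpitopeHit]:
    """Keep every hit that falls into either selected category."""
    return [h for h in hits if classify(h) != "not_selected"]


# Hypothetical example records, for illustration only.
example_hits = [
    EpitopeHit("pep-a", 66.7, 50.0),
    EpitopeHit("pep-b", 40.0, 2500.0),
    EpitopeHit("pep-c", 40.0, 100.0),
]
pool = select_pool(example_hits)
```

Keeping the two criteria separate makes it possible to ask later whether cross-reactivity tracks homology or simply response strength.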

Identification of SARS-CoV-2 epitopes cross-reactive with other common HCoVs
The epitopes derived from the CD4-R30 and CD4-S31 pools were used to generate short-term T cell lines derived by stimulation of PBMCs from unexposed subjects. PBMCs were stimulated with an individual SARS-CoV-2 cognate epitope demonstrated to be recognized by T cells from that subject (Fig. 1 and table S1). Overall, T cell lines could be derived that were specific for a total of 42 SARS-CoV-2 epitopes.
These T cell lines were next tested for cross-reactivity against various coronavirus homologs, analogous to an approach previously successful in flavivirus studies (31). Cross-reactivity between SARS-CoV-2 epitope recognition and other HCoV epitope recognition was detected for 10/42 (24%) of the T cell lines (Fig. 4, A to J). Cross-reactivity was associated with epitopes derived from SARS-CoV-2 spike, N, nsp8, nsp12, and nsp13. In three cases, HCoV analogs were better antigens than the SARS-CoV-2 peptide, suggesting that they may be the cognate immunogen (Fig. 4, E, I, and J). One SARS-CoV-2 spike epitope was tested in two different donors with similar findings, suggesting that HCoV cross-reactivity patterns are recurrent across individuals. Non-cross-reactive SARS-CoV-2 T cell lines are also shown (Fig. 4, K and L, and fig. S4). It is possible that cross-reactivity to these epitopes might be detected if T cell lines from additional individuals were to be tested. In addition, these epitopes might be homologous to some other, as yet unidentified viral sequence or be recognized by cognate naive T cells expanding in the in vitro culture (32). In addition, only 3/18 cases of strong response epitopes (defined in Fig. 1D [...] S8). These data demonstrate that memory CD4+ T cells recognizing common cold coronaviruses including HCoV-OC43, HCoV-HKU1, HCoV-NL63, and HCoV-229E can exhibit substantial cross-reactivity to the homologous epitope in SARS-CoV-2.
Next, we examined, for each SARS-CoV-2:HCoV epitope pair, the degree of amino acid sequence homology and any relationship between homology and T cell cross-reactivity, considering different ranges of potentially relevant homology. Only 1% (1/99) of peptide pairs with 33 to 40% homology were cross-reactive. In the 47 to 60% epitope homology range, we observed cross-reactivity in 21% of cases (7/33). Epitope homology ≥67% was associated with cross-reactivity in 57% of cases (21/[...] 40% range epitopes or the 47 to 60% range, respectively). A relationship was thus observed between epitope homology and CD4+ T cell cross-reactivity. The data demonstrated that the arbitrary selection used as described in Fig. 1D was indeed supported by the experimental data. Thus, ~67% amino acid homology appears to be a useful benchmark for consideration of potential cross-reactivity between class II epitopes.
In summary, we have identified more than 140 human T cell epitopes derived from across the genome of SARS-CoV-2. We provide direct evidence that numerous CD4+ T cells that react to SARS-CoV-2 epitopes actually cross-react with corresponding homologous sequences from any of the many different commonly circulating HCoVs, and that these reactive cells are largely canonical memory CD4+ T cells. These findings of cross-reactive HCoV T cell specificities are in stark contrast to HCoV-neutralizing antibodies, which are HCoV species specific and did not show cross-reactivity against SARS-CoV-2 RBD (33-35). On the basis of these data, it is plausible to hypothesize that preexisting cross-reactive HCoV CD4+ T cell memory in some donors could be a contributing factor to variations in COVID-19 patient disease outcomes, but at present this is highly speculative (36).
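The relationship between homology range and cross-reactivity reported above can be checked with a contingency-table test on the counts. The sketch below compares only the two ranges whose counts are complete in this text (1/99 for 33 to 40% homology and 7/33 for 47 to 60% homology). It uses a two-sided Fisher's exact test, which is an assumption, because the statistical test applied to these ranges is not legible in this passage; the function names are also hypothetical.

```python
from math import comb


def cross_reactive_rate(cross_reactive: int, total: int) -> float:
    """Percentage of peptide pairs in a homology range that cross-react."""
    return 100.0 * cross_reactive / total


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    low, high = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(prob(x) for x in range(low, high + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


# Counts taken from the text: cross-reactive vs. not cross-reactive pairs.
low_range = (1, 99 - 1)      # 33 to 40% homology
mid_range = (7, 33 - 7)      # 47 to 60% homology
rate_low = cross_reactive_rate(1, 99)
rate_mid = cross_reactive_rate(7, 33)
p_low_vs_mid = fisher_exact_two_sided(*low_range, *mid_range)
```

An exact test is a reasonable choice here because one cell of the table contains a single observation, where large-sample approximations are unreliable.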