Naive human B cells engage the receptor binding domain of SARS-CoV-2, variants of concern, and related sarbecoviruses

Initial exposure to a pathogen elicits an adaptive immune response to control and eradicate the threat. Interrogating the abundance and specificity of the naive B cell repertoire drives understanding of how to mount protective responses. Here, we isolated naive B cells from 8 seronegative human donors targeting the SARS-CoV-2 receptor-binding domain (RBD). Single cell B cell receptor (BCR) sequencing identified diverse gene usage and no restriction on complementarity determining region length. A subset of recombinant antibodies produced by naive B cell precursors bound to SARS-CoV-2 RBD and engaged circulating variants including B.1.1.7, B.1.351, and B.1.617.2, as well as pre-emergent bat-derived coronaviruses RaTG13, SHC104, and WIV1. By structural characterization of a naive antibody in complex with SARS-CoV-2 spike, we identified a conserved mode of recognition shared with infection-induced antibodies. We found that representative naive antibodies could signal in a B cell activation assay, and by using directed evolution we could select for a higher affinity RBD interaction, conferred by a single amino acid change. Additionally, the minimally mutated, affinity-matured antibodies potently neutralized SARS-CoV-2. Understanding the SARS-CoV-2 RBD-specific naive repertoire may inform potential responses capable of recognizing future SARS-CoV-2 variants or emerging coronaviruses enabling the development of pan-coronavirus vaccines aimed at engaging protective germline responses.


INTRODUCTION
Initial exposure to viral antigens by natural infection or vaccination primes an immune response and often establishes immune memory which can prevent or control future infections. The naive repertoire contains potential B cell receptor (BCR) rearrangements capable of recognizing these antigens, which are often surface-exposed glycoproteins. An early step in generating humoral immunity involves activation of these naive B cells through recognition of a cognate antigen (1) which in turn can lead to affinity maturation through somatic hypermutation (SHM) and subsequent differentiation (2). The initial engagement of the naive repertoire begins this cascade and often coincides with the eventual generation of a protective or neutralizing antibody response (3).
For SARS-CoV-2, the etiological agent of COVID-19, the development of a neutralizing antibody response after primary infection or vaccination is associated with protection against reinfection in non-human primates (4,5). In humans, the presence of neutralizing antibodies can predict disease severity and survival after primary SARS-CoV-2 infection (6) or vaccination (7). Furthermore, the two arms of humoral immune memory, long-lived bone marrow plasma cells (8) and circulating memory B cells (9,10), are induced by natural infection in humans and may persist for at least 8 months after primary infection, providing potentially durable long-term protection. Comparable levels of neutralizing antibody titers are present in convalescent COVID-19 subjects and vaccine recipients (11) further supporting the role of adaptive immune responses in helping to control and prevent disease severity.
Both infection and vaccine-elicited antibodies predominantly target the major SARS-CoV-2 envelope glycoprotein, spike, present on the virion surface. A substantial component of the neutralizing response engages the receptor binding domain (RBD) and does so by directly blocking interactions with the ACE2, the host receptor for viral entry (12). Isolated RBD-directed monoclonal antibodies are derived from diverse heavy-and light-chain variable gene segments, suggesting that multiple biochemical solutions for developing RBD-directed antibodies are encoded within the human B-cell repertoire (13,14). Potential immunogenicity of this antigenic site is based on the human naive B cell repertoire, and the overall frequency of naive BCRs that have some level of intrinsic affinity to stimulate their elicitation (15)(16)(17). However, antigen-specificity of naive B cells is largely undefined.
Traditional approaches for studying antigen-specific naive B cells include bioinformatic mining of available BCR datasets and inference of likely germline precursors by "germlinereverting" mature BCR sequences. This can be limited by the availability of heavy and light chain paired sequence data, and unreliable complementarity-determining region 3 (CDR3) loop approximation, respectively. Here, we address this limitation by characterizing human naive B cells specific for the SARS-CoV-2 RBD directly from the peripheral blood of seronegative donors to understand their relative abundance, intrinsic affinity, and potential for activation. Furthermore, we asked whether the SARS-CoV-2 specific naive repertoire could also engage related circulating variants of concern and pre-pandemic coronaviruses (CoVs). We find that SARS-CoV-2 RBD-specific naive B cells were of unrestricted gene usage and several isolated B cells had affinity for circulating SARS-CoV-2 variants and related CoV-RBDs. We determined the structure of a representative naive antibody that binds the SARS-CoV-2 RBD with a mode of recognition similar to a multi-donor class of antibodies prevalent in human responses to SARS-CoV-2 infection. Furthermore, we improved the affinity of two representative naive antibodies for RBD and showed that the starting naive specificity dictated the breadth of evolved clones to circulating variants. The analysis of the human naive antigen-specific B cell repertoire for the SARS-CoV-2 RBD and its capacity to recognize related variants and emerging CoVs may inform the rational design of epitope-focused immunogens for next generation vaccines.

Isolated SARS-CoV-2-specific naive B cells are genetically diverse
To measure the reactivity of naive human B cells specific for the SARS-CoV-2 RBD we adapted an ex vivo B cell profiling approach used previously to study epitope-specific naive precursors targeting neutralizing sites on HIV-1 (18) and influenza virus surface glycoproteins (15). We designed a SARS-CoV-2 RBD construct that positions two glycans at residues 475 and 501 to selectively block binding to ACE2 and the receptor-binding motif (RBM)-directed antibody, B38 (fig. S1) (19). Using this "ΔRBM" probe, in addition to wildtype SARS-CoV-2 spike, and RBD probes, we isolated naive (CD19 + /IgD + /IgG − ) B cells specific to the RBD and, more finely, the RBM from the peripheral blood of 8 SARS-CoV-2 seronegative human donors ( Fig. 1A and fig. S1E). We defined RBM-specificity as B cells that bound to fluorescently labeled spike and RBD, but not the ΔRBM probe ( fig. S2A). Although rare, all 8 donors had detectable populations of RBM-specific naive B cells ( fig. S2B). The median frequency of RBM-specific B cells among total and naive B cells was 0.0025% and 0.0029%, respectively (Fig. 1B). Within spike-reactive naive cells, the median frequency of RBM-specific B cells was 3.6% (Fig. 1C). This suggests that a large proportion of spike epitopes targeted by naive responses reside outside of the RBD. While different flow cytometric probes were used, recent studies suggest that the frequency of RBD-specific B cells in convalescent and vaccinated individuals may increase approximately 10-fold upon exposure to antigen (11). Furthermore, the majority of IgD + RBM-specific B cells were CD27 − (mean frequency ~97%), in agreement with the naive B cell phenotype ( fig. S2C).
To understand in more detail the properties of this naive repertoire, we obtained 163 paired heavy-and light-chain antibody sequences from 5 of the 8 donors. Sequence analysis showed that all clones were unique with diverse gene usage for both heavy and light chains, and minimal gene pairing preferences (Fig. 1D). These data reflect the polyclonal gene usage observed in RBD-specific memory B cells sequenced from COVID-19 convalescent individuals (13,14) and vaccine recipients (11), suggesting that a diverse pool of antibody precursors can be activated upon antigen exposure. By comparing this naive repertoire to gene usage distribution from non-SARS-CoV-2-specific repertoires (20), we observed an increase in mean repertoire frequency of ~20% for IGHV3-9 in 4 out of 5 sequenced donors ( fig. S3A). Notably, this enrichment of IGHV3-9 was also observed in isolated memory B cells from convalescent individuals (21) and vaccine recipients (11), as well as in expanded IgG + B cells sequenced from a cohort of COVID-19 subjects during acute infection (22). These expanded clones, detected shortly after onset of symptoms, displayed low levels of SHM, suggesting potential IGHV3-9 usage in an early extrafollicular response in which naive B cells differentiate into short-lived plasma cells (23). Additionally, IGHV3-53 and 3-30 gene segments, over-represented in RBD-specific antibodies isolated from convalescent subjects (24), were recovered from three sequenced donors (13 total clones; ~8.0% of total). The amino acid length of heavy and light chain third complementarity-determining regions (CDR3) ranged from 8 to 27 (average length ~16) for HCDR3 and 4 to 13 (average length ~10) for LCDR3 (Fig. 1E). These lengths are normally distributed relative to both unselected human repertoires (20) and RBD-specific memory B cell repertoires (24). These data suggest that overall HCDR3 length does not restrict precursor frequency and there appears no inherent bias for CDR3 length conferring RBM-specificity. The majority of obtained sequences were at germline in both the variable heavy (V H ) and light (V L ) chains. However, despite sorting B cells with a naive phenotype, some sequences were recovered that deviated from germline. Specifically, the V H ranged from 91.4 to 100% identity to germline, with a median of 99.7%; the V L ranged from 93.6 to 100%, with a median of 99.3% (Fig. 1F, fig. 2B, C).

Naive antibodies engage SARS-CoV-2 RBD with high affinity
To obtain affinities of the isolated naive antibodies, we cloned and recombinantly expressed 38 IgG antibodies selected to reflect the polyclonal RBD-specific repertoire with representatives from diverse variable region gene segments and 32 unique heavy chain-light chain pairings (Table S1). Additionally, we ensured diversity in terms of HCDR3 length, kappa and lambda usage, as well as representation from all 5 donors. By ELISA, we identified IgGs with detectable binding to SARS-CoV-2 RBD; we summarize these results for all antibodies ( Fig. 2A) and parsed by donor ( fig. S3D). Across 5 donors, 33 (~87%) bound to monomeric SARS-CoV-2 RBD ( Fig. 2A) with EC 50 values ranging from 3.3 to 410 nM and a mean of 59 nM ( Fig. 2A and fig. S3E). Of the binding population, there was no apparent predisposition for HCDR3 length or light chain pairing (Fig. 2C, D). We further defined the epitopic region of these IgGs using the ΔRBM construct and the individual glycan variants, Δ501 and Δ475, both of which independently block ACE2 cell-surface binding but are on opposite sides of the RBM (fig. S1E, F). 9 IgGs had no detectable ΔRBM binding (e.g., ab079, ab119), while 20 IgGs had reduced ELISA binding relative to wild-type RBD, reflected in the reduced ΔRBM median EC 50 values ( fig. S3E). We also identified examples of antibodies sensitive to only Δ475 (e.g., ab185) and only Δ501 (e.g., ab007) ( Fig. 2A and fig. S3E).
To obtain binding kinetics independent of avidity effects from bivalent IgGs, 12 antibodies were selected for expression as antigen binding fragments (Fabs) to determine monovalent binding affinity (K D s) by biolayer interferometry (BLI). Using monomeric RBD as the analyte, 10 of the 12 Fabs had detectable binding with K D s ranging from ~6.5 to ~75 μM; the other two remaining Fabs (ab177, ab185), gave unreliable affinity measurements (i.e., >100 μM) ( fig. S4). Notably, all Fabs had characteristically fast off rates (k off ). This observation is consistent for germline B cells where fast off-rates are compensated by avidity due to overall BCR surface density (25); subsequent affinity gains via SHM often result in slowing of the off-rate and is a canonical mechanism of improved antigen binding (26).

Naive antibodies engage SARS-CoV-2 variants of concern
The emergence of SARS-CoV-2 variants with mutations in RBD has raised significant concern that antigenic evolution will impair recognition of RBD-directed antibodies elicited by prior infection and vaccination with an antigenically distinct SARS-CoV-2 variant (27). We therefore asked whether these naive antibodies, isolated using wild-type SARS-CoV-2 RBD, could recognize circulating viral variants, B.1.1.7 (Alpha; mutations N501Y), B.1.351 (Beta; mutations K417N/E484K/N501Y), and B.1.617.2 (Delta; L452R/T478K); the latter has become the most prevalent circulating variant in 2021. We find that ~89% of the antibodies with wild-type RBD affinity also bound to the B.1.1.7 variant with a comparable mean affinity of 68.7 nM (Fig. 2D, F). For B.1.351, a concerning variant formerly prevalent in South Africa, 50% of the wild-type SARS-CoV-2 RBD binding IgGs also bound to the B.1.351 variant, many of which displayed reduced ELISA binding relative to wild-type RBD with a mean affinity of 219 nM (Fig. 2E, F). Finally, ~58% of the antibodies with affinity for wild-type RBD also bound to the B.

Naive antibodies engage pre-emerging coronaviruses
We next tested the cross-reactivity of these naive antibodies to related sarbecovirus RBDs, which also use ACE2 as a host receptor (29). Our panel included the previously circulating SARS-CoV RBD and representative pre-emergent bat CoV RBDs from WIV1, RaTG13, and SHC014. These RBDs share ~73-90% paired-sequence identity with the highest degree of amino acid conservation in residues outside of the RBM. 13 antibodies cross-reacted with at least one additional RBD in our panel, with decreasing affinity for RBDs with more divergent amino acid sequence identity ( Fig. 2A, G). Notably, ab017, ab072, ab109, and ab114 had broad reactivity to all tested sarbecovirus RBDs, suggesting binding to highly conserved epitopes. Of these cross-reactive antibodies, ab017 and ab114, derive from the same IGHV3-33 and IGVL2-14 paring but were isolated from different donors, suggesting a shared or public clonotype.

Naive antibodies are not polyreactive and do not engage seasonal coronaviruses
Prior studies have shown that germline antibodies are more likely to display polyreactivity relative to affinity-matured antibodies with higher levels of SHM from mature B cell compartments (30). We therefore tested the polyreactivity of all 44 naive antibodies using three common autoantigens, double-stranded DNA (dsDNA), Escherichia coli lipopolysaccharide (LPS), and human insulin by ELISA ( Fig. 2A) We observed no polyreactivity across any of the naive antibodies, including those that are broadly reactive. Furthermore, none of the naive antibodies bound RBDs from the human seasonal betacoronaviruses (hCoVs), OC43 and HKU1 ( Fig. 2A), which share 22 and 19% pairedsequence identity to SARS-CoV-2 RBD, respectively. Together, these results suggest that the isolated naive B cells encode BCRs with specificity to sarbecoviruses.

In vitro reconstitution of naive B cell activation
Physiological interactions between a naive BCR and cognate antigen occurs at the B cell surface. Naive BCRs are displayed as a bivalent membrane-bound IgM and multivalent antigen binding can initiate intracellular signaling resulting in an activated B cell with the capacity to differentiate to antibody secreting plasma cells or memory cells (31). To determine whether the isolated RBD-specific naive BCRs have the capacity to be activated, we generated stable Ramos B cell lines expressing ab090 or ab072 as cell-surface BCRs and measured their activation by monitoring calcium flux in vitro (32). These antibodies were selected to represent divergent germline gene usage and specificities: 1) ab090 (IGHV1-2/ IGKV3-15) bound SARS-CoV-2 and variant B.1.1.7 RBDs, but not variant B.1.351 and WIV1 RBDs (Fig. 3A); and 2) ab072 (IGHV3-23/IGLV2-14) had broad reactivity to all RBDs (Fig. 3B). To assess BCR activation, we generated ferritin-based nanoparticles (NPs) for multivalent RBD display using SpyTag-SpyCatcher (33); these RBD-NPs included SARS-CoV-2, B.1.1.7, B.1.351 and WIV RBDs. We found that ab090 expressing Ramos B cells were only activated by SARS-CoV-2 RBD and variant B.1.1.7 RBD NPs (Fig. 3C), while ab072 Ramos B cells were activated by all RBD-NPs (Fig. 3D). These data parallel the observed recombinant binding specificity of each antibody. Importantly, neither ab090 nor ab072 Ramos B cell lines were activated by influenza hemagglutinin NPs, suggesting that this activation is sarbecovirus RBD-specific (Fig. 3C, D).

ab090 engages the SARS-CoV-2 RBM
To further characterize the epitope specificity of a representative naive antibody, we determined the structure of ab090 in complex with SARS-CoV-2 spike (S) by electron cryomicroscopy (cryo-EM). A ~6.7-Å structure showed one Fab bound to an RBD in the "up" conformation ( Fig. 4A, B and fig. S5). Based on this modest resolution structure, we make the following general descriptions of the antibody-antigen interface. The interaction between ab090 and the RBD is mediated primarily by the antibody heavy chain, with the germline encoded HCDR1, HCDR2, and the framework 3 DE-loop centered over the RBM epitope (Fig. 4B). The ab090 light chain is oriented distal to the RBD and does not appear to substantially contribute to the paratope (Fig. 4B). IGHV1-2 antibodies represent a prevalent antibody class in human responses to SARS-CoV-2 infection, many of which display high neutralization potency (34). ab090 shares a V H -centric mode of contact and angle of approach similar to members of this class of infection-elicited antibodies (Fig. 4D), despite varying HCDR3 lengths and diverse light chain pairings (Fig. 4D). Additionally, members of the IGHV1-2 antibody class contain relatively few SHMs ( fig. S5B). In conjunction with the structure, we biochemically defined the sensitivity of ab090 to variant B.1.351 by testing the binding to individual mutations. Binding affinity was detected to SARS-CoV-2 RBDs with either N501Y or K417N mutations, but not to E484K alone (Fig. 4C). Based on the structure, the E484K mutation, is grossly positioned proximal to the CDRH2 loop (Fig.  4C), which has a germline-encoded motif critical for IGHV1-2 antibody binding to RBD ( fig. S5B) (34). Indeed, infection-elicited IGHV1-2 antibodies are susceptible to escape by E484K alone, which disrupts a CDRH2 hydrogen binding network (35). Together, the cryo-EM structure and binding data suggest that ab090 represents a precursor of a class of RBM directed SARS-CoV-2 neutralizing antibodies.

In vitro affinity-matured naive antibodies retain intrinsic specificity
After initial antigen recognition and subsequent activation, naive B cells can undergo successive rounds of somatic hypermutation within the germinal center (GC) that ultimately result in higher affinity antibodies for the cognate antigen. To determine how somatic hypermutation might influence overall affinity and specificity, we used yeast surface display to in vitro mature ab072 and ab090. We randomly mutagenized the single chain variable fragment (scFv) variable heavy and light chain regions to generate ab072 and ab090 variant display libraries. After two rounds of selection using SARS-CoV-2 RBD, we enriched the ab072 and ab090 libraries for improved binding over their respective parental clones (Fig.  5A, D and fig. S6A). We also observed increased binding to B.1.351 for the ab072 library but not for ab090; notably this corresponded with the respective specificity of the parent clones (Fig. 5A, D).
We next isolated and sequenced individual clones from the enriched libraries. For ab090, we observed a dominant mutation, R72H, in the FRWH3 region present in 60% of sequenced clones ( fig. S7B). Notably, multiple mutations at position 72 conferred a ~3-to 5-fold improvement in monovalent affinity relative to parental ab090 for wild-type and B.1.1.7 RBDs, with no detectable B.1.351 binding for affinity matured progeny (Fig. 5B, C). In addition, ab090 progeny displayed a ~10-fold improvement in ELISA binding affinity (EC50) to B.1.617.2 RBD ( fig. S6E). We observed no mutations within the light chain which is consistent with the V H -centric binding mode in the cryo-EM structure (Fig. 4). For the broadly reactive ab072, isolated clones had mutations in both the V H and V L and ~35% of the sequenced clones had mutation S31P in the HCDR1 ( fig. S7B, C). There was 3-to ~5-fold improvement in monovalent affinity of ab072 progeny relative to parent for SARS-CoV-2, B.1.1.7 and B.1.351 RBDs (Fig. 5E, F) and a ~14-fold improvement in binding to B.1.617.2 RBD by ELISA ( fig. S6F). Collectively, these data identify potential mutations that can improve affinity while retaining initial parental antigen specificity.

SARS-CoV-2 pseudovirus neutralization by naive and affinity-matured Abs
We next used a SARS-CoV-2 pseudovirus assay (6) to ask whether any of the isolated naive antibodies and affinity matured clones were capable of blocking transduction of target cells. We found that of the 36 RBD-binding antibodies tested in this assay, 5 had detectable levels of neutralization (~14%) (Fig. 6A). These antibodies, obtained from multiple donors, have no commonality with respect to their gene usages and HCDR3 lengths (Fig. 6B). While these naive antibodies were not as potent as B38, isolated from a memory B cell, the observation that the naive repertoire has antibodies that neutralize is noteworthy, nevertheless.
To determine whether improved affinity correlated with enhanced neutralization potency, we evaluated the affinity matured progeny of ab090 in a SARS-CoV-2 pseudovirus neutralization assay (Fig. 6B). We find that all three ab090 progeny that had higher affinity for SARS-CoV-2 RBD also had increased neutralization potency. ab090_A08 bearing the R72H mutation had the highest affinity gain and was the most potent neutralizer with a K D of 1.7μM and an IC50 of 0.37 μg/ml, respectively. Notably, ab090 progeny had IC50 values similar to other IGHV1-2 memory B cells isolated from convalescent donors (34); this increase in potency is conferred through minimal somatic hypermutation.

DISCUSSION
The development of a protective humoral immune response upon infection or vaccination relies on the recruitment, activation, and maturation of antigen-specific naive B cells. However, the specificity of the naive B cell repertoire remains largely undefined. Here, we showed that coronavirus-specific naive B cells are present across distinct seronegative donors, are of unrestricted gene usage and when recombinantly expressed as IgGs, have affinity for SARS-CoV-2 RBD, circulating variants of concern, and at least four related coronaviruses. These data suggest that RBD-specific precursors are likely present across a large fraction of individual human naive repertories, consistent with longitudinal studies of SARS-CoV-2 infected individuals in which most convalescent individuals seroconverted with detectable RBD serum antibodies and neutralization titers (9,36). The naive B cells characterized here engage epitopes across the RBD with a range of angles of approach as defined by our glycan variant probes and cross-reactivity profiles; this is also consistent with infection and vaccine elicited, RBD-specific repertoire characterized by epitope-mapping, deep mutational scanning and structural analyses (37,38). Having naive BCRs recognizing distinct or partially overlapping epitopes across the RBD may be advantageous for eliciting a polyclonal response more able to recognize variants of concern.
The presence of broadly reactive naive B cells inherently capable of recognizing sarbecovirus RBDs and circulating variants suggests that these precursors could be vaccineamplified. Recent work showed that uninfected individuals have pre-existing SARS-CoV-2 S-reactive serum antibodies (39) and memory B cells (40) which cross-react with hCoVs and can be boosted upon SARS-CoV-2 infection. These cross-reactive antibodies appear to be specific to the S2 domain and are predominantly IgG or IgA. This observation contrasts the cross-reactive B cells described here that engage the RBD, have no reactivity to hCoVs and are IgG − naive B cells suggesting that they are distinct from previously described S-reactive pre-existing antibodies.
The competitive success of a naive B cell within a GC is influenced by precursor frequencies and antigen affinities (41). However, the biologically relevant affinities necessary for activation and GC entry remain unclear-indeed several studies suggest that B cell activation and affinity maturation is not restricted by immeasurably low affinity BCR interactions (42,43). Recent studies involving naive precursors of receptor-binding site (RBS) directed HIV-1 broadly neutralizing antibodies (bnAbs) contributed to our understanding of these parameters (16,17). Using an in vivo murine adoptive transfer model, these RBS-directed precursors were recruited into a GC reaction at a precursor frequency of ~1:10,000 and a monovalent antigen affinity of 14μM (17). For comparison, here we defined the SARS CoV-2 RBM-specific naive precursor frequency as 1:42,000 by flow cytometric gating ( fig. S2D) with monovalent affinities ranging from 6.5 to >100μM. These data suggest that these isolated naive B cells, especially those with demonstrable monomeric affinity, could be readily elicited upon antigen exposure. However, longitudinal studies tracking antigen specific naive B cells pre-and post-exposure are required to determine the fate (i.e., plasma cell, memory, or germinal center B cell compartments) of potential precursors and define relevant naive affinities for elicitation by SARS-CoV-2.
Precursor frequencies have recently been interrogated for human naive B cells targeting epitopes of broadly neutralizing HIV-1 and influenza antibodies, defined as 1:300,000 and 1:10,000, respectively (15,44). While the frequency of SARS-CoV-2 RBM-specific naive precursors defined here (1:42,000) is within the range defined in these studies, we note that in each case precursor frequency will be unique for the antigen probe and will depend on the valency, affinity, and presentation of the target epitope. Further, frequency alone does not account for SHM needed to attain neutralization potency.
Through biochemical and structural analyses, we characterized a naive antibody, ab090, which resembles a commonly elicited class of potent neutralizing antibodies utilizing the IGHV1-2 gene (34). This class of antibodies share restricted binding specificity for wild-type SARS-CoV RBD (the vaccine strain) and the prevalent B.1.1.7 variant. This recombinant binding pattern also paralleled the reconstituted in vitro B cell activation dynamics of ab090 in the highly avid assay with the capacity to detect immeasurably low affinity interactions (25). In vitro affinity maturation of ab090 against corresponded to a single H-FR3 mutation, which improved monovalent affinity ~5-fold to wild-type SARS-CoV-2 and B.1.1.7 RBDs relative to parent and pseudovirus neutralization to IC50 values less than 1μg/ml. This observation is consistent with the low levels of SHM within IGHV1-2 neutralizing antibodies (34), with reports of other potent RBD-directed neutralizing antibodies with a limited level of somatic hypermutation (13,14,22), and reports of neutralizing antibodies isolated in vitro from naive human antibody libraries (45). While in vitro affinity gains and neutralization potency are generally correlated (46), we note that affinity does not necessarily correlate to neutralization potency for all SARS-CoV-2 RBD targeting antibodies, where fine epitope specificity appears to be most relevant (47).
Probing and characterizing the human naive B cell antigen-specific repertoire can identify precursors for vaccine or infection-specific naive B cells and expand our understanding of basic B cell biology. Germline-endowed specificity for neutralizing antibody targets on the RBD may also contribute to the strong clinical efficacy observed for the current SARS-CoV-2 vaccines (48,49). Furthermore, understanding the naive B cell repertoire to potential pandemic coronaviruses may reveal commonalties in antigen-specific precursors, enabling the development of pan-coronavirus vaccines aimed at engaging broadly protective germline responses.

Study Design
The aim of this study was to perform an in-depth analysis of the human naive B cell repertoire specific for a major target of SARS-CoV-2 neutralizing antibodies, the receptor binding domain (RBD) of SARS-CoV-2. We isolated naive B cells from the peripheral blood of 8 SARS-CoV-2 seronegative adults, sequenced the BCRs, and recombinantly expressed a diverse panel of IgGs from 5 of these donors for biochemical characterization. Sample sizes for each experiment are indicated in the figure legends.

Donor Samples
PBMCs for single cell sorting were isolated from 8 blood donors from the MGH blood donor center. Prior to donation, subjects signed a donor consent statement, stating "I give permission for my blood to be used for transfusion to patients or for research". The gender as well as the age/developmental stage of the patients is not recorded by the MGH blood donor center, however eligible donors must be a minimum of 16 years of age and weigh a minimum of 110lbs. All experiments were conducted with MGH Institutional Biosafety Committee approval (MGH protocol 2014B000035). Control human convalescent sera was obtained under the approved Partners Institutional Review Board (protocol 2020P000895) (6).

Expression and purification of recombinant CoV Antigens
Plasmids

Expression and purification IgGs and Fabs
IgG and Fab genes for the heavy-and light-chain variable domains were synthesized, and codon optimized by IDT and subcloned into pVRC8400 protein expression vectors (50) and sequence confirmed (Genewiz). Fabs and IgGs were similarly expressed and purified as described above for RBDs. IgGs were buffer exchanged into PBS while Fabs were further purified by size exclusion chromatography (50).

ELISA
Sera and monoclonal antibody reactivity to CoV antigens were assayed by ELISA. Briefly, 96-well plates (Corning) were coated with 5 μg/ml of monomeric RBDs in PBS at 100μl/ ELISAs for polyreactivity against human insulin (MilliporeSigma) and double-stranded DNA (dsDNA) (Calf Thymus DNA; Invitrogen), plates were coated with 2μg/ml and 50μg/ml, respectively, in PBS and incubated overnight at 4°C. Plates were blocked and incubated with IgGs as described above. Lipopolysaccharide (LPS) ELISAs were measured as described (51). Briefly, plates were coated with 30μg/ml LPS (Escherichia coli O55:B5; MilliporeSigma) in carbonate buffer for 3hrs at 37°C, washed three times with water, and air-dried overnight at RT. Coated plates were blocked with HS buffer (50mM HEPES, 0.15mM NaCl, pH 7.4) plus 10mg/ml BSA. Plates were incubated with IgGs diluted in HS buffer containing 1mg/ml BSA for 3hrs at 37°C, washed three times with HS buffer, and developed as described above. ELISAs were performed at a single IgG concentration (15μg/ml) in replicate with positive binding was defined by an OD 405 ≥ 0.30.

ACE2 cell binding assay
ACE2 expressing 293T cells were a gift from Michael Farzan (Scripps Florida) and Nir Hacohen (Broad Institute) (6) and were incubated with 200 nM of RBD antigen in PBS for 1hr on ice. Cells were resuspended in 50μL of secondary stain containing streptavidin-PE (Invitrogen) at a 1:200 dilution and incubated for 30 min on ice. Cell binding was analyzed by flow cytometry using a Stratedigm S1300Exi Flow Cytometer equipped with a 96 well plate high throughput sampler. Resulting data were analyzed using FlowJo (10.7.1).

Probe Generation
SARS-CoV-2 RBD and ΔRBM constructs were expressed as dimeric murine-Fc (mFc; IgG1) fusion proteins containing a HRV 3C-cleavable C-terminal 8xHis and SBP tags and purified as described above. SBP-tagged RBD-and ΔRBM-mFc dimers were mixed with fluorescently labeled streptavidin, SA-BV650 and SA-BV786 (BioLegend), to form RBD-mFc-BV650 and ΔRBM-mFc-BV786 tetramers. SARS-CoV-2 spike with a C-terminal Strep II tag was labeled separately with StrepTactin PE and APC (IBA) to form spike-PE and -APC tetramers, respectively. Both labeling steps were performed for 30 min at 4 °C prior to sorting.

BCR Sequencing
BCR Sequencing was carried out as described previously (15). Briefly, whole transcriptome amplification (WTA) was performed on the sorted cell-lysates according to the Smart-Seq2 protocol (52). Heavy and light chain sequences were amplified utilizing partially degenerate pools of V region specific primers (Qiagen HotStar Taq Plus). Heavy and light chain amplifications were carried out separately. Cellular barcodes and index adapters (based on Nextera XT Index Adapters, Illumina Inc.) were added using a step-out PCR method. Amplicons were pooled and sequenced using a 250x250 paired end 8x8 index reads on an Illumina Miseq System. Data were demultiplexed, heavy and light chain reads were paired, and overlapping sequence reads were obtained (Panda-Seq) (53) and aligned against the human IMGT database.

Interferometry binding experiments
Interferometry experiments were performed using a BLItz instrument (ForteBio). Fabs (0.1 mg/ml) were immobilized on Ni-NTA biosensors. The SARS-CoV-2 RBD analyte was titrated (10μM, 5μM, 2.5μM, and 1μM) to acquire binding affinities; the K D was obtained through global fit of the titration curves by applying a 1:1 binding isotherm using vendorsupplied software.

Pseudotyped neutralization assay
SARS-CoV-2 neutralization was assessed using lentiviral particles pseudotyped as previously described (6). Briefly, lentiviral particles were produced via transient transfection of 293T cells. The titers of viral supernatants were determined via flow cytometry on 293T-ACE2 cells and via the HIV-1 p24 CA antigen capture assay (Leidos Biomedical Research, Inc.). Assays were performed in 384-well plates (Grenier) using a Fluent Automated Workstation (Tecan). IgGs starting at 150 μg/ml, were serially diluted (3-fold) in 20μL followed by addition of 20 μL of pseudovirus containing 250 infectious units and incubated at room temperature for 1 hr. Finally, 10,000 293T-ACE2 cells in 20 μL cell media containing 15 μg/ml polybrene were added to each well and incubated at 37 °C for 60-72 hrs. Following transduction, cells were lysed and shaken for 5 min prior to quantitation of luciferase expression using a Spectramax L luminometer (Molecular Devices). Percent neutralization was determined by subtracting background luminescence measured from cells control wells (cells only) from sample wells and dividing by virus control wells (virus and cells only). Data were analyzed using Graphpad Prism.

Cryo-EM sample preparation, data collection and processing
SARS-CoV-2 spike HexaPro was incubated with ab090 Fab at 0.6 mg/mL at a molar ratio of 1.5:1 Fab:Spike for 20 minutes at 4°C and two 3 μl aliquots were applied to UltrAuFoil gold R0.6/1 grids and subsequently blotted for 3 seconds at blot force 3 twice, then plunge-frozen in liquid ethane using an FEI Vitrobot Mark IV. Grids were imaged on a Titan Krios microscope operated at 300 kV and equipped with a Gatan K3 Summit direct detector. 10,690 movies were collected in counting mode at 16e − /pix/s at a magnification of 81,000, corresponding to a calibrated pixel size of 1.058 Å. Defocus values were at around −2.00 μm. Micrographs were aligned and dose weighted using Relion's (54) implementation of MotionCorr2 (55). Contrast transfer function estimation was done in GCTF (56). Particles were picked with crYOLO (57) with a model trained with 12 manually picked micrographs with particle diameter value of 330Å. Initial processing was performed in Relion. Particles were binned to ~ 12Å/pixel and 2D classified. Selected particles were extracted to ~6Å/pixel then subjected to a second round of 2D classification. An initial model was generated at ~6Å/pixel and used as a reference for two rounds of 3D classification; first to select particles containing SARS-CoV-2 spike then to select particles containing both spike and ab090. Selected particles were unbinned then aligned using 3D auto-refine and subjected to a third round of 3D classification to select for a single class with SARS-CoV-2 spike bound with one ab090 Fab. Selected particles were aligned using 3D auto-refine before undergoing CTF refinement and Bayesian polishing. Polished particles were then simultaneously focusaligned relative to the RBD and ab090 region ( Figure S5A) to aid in model building of this region of interest and imported to cryoSPARC (58). Imported particles were aligned using non-uniform refinement and local resolution estimation ( Figure S5B). Non-uniform refined maps were then sharpened with DeepEMhancer then used to dock a previously built SARS-CoV-2-spike model (PDB ID 7LQW).

Cryo-EM model building
Backbone models were built by docking variable regions from PDB ID 2D2P and 6FG1 for heavy and light chains, respectively and a RBD (PDB 6M0J) into the focus refined maps using UCSF Chimera (59) variable regions were corrected and manually built using COOT (60). For the remainder of the spike, a previously published model (PDB ID 6VXX) was docked into the full, sharpened map in UCSF Chimera.

RBD nanoparticle production and conjugation
Monomeric SARS-CoV-2 wild-type, B.1.1.7, B.1.351, and WIV1 RBDs were recombinantly produced and purified as described above with an 8xHis and SpyTag at the C-terminus. Helicobacter pylori ferritin nanoparticles (NP) were expressed separately with N-terminal 8xHis and SpyCatcher tags. SpyTag-SpyCatcher conjugations were performed overnight at 4°C with a 4-fold molar excess of SpyTag-RBD relative to SpyCatcher-NP. The conjugated RBD-NPs were repurified by size-exclusion chromatography to remove excess RBD-SpyTag.

In vitro BCR stimulation
Briefly, BCRs for ab090 and ab072 were stably expressed in an IgM negative Ramos B cell clone by lentiviral transduction (32

In vitro affinity maturation of ab090 and ab072
Mutagenized yeast display libraries for ab090 and ab072 scFvs, by error-prone PCR using GeneMorph II Random Mutagenesis Kit (Agilent Technologies). Mutagenized scFv DNA products were combined with the linearized yeast display vector pCHA (61) and electroporated into EBY100 grown to mid-log phase in YPD media, where the full plasmid was reassembled by homologous recombination. The final library size was estimated to be ~4 x 10 7 . The scFv libraries were passaged in selective SDCAA media at 30°C and induced with galactose at 20°C (61). The scFv libraries were induced 10-fold their respective diversities and subjected to three rounds of selection with SBP-tagged SARS-CoV-2 RBD-Fc. Induced yeast libraries were stained for antigen binding (RBD-Fc APC tetramers) and scFv expression (chicken anti-c-myc IgY; Invitrogen, cat. no A-21281). Following two washes in PBS with 0.1% w/v BSA, yeast was stained with donkey anti-chicken IgY AF488 (Jackson ImmunoResearch, cat. no 703-545-155). Two gates were drawn for cells with improved RBD binding over parental clones, a more stringent "edge" gate represented ~1% and a "diversity" gate represented ~3-5% of the improved output. Clones from the final round of selection were isolated and Sanger sequenced for recombinant IgGs and Fabs expression.

Supplementary Material
Refer to Web version on PubMed Central for supplementary material.     (A) SARS-CoV-2 pseudovirus neutralization assay for 32 purified IgGs. Curves in color highlighted antibodies with neutralizing activity with donor and monovalent wild-type RBD affinity listed for this subset of antibodies. The neutralizing monoclonal antibody, B38, was used as a positive control. Dashed lines indicate IC 50 values and data represent means ± SD of two technical replicates. (B) SARS-CoV-2 pseudovirus neutralization for select affinity matured progeny from the ab090 lineage with respective mutations relative to ab090 parent sequence, monovalent wild-type RBD affinity, and IC50 listed.