Structure-based design of prefusion-stabilized SARS-CoV-2 spikes

The COVID-19 pandemic has led to accelerated efforts to develop therapeutics and vaccines. A key target of these efforts is the spike (S) protein, which is metastable and difficult to produce recombinantly. Here, we characterized 100 structure-guided spike designs and identified 26 individual substitutions that increased protein yields and stability. Testing combinations of beneficial substitutions resulted in the identification of HexaPro, a variant with six beneficial proline substitutions exhibiting ~10-fold higher expression than its parental construct and the ability to withstand heat stress, storage at room temperature, and three freeze-thaw cycles. A 3.2 Å-resolution cryo-EM structure of HexaPro confirmed that it retains the prefusion spike conformation. High-yield production of a stabilized prefusion spike protein will accelerate the development of vaccines and serological diagnostics for SARS-CoV-2.

postfusion transition, salt bridges to neutralize charge imbalances, hydrophobic residues to fill internal cavities, and prolines to cap helices or stabilize loops in the prefusion state. We cloned 100 single-substitution S-2P variants and characterized their relative expression levels (table S1), and for those that expressed well we characterized their monodispersity, thermostability, and quaternary structure. Given that the S2 subunit undergoes large-scale refolding during the pre- to postfusion transition, we exclusively focused our efforts on stabilizing S2. Substitutions of each category were identified that increased expression while maintaining the prefusion conformation (Figs. 1 and 2A). Overall, 26 out of the 100 single-substitution variants had higher expression than S-2P (table S1).
One common strategy to stabilize class I fusion proteins is to covalently link a region that undergoes a conformational change to a region that does not via a disulfide bond. For instance, the Q965C/S1003C substitution aims to link HR1 to the central helix, whereas G799C/A924C aims to link HR1 to the upstream helix. These two variants boosted protein expression 3.8-fold and 1.3-fold compared to S-2P, respectively (Fig. 2B). However, the size-exclusion chromatography (SEC) traces of both variants showed a leftward shift compared to S-2P, indicating that the proteins were running larger than expected, which agreed well with negative-stain electron microscopy (nsEM) results that showed partially misfolded spike particles (fig. S1). Although introduction of disulfide bonds has been successful in the case of HIV-1 Env (SOSIP) and RSV F (DS-Cav1) (12,20), it was generally detrimental for SARS-CoV-2 S, with a few exceptions. The S884C/A893C and T791C/A879C variants eluted on SEC at a volume similar to S-2P and were well-folded trimeric particles by nsEM (Fig. 2E). These variants link the same α-helix to two different flexible loops that pack against a neighboring protomer (Fig. 1). Notably, S884C/A893C had 2-fold higher expression than S-2P with slightly increased thermostability (Fig. 2, F and G).
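The variant labels used here (for example Q965C/S1003C) follow the usual convention of wild-type residue, position, and replacement residue, with a slash joining paired substitutions. As a minimal sketch of how such labels could be handled when tabulating a screen like this one, the Python below parses a label into its parts. The names `Substitution`, `parse_variant`, and `is_cysteine_pair` are hypothetical and are not taken from the authors' methods.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    """One amino-acid substitution, e.g. Q965C (hypothetical helper type)."""
    wild_type: str   # one-letter code of the original residue
    position: int    # residue number in the spike sequence
    mutant: str      # one-letter code of the replacement residue


# One uppercase letter, one or more digits, one uppercase letter.
_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(label: str) -> List[Substitution]:
    """Split a label such as 'Q965C/S1003C' into its substitutions."""
    substitutions = []
    for token in label.split("/"):
        match = _PATTERN.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        substitutions.append(
            Substitution(match.group(1), int(match.group(2)), match.group(3))
        )
    return substitutions


def is_cysteine_pair(substitutions: List[Substitution]) -> bool:
    """True when the label introduces exactly two cysteines.

    This checks only the notation; it says nothing about whether the two
    residues are positioned to form a disulfide bond.
    """
    return len(substitutions) == 2 and all(s.mutant == "C" for s in substitutions)


if __name__ == "__main__":
    for label in ("Q965C/S1003C", "S884C/A893C", "L938F"):
        parsed = parse_variant(label)
        print(label, parsed, is_cysteine_pair(parsed))
```

A cysteine pair identified this way is only a candidate disulfide; as the SEC and nsEM results above show, whether a given pair yields a well-folded trimer has to be determined experimentally.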
Introducing a salt bridge at the HIV-1 gp120-gp41 interface has been previously shown to boost expression and enhance the binding of trimer-specific antibodies (21). Based on a similar principle, the T961D and G769E substitutions were introduced to form inter-protomeric electrostatic interactions with Arg765 and Arg1014, respectively (Fig. 1). Both variants increased expression and resembled well-folded trimeric spikes (Fig. 2, C and E, fig. S2, and table S1). In addition to salt bridges, filling loosely packed hydrophobic cores that allow the protein to refold can help stabilize the prefusion state, as shown by previous cavity-filling substitutions in RSV F and HIV-1 Env (12,20,22). Here, the L938F substitution was designed to fill a cavity formed in part by HR1, the fusion peptide, and a β-hairpin (Fig. 1). This substitution resulted in a 2-fold increase in expression (Fig. 2C) that was additive in combination with disulfide or proline substitutions (table S2).
Previous successes using proline substitutions inspired us to investigate 14 individual variants wherein a proline was substituted into flexible loops or the N-termini of helices in the fusion peptide, HR1, and the region connecting them (CR) (Fig. 2, D and G, and table S1). As expected, multiple proline variants boosted protein expression and increased thermostability (Fig. 2, D, F, and G). Two of the most successful substitutions, F817P and A942P, exhibited 2.8- and 6.0-fold increases in protein yield relative to S-2P, respectively. The A942P substitution further increased the melting temperature (Tm) by ~3°C, and both variants appeared as well-folded trimers by nsEM (Fig. 2E and fig. S2). This result is reminiscent of previous successful applications of proline substitutions to class I fusion proteins including HIV-1 Env, influenza HA, RSV F, hMPV F, MERS-CoV S, Lassa GPC, and Ebola GP (11,12,22-26).
We next generated combination ("Combo") variants that combined the best-performing substitutions from our initial screen. The Combo variants containing two disulfide bonds generally expressed 2-fold lower than the single-disulfide variants, suggesting that they interfered with each other (table S2). Adding one disulfide (S884C/A893C) to a single proline variant (F817P) also reduced the expression level, although the quaternary structure of the spikes was well maintained (table S2, Combo40). The beneficial effect of a disulfide bond was most prominent when combined with L938F, a cavity-filling variant. Combo23 (S884C/A893C, L938F) had higher protein yields than either of its parental variants, but the Tm of Combo23 did not increase compared to S884C/A893C (fig. S3). In addition, mixing one cavity-filling substitution with one proline substitution (Combo20) increased the expression compared to L938F alone (table S2).
Combining multiple proline substitutions resulted in the most substantial increases in expression and stability (Fig. 3A). Combo14, containing A892P and A942P, had a 6.2-fold increase in protein yield compared to A892P alone (Fig. 3B). Adding a third proline, A899P (Combo45), increased thermostability (+1.2°C Tm) but did not further increase expression (Fig. 3C). Combo46 (A892P, A899P, F817P) had a 3.4-fold increase in protein yield and a 3.3°C rise in Tm as compared to A892P. The most promising variant, Combo47, renamed HexaPro, contains all four beneficial proline substitutions (F817P, A892P, A899P, A942P) as well as the two proline substitutions in S-2P. HexaPro expressed 9.8-fold higher than S-2P, had a ~5°C increase in Tm, and retained the trimeric prefusion conformation (Fig. 3D). We focused on this construct for additional characterization.
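The relationship between the proline Combo variants can be made explicit by treating each one as a set of substitutions on the S-2P background. The Python sketch below records only the compositions stated in the paragraph above; the names `COMBOS`, `S2P_PROLINE_COUNT`, and `total_prolines` are hypothetical, and the two S-2P prolines are represented by a count because they are not named in this section.

```python
# Engineered prolines already present in the S-2P parental construct.
# Only the count is given in the text above, so only the count is used here.
S2P_PROLINE_COUNT = 2

# Substitutions added on top of S-2P, as listed in the text.
COMBOS = {
    "Combo14": {"A892P", "A942P"},
    "Combo45": {"A892P", "A942P", "A899P"},
    "Combo46": {"A892P", "A899P", "F817P"},
    "Combo47": {"F817P", "A892P", "A899P", "A942P"},  # renamed HexaPro
}


def total_prolines(name: str) -> int:
    """Count engineered prolines in a Combo variant, including those of S-2P."""
    added = sum(1 for sub in COMBOS[name] if sub.endswith("P"))
    return S2P_PROLINE_COUNT + added


if __name__ == "__main__":
    for name in COMBOS:
        print(name, sorted(COMBOS[name]), total_prolines(name))
    # Combo47 carries every substitution found in Combo45 or Combo46.
    print(COMBOS["Combo47"] == COMBOS["Combo45"] | COMBOS["Combo46"])
```

Viewed this way, HexaPro is the union of the two three-proline combinations, which accounts for the six prolines in its name.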
To assess the viability of HexaPro as a potential vaccine  S4, G and H). In contrast, S-2P showed signs of aggregation after 3 cycles of freeze-thaw and began unfolding after 30 min at 50°C. Importantly, HexaPro reacted to human convalescent sera and RBD-specific mAb (CR3022) (27) similarly to S-2P, suggesting the antigenicity of HexaPro is well-preserved (Fig. 3E). Collectively, these data indicate that HexaPro is a promising candidate for SARS-CoV-2 vaccine and diagnostic development.
To confirm that the stabilizing substitutions did not lead to any unintended conformational changes, we determined the cryo-EM structure of SARS-CoV-2 S HexaPro. From a single dataset, we were able to obtain high-resolution 3D reconstructions for two distinct conformations of S: one with a single RBD in the up conformation and the other with two RBDs in the up conformation. This two-RBD-up conformation was not observed during previous structural characterization of SARS-CoV-2 S-2P (18,19). While it is tempting to speculate that the enhanced stability of S2 in HexaPro allowed us to observe this less stable intermediate, validating this hypothesis will require further investigation. Roughly a third (30.6%) of the particles were in the two-RBD-up conformation, leading to a 3.20 Å reconstruction. The remaining particles were captured in the one-RBD-up conformation, although some flexibility in the position of the receptor-accessible RBD prompted us to remove a subset of one-RBD-up particles that lacked clear density for this domain, resulting in a final set of 85,675 particles that led to a 3.21 Å reconstruction (Fig. 4A and figs. S5 and S6). Comparison of our one-RBD-up HexaPro structure with the previously determined 3.46 Å S-2P structure revealed an RMSD of 1.2 Å over 436 Cα atoms in S2 (Fig. 4B). The relatively high resolution of this reconstruction allowed us to confirm that the stabilizing proline substitutions did not distort the S2 subunit conformation (Fig. 4C).
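For readers less familiar with the RMSD figure quoted above, the quantity is the square root of the mean squared distance between matched atoms, here Cα atoms of S2 in the two structures. The Python sketch below computes it for two equal-length coordinate lists. It assumes the structures have already been superposed and the atoms paired one-to-one, which is a simplification: the superposition step is omitted, and the function name `rmsd` and the toy coordinates are illustrative rather than drawn from the authors' analysis.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two matched coordinate sets.

    Assumes coords_a[i] and coords_b[i] are the same atom in two
    structures that have already been superposed.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same length")
    if not coords_a:
        raise ValueError("coordinate sets must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy example: every atom is displaced by 1 Å along z.
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
    shifted = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
    print(rmsd(reference, shifted))
```

Because every pair contributes equally, a small RMSD over several hundred Cα atoms indicates that the backbones agree closely on average, although individual loops can still deviate by more than the mean.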
The high yield and enhanced stability of HexaPro should enable industrial production of subunit vaccines and could also improve DNA or mRNA-based vaccines by producing more antigen per nucleic acid molecule, thus improving efficacy at the same dose or maintaining efficacy at lower doses. It is our hope that this work will accelerate the production of prefusion spikes to mitigate the public health emergency and has broad implications for next-generation coronavirus vaccine design.