Structural analysis of full-length SARS-CoV-2 spike protein from an advanced vaccine candidate

Vaccine efforts against the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) responsible for the current COVID-19 pandemic are focused on SARS-CoV-2 spike glycoprotein, the primary target for neutralizing antibodies. Here, we performed cryo-EM and site-specific glycan analysis of one of the leading subunit vaccine candidates from Novavax based on a full-length spike protein formulated in polysorbate 80 (PS 80) detergent. Our studies reveal a stable prefusion conformation of the spike immunogen with slight differences in the S1 subunit compared to published spike ectodomain structures. Interestingly, we also observed novel interactions between the spike trimers allowing formation of higher order spike complexes. This study confirms the structural integrity of the full-length spike protein immunogen and provides a basis for interpreting immune responses to this multivalent nanoparticle immunogen.

Severe acute respiratory syndrome coronavirus (SARS-CoV) caused a global outbreak in 2002-2003, leading to severe pneumonia and killing almost 900 people (1). SARS-CoV-2 belongs to the same lineage of the β-CoV genus as SARS-CoV and recently emerged in China, spreading rapidly and infecting more than 18 million people worldwide, with cases continuing to rise each day (2). Given the global increase in population density, urbanization, and mobility, and the uncertain future behavior of the virus, vaccination is a critical tool for the response to this pandemic. The SARS-CoV-2 spike (S) trimeric glycoprotein is a focus of coronavirus vaccine development since it is a major component of the virus envelope, essential for receptor binding and virus entry, and a major target of host immune defense (3,4). Several efforts are currently under way to make spike-based vaccines using different strategies (4)(5)(6).
The CoV S protein is synthesized as an inactive precursor (S0) that is proteolytically cleaved into S1 and S2 subunits, which remain non-covalently linked to form functional prefusion trimers (7). Like other type 1 fusion proteins, the SARS-CoV-2 S prefusion trimer is metastable and undergoes large-scale structural rearrangement from a prefusion to a thermostable post-fusion conformation upon S-protein receptor binding and cleavage (8,9). Rearrangement exposes the hydrophobic fusion peptide (FP), allowing insertion into the host cell membrane and facilitating virus/host cell membrane alignment, fusion, and virus entry. Notably, SARS-CoV-2 S has a 4 amino acid insertion (PRRA) in the S1/S2 cleavage site compared to SARS-CoV spike, resulting in a polybasic RRAR furin-like cleavage motif that enhances infection of lung cells (10,11). While the S2 subunit is relatively more conserved across the β-CoV genus, the S1 subunit comprising the receptor binding domain (RBD) is immunodominant and much less conserved (12). The FP, two heptad repeats (HR1 and HR2), transmembrane (TM) domain, and cytoplasmic tail (CT) are located in the S2 subunit that encompasses the fusion machinery. The S1 subunit of SARS-CoV-2 S folds into 4 distinct domains: the N-terminal domain (NTD), the C-terminal domain (CTD) containing the RBD, and two subdomains, SD1 and SD2. While some human CoVs (HCoV), including OC43, exclusively use NTD-sialic acid interactions for receptor engagement, others like Middle East Respiratory Syndrome (MERS) CoV that use the CTD-RBD for primary receptor binding have also been reported to bind sialic acid receptors via their NTD to aid initial attachment to the host cells (13)(14)(15). SARS-CoV-2 primarily interacts with its receptor ACE2 through the CTD-RBD; there is currently no evidence indicating interactions between its NTD and sialoglycans (16,17).
The structure of the stabilized SARS-CoV-2 spike ectodomain has been solved in its prefusion conformation and exhibits a high resemblance to SARS-CoV spike (17)(18)(19).
In this report, we describe the atomic structure of a leading SARS-CoV-2 S vaccine candidate based on a full-length S gene with furin cleavage-resistant mutations in the S1/S2 cleavage site and the presence or absence of 2-proline amino acid substitutions at the apex of the central helix. Our studies reveal an overall shift in conformation of the S1 subunit compared to the previously published structures (17)(18)(19). Interestingly, we also observed direct interactions between adjacent spike trimers: the flexible loop between residues 615-635 in the SD2 from each trimer extends and engages a binding pocket on the NTD of the adjacent trimer, resulting in higher order spike multimers. Further, site-specific glycan analysis revealed the glycan occupancy as well as varying levels of glycan processing at the 22 N-glycosylation sequons present in the spike monomer. Thus, our studies provide in-depth structural analysis of the Novavax full-length vaccine candidate, currently being tested in humans, that appropriately recapitulates the prefusion spike.

Design and validation of SARS-CoV-2-3Q-2P full-length spike
The SARS-CoV-2-3Q-2P full-length spike vaccine candidate (3Q-2P-FL) was engineered from the full-length SARS-CoV-2 spike gene (residues 1-1273) including the transmembrane domain (TM) and the cytoplasmic tail (CT) (Figure 1A). The construct was modified at the S1/S2 polybasic cleavage site from RRAR to QQAQ to render it protease resistant along with 2 proline substitutions at residues K986 and V987 in the S2 fusion machinery core for enhanced stability (Figure 1A).
bioRxiv preprint doi: https://doi.org/10.1101/2020.08.06.234674; this version posted August 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. It is made available under a CC-BY 4.0 International license.

Cryo-EM of SARS-CoV-2-3Q-2P spike
To further evaluate the structural features of the 3Q-2P-FL immunogen, we performed single particle cryo-EM on the spike formulated in PS 80 detergent. The raw micrographs from cryo-EM revealed free trimers and trimer rosettes similar to those observed on the negative stain micrographs (Figure 2A). Initial 2D classifications revealed the presence of 2 distinct classes: free spike trimers and dimers of trimers ( Figure 2A).
Each class was independently subjected to additional classification and refinement.
The three-fold symmetry (C3) reconstruction of the free spike trimer resulted in a map of 3.6 Å resolution while the asymmetric reconstruction (C1) was resolved to 3.8 Å resolution (Figure 2B and S1A, S1B). Previously published structures of soluble, stabilized SARS-CoV-2 spikes have revealed that RBDs exist in either a closed (RBD-down) or an open (RBD-up) conformation that can engage in ACE2 binding (16)(17)(18). In contrast, we observed that all three RBDs on the 3Q-2P-FL spike trimer were present in the closed conformation in the asymmetric reconstruction; the higher resolution C3 map was consequently used for model building (Figure 2B and S1C). Overall, the map was well resolved in both S1 and S2 subunits, particularly in the S1 NTD and CTD domains that were less resolved in previously published structures, thereby enabling us to model the full extent of these domains. Notably, the local resolution map calculated using cryoSPARC showed much of the spike trimer at substantially higher resolution than 3.6 Å (Figure S1D). The atomic model contains residues 14-1146 with breaks only in the flexible loop (619-631) and the cleavage site (678-688) (Figure 2C). Interestingly, superimposition of the coordinate models of 3Q-2P-FL spike with published spike structures (PDB ID: 6VXX and 6VSB) revealed substantial domain rearrangements in the S1 subunit of 3Q-2P-FL spike compared to the other models, whereas the structure of the S2 subunit was consistent with the published data (Figure 2D). The S1 NTD differed the most (~14° rotation counterclockwise relative to published models when viewing down towards the viral membrane) while the CTD and subdomains showed minor local rearrangements (Figure 2D). Notably, we also observed shifts in the placement of residues flanking the 615-635 loop compared to the published models.
This region was modeled in one of the published structures (PDB Id: 6X6P) as a helix with residues flanking the helix positioned very differently from our model as shown by the corresponding placement of residues T632 and T618 (residues flanking the gap in 3Q-2P-FL model) ( Figure 3A). However, upon closer inspection of the cryo-EM density (EMD-22078) corresponding to residues 621-640 of the PDB model 6X6P, there is insufficient density to support the helix conformation of this region ( Figure S1E). The resulting displacement of residues in the 3Q-2P-FL structure enables inter-protomeric interactions by creating a salt-bridge between residues Asp 614 and Lys 854 ( Figure 3B). This observation is particularly interesting given the increased prevalence of D614G mutation in the emerging SARS-CoV-2 strains and its potential role in viral transmission and pathogenesis (20).
During refinement of an atomic model into the EM density, we observed 2 additional densities in the S1 subunit that did not correspond to any peptide or glycans within the spike ( Figure S2A). The first density was buried within a hydrophobic pocket of the CTD created by F338, F342, Y365, Y369, F374, F377, F392, F513 ( Figure 3C and S2B). We had previously observed a non-protein density situated in the structure of porcine epidemic diarrhea virus (PEDV) that was identified to be palmitoleic acid (21).
The density in this pocket of the SARS-CoV-2 CTD corresponded with the structure of linoleic acid, a polyunsaturated fatty acid; the presence of this ligand was confirmed by mass spectrometry of 3Q-2P-FL spike (Figure S2B and S2C). The main chain carboxyl group of linoleic acid interacts with the R408 and Q409 residues of the RBD from the adjacent protomer, thereby making inter-protomer contacts (Figure 3C). The second unassigned density, present in the NTD, was larger and more surface exposed than the first density, and was surrounded by residues N121, Y170, S172, F175, R190, H207, V227 (Figure 3D and S2D). Analysis of the structural features of this density suggested that it may correspond to PS 80 detergent used to solubilize the membrane-bound trimers and stabilize them in solution. The aliphatic tail of PS 80 fit well into the hydrophobic pocket while the carbonyl and hydroxyl groups were well placed in proximity to residues R190 and H207 with potential for multiple hydrogen bonds between them (Figure 3D and S2D). Overall, the density is consistent with PS 80 detergent and, given its location, provides a possible explanation for the S1 shift seen in our FL trimer density compared to the published structures.

Structures of dimer-of-trimers and trimer-of-trimers
Further classification of multimeric trimer particles yielded two separate classes: a dimer-of-trimers class that reconstructed to a final resolution of 4.5 Å with 2-fold symmetry and a trimer-of-trimers class that was resolved to 8.0 Å resolution (Figure 4A, 4B and S3A).
The presence of the trimer-of-trimers class revealed that each spike trimer had the ability to interact with multiple trimers simultaneously. In both reconstructions, the interaction between each pair of trimers involved the SD2 of one protomer from each trimer engaging with the NTD of the adjacent trimer (Figure 4C). Consequently, each trimer pair is symmetrical along a 2-fold axis with trimer axes tilted 44.5 degrees relative to each other. The atomic model of the dimer-of-trimers EM density revealed that the interaction was mainly coordinated by the 615-635 loop. Although most of the loop residues were too flexible to resolve in the free trimer density map, the inter-trimer interaction stabilized the loop so that it could be fully resolved (Figure 4D). The loop reaches into a pocket on the adjacent NTD, with loop residues 621-PVAIHADQ-628 interacting with NTD residues Q183, H146, Y248, L249, V70 and S71 (Figure 4D). We observed subtle changes in the NTD binding pocket in the loop-bound state compared to the free trimer model that allow better accommodation of the loop in the pocket. The residues Y145 and H146 in the binding pocket appear to switch positions in the loop-bound state, resulting in a salt bridge interaction between H146 and D627 and potential stacking between W152 and H146 (Figure 4E). We also observed minor displacement of residues 68-75 and 248-250 surrounding the pocket. In addition to the main loop interaction resulting in higher-order oligomers, we also observed N282 glycans extending out towards the symmetry related chain in the adjacent trimer (Figure S3B). Parts of the glycans are stabilized and visible in the dimer-of-trimers map.
Due to the close proximity, these glycans might form hydrogen bonds with the symmetry related chain but it is unclear if these glycan interactions aid in the stability of the dimer-of-trimers.
To investigate if the residues involved in trimer-trimer interactions are conserved across CoV strains belonging to lineage B of betacoronaviruses, we performed sequence alignment of residues in the loop and corresponding NTD binding pocket across representative strains (Figure S3C). While the loop residues 621-PVAIHADQ-628 are well [...] (Figure S3E).
To evaluate if the multimerization phenomenon observed in the full-length spike construct played a role in virus replication, we performed pseudovirus replication assays with SARS-CoV-2 wild-type (WT) spike and two mutant spikes. In mutant 1, the loop residues 621-PVAIHADQ-628 were replaced with a glycine-serine linker to completely knock out binding to the NTD, and in mutant 2, residues 619-EVPV-622 of SARS-CoV-2 were reverted to residues 619-DVST-622 of SARS-CoV-1. Pseudoviruses containing either WT or mutant spikes were generated in HEK293T cells and used to infect HeLa or HeLa-ACE2 cells. While the WT and mutant 2 exhibited similar levels of infection, we observed no detectable levels of infection for mutant 1, in which all contact residues on the loop were replaced by a GS linker (Figure 4F).
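The two loop mutants can be written out explicitly on the ten-residue stretch 619-628, whose wild-type sequence (EVPVAIHADQ) follows from the two motifs quoted above. The following is a minimal Python sketch and not the cloning procedure used in this study: the helper `replace_range` is hypothetical, and the eight-residue `GSGSGSGS` linker is an assumption made for illustration, since the exact linker sequence is not given.

```python
# Wild-type spike residues 619-628, assembled from the motifs quoted in the text:
# 619-EVPV-622 and 621-PVAIHADQ-628.
WT_FRAGMENT = "EVPVAIHADQ"
FRAGMENT_START = 619  # spike residue number of the first letter in WT_FRAGMENT


def replace_range(fragment, start, first, last, new):
    """Replace residues first..last (inclusive, spike numbering) with `new`."""
    i = first - start
    j = last - start + 1
    if i < 0 or j > len(fragment) or i >= j:
        raise ValueError("residue range lies outside the fragment")
    return fragment[:i] + new + fragment[j:]


# Mutant 1: loop residues 621-628 replaced by a glycine-serine linker.
# ASSUMPTION: the linker sequence below is illustrative only.
ASSUMED_GS_LINKER = "GSGSGSGS"
mutant1 = replace_range(WT_FRAGMENT, FRAGMENT_START, 621, 628, ASSUMED_GS_LINKER)

# Mutant 2: residues 619-622 (EVPV) reverted to the SARS-CoV-1 residues DVST.
mutant2 = replace_range(WT_FRAGMENT, FRAGMENT_START, 619, 622, "DVST")

print("WT      ", WT_FRAGMENT)
print("mutant 1", mutant1)
print("mutant 2", mutant2)
```

Written this way, it is easy to see that mutant 2 leaves six of the eight contact residues (623-AIHADQ-628) untouched, whereas mutant 1 removes all of them.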

Structural comparison to SARS-CoV-2-3Q-FL spike
To investigate if the absence of 2 stabilizing proline mutations impacted the spike stability and formation of higher order multimers, we performed cryo-EM studies of the SARS-CoV-2-3Q-FL (without 2P) protein formulated in PS 80 detergent. The raw micrographs and 2D classes revealed the presence of free trimers as well as trimer-trimer complexes as observed with 3Q-2P-FL, indicating that the proline stabilization is not necessary for the formation of these higher order complexes (Figure S4A). The 3D reconstruction of free trimers was refined to 4 Å resolution with C3 symmetry imposed, as the RBDs were present in the closed conformation, similar to 3Q-2P-FL (Figure S4B). Fitting the 3Q-2P-FL model into the 3Q-FL map revealed an identical conformation of the spike protein, further supporting that the presence of 2P in the full-length immunogen does not lead to any structural changes in the spike protein (Figure S4C).

Glycan occupancy and glycan processing
Glycans on viral glycoproteins play wide-ranging roles in protein folding, stability, immune recognition and potentially in immune evasion. Site-specific glycosylation of the SARS-CoV-2 prefusion spike protein produced in SF9 insect cells was analyzed using our recently described mass spectrometry proteomics-based method, which involves treatment with proteases followed by sequential treatment with the endoglycosidases (Endo H and PNGase F) to introduce mass signatures in peptides with N-linked sequons (Asn-X-Thr/Ser), to assess the extent of glycosylation and the degree of glycan processing from high mannose/hybrid type to complex type (24). Although the method was developed to assess the degree of processing of N-linked glycans in mammalian cells, it is also applicable for analyzing glycosylation in SF9 insect cells. The primary differences in processing of N-linked glycans in SF9 insect cells are: 1) the production of truncated paucimannose glycans, and 2) the potential to introduce either one (α1,6) or two (α1,6/α1,3) fucose substitutions into the core GlcNAc attached to Asn. Although α1,3 fucose substitution is known to prevent cleavage by PNGase F (25), this is not a factor when analyzing glycosylation from SF9 cells since they contain α1,6-fucosyltransferase, which is found in mammalian cells, but contain only trace amounts of α1,3-fucosyltransferase activity, if any (26). The paucimannose glycans are highly processed like complex type glycans and are not cleaved by Endo H, but are cleaved by PNGase F. Thus, for SF9 insect cell-produced glycoproteins, the use of endoglycosidases to introduce mass signatures is analogous to analysis of glycoproteins produced in mammalian cells: Endo H removes high mannose/hybrid glycans, leaving a GlcNAc-Asn (+203); subsequent treatment with PNGase F in O18 water removes the remaining paucimannose and complex type glycans while converting Asn to Asp (+3); and the Asn of unoccupied sites remains unaltered (+0).
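The three mass signatures described above amount to a small lookup from the observed mass shift on a sequon Asn to a glycan class. The Python sketch below illustrates that logic under stated assumptions and is not the published analysis pipeline: the function names, the 0.5 Da matching tolerance, and the exclusion of Pro at the X position of the sequon (a common convention; the text only writes "X") are all assumptions introduced here.

```python
import re

# Nominal mass shifts (Da) on Asn, as described in the text.
SIGNATURES = {
    203: "high mannose/hybrid (Endo H leaves GlcNAc-Asn)",
    3: "complex/paucimannose (PNGase F in O18 water, Asn -> Asp)",
    0: "unoccupied (Asn unaltered)",
}

# N-linked sequon Asn-X-Thr/Ser. ASSUMPTION: X is any residue except Pro.
# The lookahead keeps overlapping sequons (e.g. "NNST") from being missed.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence)]


def classify_shift(observed_shift, tolerance=0.5):
    """Map an observed Asn mass shift (Da) to one of the three signatures.

    ASSUMPTION: `tolerance` is an illustrative matching window in Da.
    """
    for nominal, label in SIGNATURES.items():
        if abs(observed_shift - nominal) <= tolerance:
            return label
    return "unassigned"


if __name__ == "__main__":
    toy_peptide = "ANGTNPSNNS"  # made-up sequence, for illustration only
    print(find_sequons(toy_peptide))
    for shift in (203.08, 2.99, 0.0):
        print(shift, "->", classify_shift(shift))
```

Because the three nominal shifts are far apart, any reasonable tolerance separates them cleanly; the tolerance matters only for rejecting unrelated modifications.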
Our analysis detected glycosylation at all 22 potential N-linked glycan sequons present on SARS-CoV-2 spike (Figure 4G). Overall, there was high glycan occupancy of >98%, with only two sites, 603 and 657, more than 5% unoccupied. Interestingly, we [...]

Discussion
The coronavirus disease (COVID-19) caused by SARS-CoV-2 poses a serious health threat and was declared a pandemic by the World Health Organization (WHO). In quick response to this rapidly evolving situation, several SARS-CoV-2 spike-based vaccine candidates are being developed and tested at various stages of clinical trials (4)(5)(6). In this study, we performed structural analysis of the Novavax SARS-CoV [...] Structural analysis of the 3Q-2P full-length spike immunogen revealed several important findings. The first is the stabilization and shift of the S1 subunit compared to published structures (17,18). Although the cause of this shift is unclear, the presence of PS 80 wedged in the NTD may stabilize the alternate conformation. Notably, another recent study observed differences in NTD conformations as a function of pH (32). We also observed inter-trimer interactions between SARS-CoV-2 full-length spike proteins for the first time. The 615-635 loop, which is generally disordered in free trimers, engages the NTD of the adjacent trimer in a well-ordered conformation. The binding pocket on the NTD lies adjacent to the putative glycan binding site and shows subtle differences between the apo and loop-bound conformations. The slight displacement of loop residues 68-75 and the reversed positions of Y145 and H146 in the bound state not only allow better accommodation of the loop in the pocket, but also enable the formation of a salt bridge between H146 and D627.
Importantly, both these findings were seen in the full-length spike immunogen assembled into compact and dense nanoparticles, which may play a role in both the observed S1 shift and the formation of higher order spike multimers. Cryo-electron tomographic reconstructions of intact SARS-CoV-2 virions showed a relatively dispersed distribution of spike protein trimers on the viral surface and no evidence of higher order aggregates (33). However, another study has shown that the D614G mutation, present in close proximity to the dimerization loop, results in a several fold increase of spike numbers on the viral surface, resulting in higher spike protein density and a more infectious virion (20). The greater density may be aided by the ability to form such higher order multimers and may also serve to block access to epitopes on the more conserved S2 component of the spike, thereby facilitating immune evasion. Alternatively, the loop that mediates inter-spike interactions may play a role in viral viability, consistent with our data showing that replacement of the loop with a GS linker completely abrogated viral infectivity.
We also observed two non-spike densities within the spike trimer that corresponded with linoleic acid and polysorbate 80 detergent. Linoleic acid, an essential free fatty acid, was buried within a hydrophobic pocket in the CTD with its main chain carboxyl group making contacts with the adjacent RBD in the closed conformation. A recent report by Toelzer et al. also identified this density and attributed it to the presence of linoleic acid (34). The second, larger density occupied by PS 80 is situated in the NTD and is more surface exposed. Since PS 80 is unique to the formulation of the Novavax 3Q-2P-FL immunogen, this observation is specific to this structure. However, other ligands may occupy this pocket in place of PS 80. The presence of these binding pockets for different ligands in the spike structure provides potential targets for drug design against SARS-CoV-2.
The widely used SARS-CoV-2 spike ectodomain construct with mutated cleavage site and 2P substitution has been shown to exist partly in an all-RBD 'down' conformation and partly in a one-RBD 'up' conformation (17,18). Surprisingly, we observed that all the RBDs in the 3Q-2P-FL spike immunogen were present in a down conformation, which could be a cause for concern for eliciting neutralizing antibodies that compete with ACE2 binding.
However, binding analysis of the 3Q-2P-FL immunogen to ACE2 by both bio-layer interferometry and ELISA clearly shows binding to ACE2, indicating that the RBD is [...] Our structural work is consistent with the burgeoning body of structures available of the spike protein, albeit with the important differences described above. Hence, this advanced protein subunit vaccine candidate currently being tested in humans appears stable, homogeneous, and locked in the antigenically preferred prefusion conformation.

ACKNOWLEDGMENTS
We thank Bill Anderson, Hannah L. Turner and Charles A. Bowman for their help with electron microscopy, data acquisition and data processing. We thank Bill Webb and Linh Truc Hoang for their assistance with mass spectrometry and data processing. We thank Lauren Holden for her assistance with the manuscript. Authors would also like to thank

Materials and Methods
Figures S1 to S4, Table S1, References (41-58)

Materials and Methods Supplementary Text
Figs. S1 to S4, Table S1

Negative-stain EM sample preparation and data collection. Equal concentrations of SARS-CoV-2-3Q-2P full-length spike formulated in PS80 and Matrix adjuvant were diluted to approximately 20 µg/mL with TBS. The sample was directly deposited onto carbon-coated 400-mesh copper grids and stained immediately with 2% (w/v) uranyl formate for 90 seconds. Grids were imaged at 120 keV on a Tecnai T12 Spirit with a 4k x 4k Eagle CCD camera at 52,000x magnification and -1.5 μm nominal defocus. Micrographs were collected using Leginon and the images were transferred to Appion for processing (41,42). Particle stacks were generated in Appion with particles picked using a difference-of-Gaussians picker (DoG-picker) and 2D classes generated by MSA/MRA (43,44). [...] to -2.3 µm for 3Q spike (45). MotionCor2 was used for alignment and dose weighting of the frames (46). Micrographs were transferred to CryoSPARC 2.9 for further processing (47). CTF estimations were performed using GCTF and micrographs were selected using the Curate Exposures tool in CryoSPARC based on their CTF resolution estimates (cutoff 5 Å) for downstream particle picking, extraction and iterative rounds of 2D classification and selection (48). Particles selected from 2D classes were used for 3D refinement of free trimers for 3Q-2P-FL and 3Q-FL datasets in CryoSPARC.
Final subsets of clean trimer particles were refined with C3 symmetry and local resolution for the free trimer was calculated using the local resolution function in CryoSPARC. Particles corresponding to dimers-of-trimers classes in CryoSPARC were transferred to Relion 3.0 for iterative rounds of 3D classification to separate dimers-of-trimers and trimers-of-trimers (49). Final subsets of clean particles from dimers-of-trimers class were refined with C2 symmetry and the trimers-of-trimers class with C1 symmetry.

Cryo-EM sample preparation. For SARS-CoV
Model building and refinement. The 3.6 Å C3-symmetric free trimer map and the 4.5 Å C2-symmetric dimers-of-trimers map were used for model building and refinement. Initial model building was performed manually in Coot using PDB 6VXX as a template, followed by iterative rounds of Rosetta relaxed refinement and Coot manual refinement to generate the final models (50,51). EMRinger and MolProbity were run following each round of Rosetta refinement to evaluate and choose the best refined models (52,53). The coordinates were manually placed and refined into the respective map densities using Coot. For Rosetta refinement, each ligand was saved in MOL2 format and Rosetta parameter files were generated using the molfile_to_params.py function (51). Final map and model statistics are summarized in Table S1. Figures were generated using UCSF Chimera and UCSF ChimeraX (55,56).
Mass-spectrometry. Mass spectrometry to identify fatty acids in the SARS-3Q-2P-FL protein was performed as described previously (21). We obtained several candidates in this screen that were narrowed down to 6 candidates based on their intensity and the m/z range of 250-300.
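As a plausibility check on the m/z 250-300 window, the monoisotopic masses of the two fatty acids named in this work (palmitoleic acid, C16H30O2, and linoleic acid, C18H32O2) can be computed from their formulas. The Python sketch below is illustrative only: the ionization mode is not stated in the text, so the deprotonated ion [M-H]- is an assumption, and the function names are hypothetical.

```python
# Monoisotopic atomic masses (Da) for the elements needed here.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647  # proton mass (Da)

FATTY_ACIDS = {
    "palmitoleic acid": {"C": 16, "H": 30, "O": 2},
    "linoleic acid": {"C": 18, "H": 32, "O": 2},
}


def monoisotopic_mass(formula):
    """Sum monoisotopic atomic masses over an {element: count} formula."""
    return sum(MONO[element] * count for element, count in formula.items())


def mz_deprotonated(formula):
    """m/z of the singly charged [M-H]- ion.

    ASSUMPTION: negative-ion mode; the text does not state the mode used.
    """
    return monoisotopic_mass(formula) - PROTON


def in_window(mz, low=250.0, high=300.0):
    """True if mz falls inside the screening window quoted in the text."""
    return low <= mz <= high


if __name__ == "__main__":
    for name, formula in FATTY_ACIDS.items():
        mz = mz_deprotonated(formula)
        print(f"{name}: [M-H]- m/z = {mz:.4f}, within 250-300: {in_window(mz)}")
```

Under this assumption both fatty acids fall inside the window, so the m/z range alone does not distinguish them; the candidate's exact mass and intensity are still needed.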

Site-specific glycosylation
A sample of the SARS-CoV-2 prefusion spike protein expressed in the SF9 insect cell line was prepared for MS analysis as previously described with minor modifications (24).
In brief, the protein (50 µg) was denatured and aliquots (10 µg) were digested under five different protease conditions including chymotrypsin, a combination of trypsin and chymotrypsin, trypsin, elastase and subtilisin as described. All samples were then pooled and deglycosylated by Endo H followed by PNGase F in O18-water. To obtain full site coverage, an additional aliquot of the denatured protein (10 µg) sample was digested with chymotrypsin (1:13 w/w) only and deglycosylated with EndoH and PNGase F like the other samples.
The combined protease-treated and chymotrypsin-only samples were separately analyzed on a Q Exactive HF-X mass spectrometer (Thermo). Each sample was run twice as replicates. Samples were injected directly onto a 25 cm, 100 μm ID column packed with BEH 1.7 μm C18 resin (Waters). Samples were separated at a flow rate of 300 nL/min on a nLC 1200 (Thermo). Solutions A and B were 0.1% formic acid in 5% and 80% acetonitrile, respectively. A gradient of 1-25% B over 160 min, an increase to 40% B over 40 min, an increase to 90% B over another 10 min and a hold at 90% B for 30 min was used for a 240 min total run time. The column was re-equilibrated with solution A prior to the injection of sample. Peptides were eluted directly from the tip of the column and nanosprayed directly into the mass spectrometer by application of 2.8 kV voltage at the back of the column. The HF-X was operated in a data dependent mode. Full MS1 scans were collected in the Orbitrap at 120k resolution. The ten most abundant ions per scan were selected for HCD MS/MS at 25 NCE. Dynamic exclusion was enabled with exclusion duration of 10 s and singly charged ions were excluded.
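The gradient described above can be written as a segment table, which makes it easy to confirm the 240 min total and to look up the mobile-phase composition at any time point. The Python sketch below assumes linear ramps within each segment and ideal mixing of solutions A (5% acetonitrile) and B (80% acetonitrile); the function names are hypothetical and are not part of any instrument software.

```python
# (duration in min, %B at segment start, %B at segment end), taken from the text.
SEGMENTS = [
    (160, 1, 25),   # 1-25% B over 160 min
    (40, 25, 40),   # increase to 40% B over 40 min
    (10, 40, 90),   # increase to 90% B over 10 min
    (30, 90, 90),   # hold at 90% B for 30 min
]


def total_run_time(segments=SEGMENTS):
    """Total gradient length in minutes."""
    return sum(duration for duration, _, _ in segments)


def percent_b(t_min, segments=SEGMENTS):
    """%B at time t_min, assuming a linear ramp within each segment."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient")


def acetonitrile_percent(b_percent, acn_in_a=5.0, acn_in_b=80.0):
    """Effective % acetonitrile for a given %B, assuming ideal mixing."""
    frac_b = b_percent / 100.0
    return acn_in_a * (1.0 - frac_b) + acn_in_b * frac_b


if __name__ == "__main__":
    print("total run time (min):", total_run_time())
    for t in (0, 160, 180, 200, 210, 240):
        b = percent_b(t)
        print(f"t = {t:3d} min: {b:5.1f}% B, {acetonitrile_percent(b):5.2f}% acetonitrile")
```

For example, 25% B at the end of the long first segment corresponds to roughly 24% acetonitrile under these assumptions.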
The MS data were processed essentially as described previously (24). The data were searched against the proteome database and quantified using peak area in Integrated Proteomics Pipeline-IP2. Since the processing pathway in SF9 cell line (insect cell line) is similar to mammalian cells for oligomannose and hybrid structures cleaved by Endo-H, and then diverges to produce a combination of paucimannose and complex type glycans, peptides with N+203 were identified as having oligomannose type glycans, and peptides with N+3 were assigned as peptides with complex and paucimannose type glycans.
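Once peak areas have been assigned to the three peptide forms at a given site (N+203, N+3 and unmodified N), the per-site proportions follow from simple normalization. The Python sketch below shows one way this could be done and is not the IP2 workflow itself; the function name and the example peak areas are made up for illustration.

```python
def site_proportions(area_n203, area_n3, area_n0):
    """Convert peak areas for one sequon into percentages.

    area_n203: peptides carrying N+203 (oligomannose/hybrid)
    area_n3:   peptides carrying N+3 (complex and paucimannose)
    area_n0:   peptides with unmodified Asn (unoccupied site)
    """
    areas = (area_n203, area_n3, area_n0)
    if any(a < 0 for a in areas):
        raise ValueError("peak areas must be non-negative")
    total = sum(areas)
    if total == 0:
        raise ValueError("no signal detected for this site")
    oligomannose = 100.0 * area_n203 / total
    processed = 100.0 * area_n3 / total
    unoccupied = 100.0 * area_n0 / total
    return {
        "oligomannose_pct": oligomannose,
        "complex_paucimannose_pct": processed,
        "unoccupied_pct": unoccupied,
        "occupancy_pct": oligomannose + processed,
    }


if __name__ == "__main__":
    # Hypothetical peak areas (arbitrary units), not data from this study.
    result = site_proportions(area_n203=2.0, area_n3=7.5, area_n0=0.5)
    for key, value in result.items():
        print(f"{key}: {value:.1f}")
```

With these made-up areas the site would be 95% occupied, i.e. right at the 5% unoccupied threshold used above to flag sites 603 and 657.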

Figure S3. Cryo-EM validation and analysis of SARS-CoV-2 3Q-2P-FL spike dimers-of-trimers. (A)
FSC curve for 3Q-2P-FL dimers-of-trimers map imposing C2 symmetry. (B) N282 glycans extending out from each trimer towards the symmetry related chain in the adjacent trimer. The adjacent spike trimers are shown in pink and blue as ribbon representation and their corresponding cryo-EM density shown in transparent gray as surface representation. (C) Alignment of spike sequences from representative lineage B beta-CoV strains performed using Clustal Omega. The loop residues 621-PVAIHADQ-628 are highlighted by a yellow box, the D614 residue highlighted by a blue box, the loops surrounding the NTD binding pocket are highlighted by a coral box and the potential interacting residues are underlined in black. (D) Surface representation of MERS spike (PDB ID: 6Q04) in tan color bound to 5-N-acetyl neuraminic acid shown in yellow. The binding site is colored in cyan. (E) Interaction between the protomers of adjacent trimers in the 3Q-2P-FL dimers-of-trimers model. One protomer is shown as a ribbon diagram in blue while its binding partner is shown as surface representation in gray. Residues 621-PVAIHADQ-628 on the loop with potential interactions are colored yellow and the corresponding residues in the NTD binding pocket are highlighted in coral. Spike residues predicted in glycan binding are colored in cyan.