Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease.

SARS-CoV-2 is the etiological agent responsible for the global COVID-19 outbreak. The main protease (Mpro) of SARS-CoV-2 is a key enzyme that plays a pivotal role in mediating viral replication and transcription. We designed and synthesized two lead compounds (11a and 11b) targeting Mpro. Both exhibited excellent inhibitory activity and potent anti-SARS-CoV-2 infection activity. The X-ray crystal structures of SARS-CoV-2 Mpro in complex with 11a or 11b, both determined at 1.5 Å resolution, showed that the aldehyde groups of 11a and 11b are covalently bound to Cys145 of Mpro. Both compounds showed good PK properties in vivo, and 11a also exhibited low toxicity, suggesting that these compounds are promising drug candidates.

(Page numbers not final at time of first release)
teins) and other accessory proteins (15,16). Therefore, these proteases, especially Mpro, play a vital role in the life cycle of coronavirus.
Mpro is a three-domain (domains I to III) cysteine protease involved in most maturation cleavage events within the precursor polyprotein (17-19). Active Mpro is a homodimer containing two protomers. The CoV Mpro features a noncanonical Cys-His dyad located in the cleft between domains I and II (17-19). Mpro is conserved among CoVs, and several common features are shared among the substrates of Mpro in different CoVs. The amino acids in substrates from the N terminus to the C terminus are numbered as follows (-P4-P3-P2-P1↓P1′-P2′-P3′-), and the cleavage site is between P1 and P1′. In particular, a Gln residue is almost always required in the P1 position of the substrates. There is no human homolog of Mpro, which makes it an ideal antiviral target (20-22).
The active sites of Mpro are highly conserved among all CoV Mpros and are usually composed of four sites (S1′, S1, S2 and S4) (22). By analyzing the substrate-binding pocket of SARS-CoV Mpro (PDB ID: 2H2Z), novel inhibitors targeting the SARS-CoV-2 Mpro were designed and synthesized (Fig. 1). The thiol of a cysteine residue in the S1′ site anchors inhibitors by a covalent linkage that is important for the inhibitors to maintain antiviral activity. In our design of new inhibitors, an aldehyde was selected as a new warhead in P1 in order to form a covalent bond with cysteine. The reported SARS-CoV Mpro inhibitors often have an (S)-γ-lactam ring that occupies the S1 site of Mpro, and this ring was expected to be a good choice in P1 (23). Furthermore, the S2 site of coronavirus Mpro is usually large enough to accommodate a bigger P2 fragment. To test the importance of different ring systems, a cyclohexyl or 3-fluorophenyl group was introduced in P2, with the fluorine expected to enhance activity. An indole group was introduced into P3 in order to form new hydrogen bonds with S4 and improve drug-like properties.
The synthetic route and chemical structures of the compounds (11a and 11b) are shown in scheme S1. The starting material (N-Boc-L-glutamic acid dimethyl ester 1) was obtained from commercial suppliers and used without further purification to synthesize the key intermediate 3 according to the literature (24). The intermediates 6a and 6b were synthesized from 4 and the acids 5a and 5b. Removal of the t-butoxycarbonyl group from 6a and 6b yielded 7a and 7b. Coupling 7a and 7b with the acid 8 yielded the esters 9a and 9b. The peptidomimetic aldehydes 11a and 11b were obtained through a two-step route in which the ester derivatives 9 were first reduced with NaBH4 to generate the primary alcohols 10a and 10b, which were subsequently oxidized into the aldehydes 11a and 11b with Dess-Martin periodinane (DMP).
Recombinant SARS-CoV-2 Mpro was expressed in and purified from Escherichia coli (E. coli) (18,25). A fluorescently labeled substrate, MCA-AVLQ↓SGFR-Lys(Dnp)-Lys-NH2, derived from the N-terminal auto-cleavage sequence of the viral protease, was designed and synthesized for the enzymatic assay.
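The P/P′ numbering described earlier can be applied directly to the cleavage sequence of this assay substrate. The following Python sketch labels the residues flanking the scissile bond in AVLQ↓SGFR. The function name `label_subsites` and the use of `↓` as the cleavage marker in the input string are illustrative assumptions, not part of the original methods.

```python
def label_subsites(substrate: str, marker: str = "↓") -> dict[str, str]:
    """Map each residue of a one-letter peptide to its P/P' position.

    Residues N-terminal to the cleavage marker are numbered P1, P2, ...
    moving away from the scissile bond; residues C-terminal to it are
    numbered P1', P2', ... in the same way.
    """
    n_side, c_side = substrate.split(marker)
    labels = {}
    # Walk the N-terminal side backwards so the residue next to the bond is P1.
    for i, residue in enumerate(reversed(n_side), start=1):
        labels[f"P{i}"] = residue
    # Walk the C-terminal side forwards so the residue next to the bond is P1'.
    for i, residue in enumerate(c_side, start=1):
        labels[f"P{i}'"] = residue
    return labels


sites = label_subsites("AVLQ↓SGFR")
for position in ["P4", "P3", "P2", "P1", "P1'", "P2'", "P3'", "P4'"]:
    print(position, sites[position])
```

With this labeling, P1 is Gln (Q), consistent with the Gln requirement at P1 noted for Mpro substrates.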
Both 11a and 11b exhibited high SARS-CoV-2 Mpro inhibition activity, reaching 100% for 11a and 96% for 11b at 1 µM. We used a fluorescence resonance energy transfer (FRET)-based cleavage assay to determine the IC50 values. The results revealed excellent inhibitory potency, with IC50 values of 0.053 ± 0.005 µM and 0.040 ± 0.002 µM for 11a and 11b, respectively (Fig. 2).
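As a rough consistency check between the single-concentration screen and the fitted IC50 values, the sketch below evaluates a simple one-site dose-response (Hill) model at 1 µM. The Hill slope of 1 and the assumption of reversible, equilibrium inhibition are ours, not the authors'. Covalent aldehyde inhibitors can show time-dependent behavior that this model ignores, so the numbers are indicative only.

```python
def percent_inhibition(conc_um: float, ic50_um: float, hill: float = 1.0) -> float:
    """Percent inhibition predicted by a one-site Hill model.

    inhibition = 100 / (1 + (IC50 / [I]) ** hill)
    hill = 1.0 is an assumption; the fitted slope is not given in the text.
    """
    return 100.0 / (1.0 + (ic50_um / conc_um) ** hill)


# IC50 values reported in the text (µM).
ic50 = {"11a": 0.053, "11b": 0.040}

for name, value in ic50.items():
    print(f"{name}: {percent_inhibition(1.0, value):.1f}% predicted at 1 µM")
```

Under these assumptions the model predicts roughly 95% (11a) and 96% (11b) inhibition at 1 µM, in the same range as the measured 100% and 96%.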
In order to elucidate the mechanism of inhibition of SARS-CoV-2 Mpro by 11a, we determined the crystal structure of this complex at 1.5-Å resolution (table S1). The crystal of Mpro-11a belongs to space group C2, and the asymmetric unit contains only one molecule (table S1). Two molecules (designated protomer A and protomer B) associate into a homodimer around a crystallographic 2-fold symmetry axis (fig. S2). The structure of each protomer contains three domains, with the substrate-binding site located in the cleft between domains I and II. At the active site of SARS-CoV-2 Mpro, Cys145 and His41 (Cys-His) form a catalytic dyad (fig. S2).
The electron density map clearly showed compound 11a in the substrate-binding pocket of SARS-CoV-2 Mpro in an extended conformation (Fig. 3A and fig. S3, A and B). Details of the interaction are shown in Fig. 3, B and C. The electron density shows that the carbon atom of the aldehyde group of 11a and the catalytic-site Cys145 of SARS-CoV-2 Mpro form a standard 1.8-Å C-S covalent bond. The oxygen atom of the aldehyde group also plays a crucial role in stabilizing the conformation of the inhibitor by forming a 2.9-Å hydrogen bond with the backbone of residue Cys145 in the S1′ site. The (S)-γ-lactam ring of 11a at P1 fits well into the S1 site. The oxygen of the (S)-γ-lactam group forms a 2.7-Å hydrogen bond with the side chain of His163. The main chain of Phe140 and the side chain of Glu166 also participate in stabilizing the (S)-γ-lactam ring by forming 3.2-Å and 3.0-Å hydrogen bonds with its NH group, respectively. In addition, the amide bonds on the chain of 11a are hydrogen-bonded with the main chains of His164 (3.2 Å) and Glu166 (2.8 Å). The cyclohexyl moiety of 11a at P2 inserts deeply into the S2 site, stacking with the imidazole ring of His41. The cyclohexyl group is also surrounded by the side chains of Met49, Tyr54, Met165, Asp187 and Arg188, producing extensive hydrophobic interactions. The indole group of 11a at P3 is exposed to solvent (S4 site) and is stabilized by Glu166 through a 2.6-Å hydrogen bond. The side chains of residues Pro168 and Gln189 interact with the indole group of 11a through hydrophobic interactions. Interestingly, multiple water molecules (named W1-W6) play an important role in binding 11a. W1 interacts with the amide bonds of 11a through a 2.9-Å hydrogen bond, whereas W2-W6 form a number of hydrogen bonds with the aldehyde group of 11a and with residues Asn142, Gly143, Thr26, Thr25, His41 and Cys44, which contribute to stabilizing 11a in the binding pocket.
The crystal structure of SARS-CoV-2 Mpro in complex with 11b is very similar to that of the 11a complex and shows a similar inhibitor binding mode (Fig. 3D and figs. S3, C and D, and S4A). The difference in binding mode is most probably due to the 3-fluorophenyl group of 11b at P2. Compared with the cyclohexyl group in 11a, the 3-fluorophenyl group undergoes a significant downward rotation (Fig. 3D). The side chains of residues His41, Met49, Met165, Val186, Asp187 and Arg188 interact with this aryl group through hydrophobic interactions, and the side chain of Gln189 stabilizes the 3-fluorophenyl group with an additional 3.0-Å hydrogen bond (Fig. 3, E and F). In short, these two crystal structures reveal a similar inhibitory mechanism in which both compounds occupy the substrate-binding pocket and block the enzyme activity of SARS-CoV-2 Mpro.
Compared with those of N1, N3 and N9 in SARS-CoV Mpro complex structures reported previously, the binding modes of 11a and 11b in SARS-CoV-2 Mpro complex structures are similar, and the differences among these overall structures are small (Fig. 4 and fig. S4, B to F) (22). The differences mainly lie in the interactions at the S1′, S2 and S4 subsites, possibly due to the various sizes of the functional groups at the corresponding P1′, P2 and P4 sites in the inhibitors (Fig. 4, A and C).
To further substantiate the enzyme inhibition results, we evaluated the ability of these compounds to inhibit SARS-CoV-2 in vitro (Fig. 5 and fig. S5). As shown in Fig. 5, compounds 11a and 11b exhibited good anti-SARS-CoV-2 infection activity in cell culture, with EC50 values of 0.53 ± 0.01 µM and 0.72 ± 0.09 µM, respectively, in a plaque-reduction assay. Neither compound caused significant cytotoxicity, with half cytotoxic concentration (CC50) values of >100 µM, yielding selectivity indices (SI) for 11a and 11b of >189 and >139, respectively. Immunofluorescence and quantitative real-time PCR were also employed to monitor the antiviral activity of 11a and 11b. The results show that 11a and 11b exhibit a good antiviral effect on SARS-CoV-2 (Fig. 5 and fig. S5).
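The selectivity indices quoted above follow from the ratio SI = CC50 / EC50. The short Python sketch below reproduces that arithmetic from the values in the text. Because CC50 is reported only as a lower bound (>100 µM), the result is a lower bound on SI as well. The function and variable names are illustrative.

```python
def selectivity_index(cc50_um: float, ec50_um: float) -> float:
    """Selectivity index: cytotoxic concentration divided by antiviral potency."""
    return cc50_um / ec50_um


# CC50 is reported only as ">100 µM", so 100 µM is used as a lower bound.
CC50_LOWER_BOUND_UM = 100.0

# EC50 values from the plaque-reduction assay (µM).
ec50 = {"11a": 0.53, "11b": 0.72}

si_lower = {
    name: selectivity_index(CC50_LOWER_BOUND_UM, value)
    for name, value in ec50.items()
}

for name, si in si_lower.items():
    print(f"{name}: SI > {si:.0f}")
```

This prints `11a: SI > 189` and `11b: SI > 139`, matching the reported values.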
To further explore the druggability of compounds 11a and 11b, both were evaluated for their pharmacokinetic (PK) properties. As shown in table S2, compound 11a given intraperitoneally (5 mg/kg) and intravenously (5 mg/kg) displayed a half-life (T1/2) of 4.27 hours and 4.41 hours, respectively, and a high maximal concentration (Cmax = 2394 ng/mL) and a good bioavailability of 87.8% were observed when 11a was given intraperitoneally. Metabolic stability of 11a in mice was also good (clearance (CL) = 17.4 mL/min/mg). When administered intraperitoneally (20 mg/kg), subcutaneously (5 mg/kg) and intravenously (5 mg/kg), compound 11b also showed good PK properties (bioavailability by the intraperitoneal and subcutaneous routes was more than 80%, and a longer T1/2 of 5.21 hours was observed when 11b was given intraperitoneally). Considering the danger of COVID-19, we selected intravenous drip administration for further study because the area under the curve (AUC) is high and the effect is rapid. Compared with 11a administered intravenously, 11b has a shorter T1/2 (1.65 hours) and a faster clearance (CL = 20.6 mL/min/mg). Compound 11a was selected for further investigation with intravenous drip dosing in Sprague-Dawley (SD) rats and Beagle dogs. The results (table S3) showed that 11a exhibited a long T1/2 (SD rat, 7.6 hours; Beagle dog, 5.5 hours), a low clearance rate (rat, 4.01 mL/min/kg; dog, 5.8 mL/min/kg) and a high AUC value (rat, 41500 hours*ng/mL; dog, 14900 hours*ng/mL). These PK results indicate that compound 11a warrants further study.
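For intravenous dosing, noncompartmental analysis relates clearance, dose and exposure by CL = Dose / AUC. The doses used in the rat and dog drip studies are given in table S3 and are not restated in the text, so the sketch below only back-calculates the dose implied by the reported CL and AUC as a unit-handling check. It assumes the reported AUC is the AUC extrapolated to infinity. The function name is illustrative.

```python
def implied_iv_dose_mg_per_kg(cl_ml_min_kg: float, auc_h_ng_ml: float) -> float:
    """Dose implied by CL = Dose / AUC for an intravenous dose.

    CL is in mL/min/kg and AUC in h*ng/mL, so CL is first converted to
    mL/h/kg; the product is then in ng/kg and is converted to mg/kg.
    """
    cl_ml_h_kg = cl_ml_min_kg * 60.0            # mL/min/kg -> mL/h/kg
    dose_ng_per_kg = cl_ml_h_kg * auc_h_ng_ml   # (mL/h/kg) * (h*ng/mL) = ng/kg
    return dose_ng_per_kg / 1e6                 # ng -> mg


# CL (mL/min/kg) and AUC (h*ng/mL) for 11a as reported in the text.
reported = {
    "SD rat": (4.01, 41500.0),
    "Beagle dog": (5.8, 14900.0),
}

for species, (cl, auc) in reported.items():
    dose = implied_iv_dose_mg_per_kg(cl, auc)
    print(f"{species}: implied dose ~ {dose:.1f} mg/kg")
```

The implied doses come out near 10 mg/kg (rat) and 5 mg/kg (dog). Whether these match the administered doses should be checked against table S3.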
An in vivo toxicity study (table S4) of 11a was carried out in SD rats and Beagle dogs. The acute toxicity of 11a was measured in SD rats. No SD rats died after receiving 40 mg/kg by intravenous drip administration. When the dosage was raised to 60 mg/kg, one of four SD rats died. The dose-range toxicity study of 11a was conducted for seven days at dosing levels of 2, 6, and 18 mg/kg in SD rats and at 10-40 mg/kg in Beagle dogs. All animals received once-daily dosing (QD) by intravenous drip, and all animals were clinically observed at least once a day. No obvious toxicity was observed in either group. These data indicate that 11a is a good candidate for further clinical studies.
The cytotoxicity of these compounds in Vero E6 cells was also determined by using CCK8 assays. The left and right y-axes of the graphs represent mean % inhibition of virus yield and mean % cytotoxicity of the drugs, respectively. (C and D) Viral RNA copy numbers in the cell supernatants were quantified by qRT-PCR. Data are mean ± SD, n = 3 biological replicates.