De novo design of peptides that coassemble into β sheet–based nanofibrils


Discontinuous molecular dynamics (DMD) simulation and PRIME20 model
Our simulation method is discontinuous molecular dynamics (DMD), a fast alternative to traditional molecular dynamics. It is used in conjunction with PRIME20, an implicit-solvent coarse-grained protein force field developed in the Hall group and tailored to DMD simulations of peptide aggregation. In the PRIME20 model, each residue is represented by three backbone beads (NH, Cα, CO) and one sidechain sphere (R). The parameter set for the sidechain-sidechain interactions among the twenty amino acid types includes 210 different square-well widths and 19 different square-well depths, which discriminate between the polar, charge-charge and hydrophobic types of interaction. The hydrogen bonding interaction between backbone beads NH and CO is modeled as a directional square-well potential. All other non-bonded interactions are modeled as hard-sphere potentials. A detailed description of the geometric and energetic parameters of the PRIME20 model is provided in our earlier work 36,39.
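The discontinuous potentials described above can be illustrated with a minimal sketch. The numerical values below are hypothetical placeholders, not PRIME20 parameters, and the directional constraints of the hydrogen-bond potential are not modeled.

```python
import math


def hard_sphere(r, diameter):
    """Hard-sphere energy: infinite on overlap, zero otherwise."""
    return math.inf if r < diameter else 0.0


def square_well(r, diameter, well_width, depth):
    """Square-well energy: hard core below `diameter`, constant attraction
    of -depth out to `well_width`, and zero beyond it."""
    if r < diameter:
        return math.inf
    if r <= well_width:
        return -depth
    return 0.0


# Twenty side-chain types give 20 * 21 / 2 unordered pairs, which matches
# the 210 square-well widths quoted in the text.
n_pairs = 20 * (20 + 1) // 2

# Hypothetical values (angstroms and reduced energy units), for illustration only.
print(square_well(5.0, diameter=3.3, well_width=6.0, depth=0.2))
print(n_pairs)
```

Because the energy changes only at discrete distances, a DMD code can advance the system from one collision or well-crossing event to the next instead of integrating forces at fixed time steps.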

Peptide co-assembly design (PepCAD) algorithm
The peptide co-assembly design (PepCAD) algorithm is used for the de novo design of charge-complementary peptide pairs that co-assemble into particular supramolecular architectures, e.g., a single or multiple β-sheet fibril (flat or twisted, parallel or antiparallel, in-register or out-of-register), a β-barrel oligomer, or an α-helix bundle (Supplementary Fig. S2a). A pre-determined molecular architecture, hereafter referred to as the "peptide scaffold", can be of any type and can contain any number of assembling peptides. Supplementary Fig. S2b shows a flowsheet of the PepCAD algorithm: a Monte Carlo-based search to design co-assembling peptides A and B. The procedure is described briefly here.
(1) Random initial sequences, S_A^(0) and S_B^(0), are draped on the peptide backbone scaffold to generate two peptides, A and B.
(2) A score function, Γ, is introduced for this initial structure that takes into account the binding affinity between peptides A and B, as well as the intrinsic aggregation propensities of the individual peptides. Low (negative) scores indicate that the evolved peptides have a strong preference for co-assembly and exhibit weak self-assembly behavior. The score function is described below.
(3) Three different kinds of trial moves, viz. intra-chain residue mutation, intra-chain residue exchange, and inter-chain residue exchange, are employed to mutate the peptide sequences. A random number is drawn at each iteration, i, of the algorithm to determine which type of move is used to generate the new sequences, S_A^(i) and S_B^(i).
(4) Energy minimization is performed to optimize the side-chain configurations of the new sequences, strengthening the binding between the new peptides A and B.
The three trial moves are: (i) intra-chain residue mutation, in which a residue on all trial A (or B) peptides is replaced by a new one of the same type (hydrophobic, hydrophilic or charged) (Fig. 2c, top); (ii) intra-chain residue exchange, in which two residues of the same type (polar "P" or hydrophobic "H") on all trial A (or B) peptides are exchanged (Fig. 2c, middle); and (iii) inter-chain residue exchange, in which a residue of a given type on the A peptides is exchanged with a residue of the same type on the B peptides (Fig. 2c, bottom). New trial peptides A and B are generated whichever sequence move is called in the algorithm.
The twenty standard amino acids are classified into four residue types according to their hydrophobicity, charge, hydrophilicity and structure (Supplementary Table S1). By adjusting the number of residues of each type, we can vary the hydration properties of the peptides. The CATCH(4+) and CATCH(6-) peptides that are of interest to us each contain three hydrophobic residues (N_hydrophobic = 3), three hydrophilic residues (N_hydrophilic = 3), and no residues of other types (N_other = 0). The difference between the two CATCH peptides lies in the number of charged residues: four for CATCH(4+) (N_charge = 4) and six for CATCH(6-) (N_charge = 6). In this work, our co-assembling peptide designs are restricted to N_hydrophobic = 3, N_charge = 5, N_hydrophilic = 3, and N_other = 0. During sequence evolution, the number of residues of each type is fixed so that the evolved peptides keep the desired hydration property. The sequence pattern of our designed peptides is also fixed, at "PPPHPHPHPPP" (Supplementary Fig. S1a), where "H" refers to hydrophobic amino acids and "P" includes both charged and hydrophilic amino acids.
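The three trial moves and the fixed composition constraint can be sketched as follows. This is an illustration under stated assumptions, not the PepCAD implementation: the residue classes are hypothetical stand-ins for Supplementary Table S1, `toy_score` is a placeholder rather than the score function Γ, the Metropolis acceptance rule and `kT` are assumed, and the energy-minimization step is omitted.

```python
import math
import random

# Hypothetical residue classes for illustration; the real classification
# is the one in Supplementary Table S1.
CLASSES = {
    "hydrophobic": "AVLIMFWY",
    "charged": "KRDE",
    "hydrophilic": "STNQ",
}


def fine_class(aa):
    """Return the class name of a one-letter amino acid code."""
    for name, members in CLASSES.items():
        if aa in members:
            return name
    raise ValueError(f"unclassified residue: {aa}")


def hp_pattern(seq):
    """H for hydrophobic residues, P for charged or hydrophilic ones."""
    return "".join("H" if fine_class(aa) == "hydrophobic" else "P" for aa in seq)


def mutate(seq, rng):
    """Intra-chain mutation: replace one residue by another of the same class."""
    i = rng.randrange(len(seq))
    options = [aa for aa in CLASSES[fine_class(seq[i])] if aa != seq[i]]
    return seq[:i] + rng.choice(options) + seq[i + 1:]


def intra_exchange(seq, rng):
    """Intra-chain exchange: swap two positions that share the same H/P type."""
    pattern = hp_pattern(seq)
    i = rng.randrange(len(seq))
    partners = [j for j in range(len(seq)) if j != i and pattern[j] == pattern[i]]
    if not partners:
        return seq
    j = rng.choice(partners)
    s = list(seq)
    s[i], s[j] = s[j], s[i]
    return "".join(s)


def inter_exchange(seq_a, seq_b, rng):
    """Inter-chain exchange: swap same-class residues between A and B."""
    i = rng.randrange(len(seq_a))
    partners = [j for j, aa in enumerate(seq_b) if fine_class(aa) == fine_class(seq_a[i])]
    if not partners:
        return seq_a, seq_b
    j = rng.choice(partners)
    a, b = list(seq_a), list(seq_b)
    a[i], b[j] = b[j], a[i]
    return "".join(a), "".join(b)


def toy_score(seq_a, seq_b):
    """Placeholder score (NOT the paper's score function): counts opposite
    charges at matching positions, negated so that lower is better."""
    opposite = {("K", "D"), ("K", "E"), ("R", "D"), ("R", "E")}
    return -sum((x, y) in opposite or (y, x) in opposite for x, y in zip(seq_a, seq_b))


def search(seq_a, seq_b, score=toy_score, n_iter=500, kT=0.5, seed=0):
    """Monte Carlo sequence search with an assumed Metropolis acceptance rule."""
    rng = random.Random(seed)
    current = score(seq_a, seq_b)
    for _ in range(n_iter):
        move = rng.choice(("mutate", "intra", "inter"))
        if move == "inter":
            trial_a, trial_b = inter_exchange(seq_a, seq_b, rng)
        else:
            step = mutate if move == "mutate" else intra_exchange
            if rng.random() < 0.5:
                trial_a, trial_b = step(seq_a, rng), seq_b
            else:
                trial_a, trial_b = seq_a, step(seq_b, rng)
        trial = score(trial_a, trial_b)
        if trial <= current or rng.random() < math.exp(-(trial - current) / kT):
            seq_a, seq_b, current = trial_a, trial_b, trial
    return seq_a, seq_b, current


# Hypothetical starting sequences that follow the PPPHPHPHPPP pattern.
a0, b0 = "KQKFKFKFKQS", "EQEFEFEFEQS"
a, b, final = search(a0, b0)
print(a, b, final)
```

In this sketch, mutation and inter-chain exchange act within a single residue class, and intra-chain exchange acts within the H or P type, so every accepted sequence keeps the same H/P pattern and the same number of residues of each class per chain.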

Score function
The score function, an essential component of our algorithm, is built to encourage the evolved peptides A and B to co-assemble into amyloid fibrils, and to discourage them from self-assembling when dissolved separately in solution. These two factors are taken into account simultaneously by introducing two energy terms into the score function, balanced by a weighting factor. Too low a value of this weighting factor, e.g. 0.3, restricts the residues chosen in the mutation/exchange moves to certain long-sidechain amino acids that contribute too much binding energy, reducing the diversity of sequence variants. In contrast, too high a value, e.g. 30, makes it hard for the sequence search to converge to an energy-minimum state and leads to excessively large fluctuations in the score profiles. Lower (more negative) values of Γ mean that peptides A and B are more likely to form fibril-like co-aggregates rather than fibril-like self-aggregates.

Binding free energy
The binding free energy (ΔG_binding) accounts for the difference in the interactions between the pairs of peptides A and B when they are assembled in the scaffold and when they are separated, and is defined in equation (2). The configurational entropy term, TS, is computed from the determinant of the variance-covariance matrix of side-chain torsional fluctuations. In that expression, n is the number of torsion angles in a standard amino acid side chain; R is the gas constant; T is the absolute temperature; det stands for the determinant of a matrix; and σ is the variance-covariance matrix of torsional fluctuations, whose dimension is set by the number of torsion angles (n) of a specific amino acid. For example, since methionine has three torsion angles on its side chain (n = 3), the dimension of the matrix σ is (n, n) = (3, 3). The element σ_ij of the variance-covariance matrix is the covariance of the i-th and j-th side-chain torsion angles (i, j = 1, …, n), where the ensemble average 〈…〉 is taken over all possible rotamers of the amino acid when it is repacked during sequence evolution. The covariance σ_ij reflects the correlation between side-chain torsions i and j. Equation (3) is the sum of these terms. The binding free energy ΔG_binding is defined as the difference between the free energy of the bound peptides A and B and their free energy prior to binding. A lower (more negative) value of ΔG_binding indicates stronger binding between peptides A and B. All of the force field parameters used in the calculation of ΔG_binding come from the AMBER ff14SB force field.
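As a sketch of the quantities entering the entropy term, the variance-covariance matrix of side-chain torsions and its determinant can be computed as below. The rotamer angles are hypothetical, the ensemble average is assumed to weight all rotamers equally, and the periodicity of the angles is ignored for simplicity.

```python
def covariance_matrix(samples):
    """Variance-covariance matrix of torsion angles.

    `samples` is a list of rotamers, each a list of n torsion angles.
    Element (i, j) is the mean of (x_i - <x_i>) * (x_j - <x_j>).
    """
    n_samples = len(samples)
    n = len(samples[0])
    means = [sum(s[i] for s in samples) / n_samples for i in range(n)]
    return [
        [
            sum((s[i] - means[i]) * (s[j] - means[j]) for s in samples) / n_samples
            for j in range(n)
        ]
        for i in range(n)
    ]


def determinant(matrix):
    """Determinant by cofactor expansion; adequate for the small matrices
    that arise from a handful of side-chain torsions."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for col, value in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * value * determinant(minor)
    return total


# Hypothetical rotamers (degrees) for a side chain with three torsion
# angles, such as methionine; these are not library values.
rotamers = [
    [-65.0, 175.0, 65.0],
    [-68.0, 180.0, 72.0],
    [-60.0, 170.0, 60.0],
    [-72.0, 178.0, 68.0],
]
sigma = covariance_matrix(rotamers)
print(len(sigma), len(sigma[0]))  # dimension (n, n) = (3, 3)
print(determinant(sigma))
```

A side chain whose torsions fluctuate more, or less cooperatively, gives a larger determinant, which is how the matrix captures the conformational freedom lost or retained on binding.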

Intrinsic self-aggregation propensity
The intrinsic self-aggregation propensity P_agg of a polypeptide, taken over the sites along the chain, can be estimated using the Zyggregator method proposed by the Dobson and Vendruscolo groups 43–45, where I_i^pat accounts for the presence of an alternating hydrophobic-hydrophilic sequence pattern. I_i^pat is 1 if residue i is included in this sequence pattern, and 0 otherwise.
The intrinsic aggregation propensity P_agg of the polypeptide in equation (6) is the sum of equations (7)–(11). It is worth noting that the sequence pattern term I_i^pat can be neglected throughout the entire evolution process, because the sequence pattern of all evolved peptides is unchanged. A lower P_agg indicates a weaker aggregation propensity of the peptides in solution.
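The sequence-pattern indicator can be illustrated with a short sketch. The minimum stretch length `min_length=5` is an assumption made for this example, not a value taken from the text; the function flags residues that lie in an alternating H/P stretch of at least that length.

```python
def pattern_indicator(hp, min_length=5):
    """Return 1 for residues inside an alternating H/P stretch of at least
    `min_length` residues and 0 otherwise. `min_length` is an assumed value."""
    flags = [0] * len(hp)
    start = 0
    for i in range(1, len(hp) + 1):
        # A stretch ends at the chain end or where two neighbors share a type.
        if i == len(hp) or hp[i] == hp[i - 1]:
            if i - start >= min_length:
                for k in range(start, i):
                    flags[k] = 1
            start = i
    return flags


print(pattern_indicator("PPPHPHPHPPP"))
```

Because every evolved peptide keeps the pattern "PPPHPHPHPPP", this indicator is identical for all candidate sequences, which is why the term contributes only a constant offset during the search.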

discovered peptide pairs
We performed large-scale DMD/PRIME20 simulations to evaluate the spontaneous aggregation and co-assembled structures of the six best peptide pairs designed by our PepCAD algorithm. In addition, we performed simulations of the co-assembly of the CATCH peptide pair designed by Seroski et al. 24. All simulations start with random-coil peptides and are carried out for 5 μs in the canonical (NVT) ensemble. The Andersen thermostat is used to maintain the simulation system at a constant temperature. For the peptide co-assembly cases, 100 A and 100 B peptides are initially placed at random in a cubic box with a side length of 321.0 Å, corresponding to a peptide concentration of 10 mM. The reduced temperature is defined as T* = k_B T/ε_HB, where ε_HB = 12.47 kJ/mol is the hydrogen bonding energy. We set the reduced temperature T* of the simulations to 0.195, which corresponds to 330 K in real temperature units. The β-sheet content is defined as the percentage of residues in the whole system that adopt a β-sheet structure, and is calculated using the VMD secondary structure software. For the peptide self-assembly cases, 40 A or 40 B peptides are simulated at the same concentration and temperature as in the co-assembly cases. Each peptide system starts from a random-coil state and is simulated three times.
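The relation between peptide number, box length and concentration quoted above can be checked with a short calculation. The function names are illustrative, and the box length obtained for the 40-peptide systems is an estimate derived here, not a value reported in the text.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def concentration_mM(n_peptides, box_length_angstrom):
    """Molar concentration (mM) of n_peptides in a cubic box."""
    # 1 angstrom = 1e-9 dm, and 1 dm^3 = 1 L
    volume_litres = (box_length_angstrom * 1e-9) ** 3
    return n_peptides / AVOGADRO / volume_litres * 1e3


def box_length_angstrom(n_peptides, conc_mM):
    """Cubic box length (angstroms) that gives the requested concentration."""
    volume_litres = n_peptides / AVOGADRO / (conc_mM * 1e-3)
    return volume_litres ** (1.0 / 3.0) * 1e9


# Co-assembly case from the text: 100 A + 100 B peptides in a 321.0 angstrom box.
print(concentration_mM(200, 321.0))

# Self-assembly case: 40 peptides at the same 10 mM concentration.
print(box_length_angstrom(40, 10.0))
```

The first call gives about 10 mM, consistent with the stated concentration, and the second suggests a box of about 188 Å for the 40-peptide systems.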

Transmission electron microscopy
Nanofibers were prepared by mixing and incubating positively charged peptides with negatively

Fourier-transform infrared spectroscopy (FTIR)
The FTIR spectra were recorded using a universal ATR sampling accessory on a Frontier FTIR spectrophotometer (PerkinElmer). Prior to scanning, the FTIR spectrophotometer was blanked with ultrapure water. Samples were prepared at 15 mM in 1x PBS, and 4 µl of each sample was spotted onto the ATR accessory. Each sample was scanned 50 times, and the average of the spectra is reported.

Solid-state NMR analysis of co-assembled nanofiber samples
For each tested design, nanofiber samples were prepared from equimolar mixtures of peptide A and peptide B at a 10 mM peptide concentration in 1x phosphate-buffered saline (PBS). Samples for Designs 2, 4, and 5 were allowed to incubate for 1 day before centrifugation at 12,100 × g for 5 minutes. Recovered nanofibers were lyophilized prior to packing into 3.2 mm NMR rotors. Finally, samples were minimally rehydrated (1 mg of water per mg of peptide). Because of its lower initial nanofiber yield, the Design 1 mixture was allowed to assemble over 4 days, and nanofibers were recovered by ultracentrifugation directly into the NMR rotor to increase sample yield.
Ultracentrifugation was done at 280 000 × g and 4 °C for 30 min on a Beckman Optima XPN-100 fitted with a SW-41 Ti swinging-bucket rotor and custom-made polycarbonate funnel insert.
The composite-pulse multiCP pulse sequence from Duan et al. was implemented to perform quantitative 1H-13C Cross-Polarization Magic Angle Spinning (CPMAS) measurements 55.
Quantitative CPMAS measurements were run on an 11.75 T Bruker Avance III spectrometer with 100 kHz decoupling and 14 100-μs CP periods to ensure uniform cross polarization. The spinning speed was set to 22 kHz to prevent spectral overlap from spinning sidebands. Reported chemical shifts are relative to tetramethylsilane by calibration with adamantane before each experiment.
Analysis of the chemical shift peaks was done with custom code in Wolfram Mathematica. Peak positions, linewidths, and areas were determined from Lorentzian peak fitting. Chemical shift peak assignments were done by comparison to expected chemical shift peak positions from the BioMagResBank. The ratio of peptide A to peptide B was determined as the ratio of the K γ-carbon (Cγ) peak area to the E δ-carbon (Cδ) peak area adjusted for the number of K and E residues in each sequence.
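The stoichiometry estimate described above amounts to a per-residue normalization of the two peak areas. The sketch below assumes that lysine occurs only in peptide A and glutamate only in peptide B; the peak areas and residue counts are hypothetical, and the Lorentzian is written in an area-normalized form with a half-width at half-maximum, which may differ from the parameterization used in the original Mathematica code.

```python
import math


def lorentzian(x, center, half_width, area):
    """Area-normalized Lorentzian line shape (half_width is the HWHM)."""
    return area * half_width / (math.pi * ((x - center) ** 2 + half_width ** 2))


def peptide_ratio(k_area, e_area, n_k_in_a, n_e_in_b):
    """Ratio of peptide A to peptide B from the K and E peak areas,
    each divided by the number of that residue in its sequence."""
    return (k_area / n_k_in_a) / (e_area / n_e_in_b)


# Hypothetical fitted areas (arbitrary units) and residue counts.
print(peptide_ratio(k_area=4.0, e_area=6.0, n_k_in_a=4, n_e_in_b=6))
```

With these placeholder numbers the per-residue areas are equal, so the sketch reports an equimolar A:B ratio.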