Directed-evolution mutations enhance DNA-binding affinity and protein stability of the adenine base editor ABE8e

MD simulations of full-length ABEs

It is well established that enzymatic activity is regulated by many factors, including conformational dynamics and substrate-binding affinity [45,46,47,48]. Enzymatic stability also plays a key role in preserving activity under varying environmental conditions [49]. To investigate the effects of the eight directed-evolution mutations on the conformational dynamics of TadA8e in ABE8e and on its interaction with DNA, we first constructed two simulation systems representing the functional forms of ABE8e and ABE7.10, each with an additional, dimerized TadA. The systems were designated TadA8e:TadA8e‒Cas9n and TadA7.10:TadA7.10‒Cas9n, respectively (Fig. 1C, right panel). For comparison, two corresponding virtual ABE systems without the dimerized TadA were also constructed: TadA8e‒Cas9n and TadA7.10‒Cas9n, respectively (Fig. 1D).

To construct these systems, we first used the cryo-EM structure (PDB ID: 6VPC) as the protein template to build the full-length atomic model of TadA8e:TadA8e‒Cas9n. However, the 32-aa peptide linker between Cas9n and TadA8e and several NTS bases are missing in the cryo-EM structure. To complete the coordinates of these atoms, we compared the structure with all available Cas9‒sgRNA‒DNA ternary complexes in the PDB to identify the active SpCas9 structures whose sgRNA‒DNA conformations are similar to that of 6VPC [50, 51]. The NTS of the most similar structure (PDB ID: 5F9R) was then used as the template to complete the missing NTS bases in 6VPC (Fig. S1A). We next constructed the 32-aa peptide linker using SWISS-MODEL (https://swissmodel.expasy.org) with 6VPC as the protein template (Fig. S1B) and thus obtained the full-length model consisting of sgRNA, DNA and TadA8e:TadA8e‒Cas9n (Fig. S1C). It should be noted that one model was also predicted directly by AlphaFold2 from the amino acid sequence (Fig. S1B). However, this model differed significantly from 6VPC and was therefore not used. We then introduced point mutations into the two TadA8e structures of TadA8e:TadA8e‒Cas9n using the mutagenesis plugin in PyMOL to generate the full-length model of TadA7.10:TadA7.10‒Cas9n. By removing the dimerized TadAs from these two models, the models for TadA7.10‒Cas9n and TadA8e‒Cas9n were constructed. Finally, energy minimization was performed to refine the atomic coordinates of all four models.

Using the above full-length models, four simulation systems were constructed as described in Materials and Methods. Each system comprised approximately 630,000 atoms. To obtain equilibrated structures for the subsequent analyses, all-atom MD simulations of approximately 500 ns were performed for each system. To assess whether the systems were equilibrated, we calculated the root mean square deviations (RMSDs) of the ABE backbone atoms using their initial structures as the references. As shown in Fig. S2, the RMSDs of the four systems showed no significant changes after about 150 ns of simulation, indicating that the simulated ABE complexes were equilibrated. Therefore, the structures of the four systems sampled after the first 150 ns were used for the analyses.
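The equilibration check described above can be sketched as follows. This is a minimal illustration and not the authors' analysis pipeline: it assumes each frame has already been superimposed on the reference structure, and the helper names `rmsd` and `is_plateaued` as well as the 0.5 Å tolerance are hypothetical choices made for the example.

```python
import math

def rmsd(coords, ref):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    Assumes the frame has already been superimposed on the reference.
    """
    if len(coords) != len(ref) or not ref:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for p, q in zip(coords, ref) for a, b in zip(p, q))
    return math.sqrt(squared / len(ref))

def is_plateaued(series, start, tol=0.5):
    """True if every value from index `start` onward lies within `tol`
    of the mean of that tail (a crude, hypothetical plateau criterion)."""
    tail = series[start:]
    if not tail:
        raise ValueError("no frames after the chosen start index")
    tail_mean = sum(tail) / len(tail)
    return all(abs(value - tail_mean) <= tol for value in tail)
```

In practice the RMSD series would come from the trajectory itself, and `start` would correspond to the frame at roughly 150 ns.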

The structure of TadA8e is more stable than that of TadA7.10

As shown in Fig. S1C, only the first TadA, which is fused to Cas9n, is responsible for the deamination, whereas the second, dimerized TadA does not directly interact with the DNA substrate [23]. Therefore, we mainly analysed the conformational dynamics of the Cas9n-fused TadA and its interactions with the DNA substrate in TadA8e:TadA8e‒Cas9n and TadA7.10:TadA7.10‒Cas9n. First, we investigated the conformational stability of the fused TadA7.10 and TadA8e by projecting their free energy landscapes onto two coordinates: RMSD and radius of gyration (Rg). As shown in Fig. 2A, the fused TadA7.10 in TadA7.10:TadA7.10‒Cas9n possesses three deep energy wells, suggesting that TadA7.10 adopts multiple dominant conformations. In contrast, the fused TadA8e in TadA8e:TadA8e‒Cas9n has only one deep well, suggesting that TadA8e adopts a single conformation that is more stable than those of the fused TadA7.10. This suggests that the eight directed-evolution mutations enhance the conformational stability of TadA8e.
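A free energy landscape of this kind is commonly obtained by Boltzmann inversion of a two-dimensional histogram, F = −kB T ln P. The sketch below illustrates that idea under stated assumptions: the bin width, the function name `free_energy_landscape` and the choice to set the most populated bin to zero are all hypothetical. The well depths quoted in the text are reported against a different zero, so only relative depths would be comparable.

```python
import math
from collections import Counter

def free_energy_landscape(rmsd_values, rg_values, bin_width=0.1):
    """Return {(rmsd_bin, rg_bin): F} with F in units of kB*T.

    F = -ln(P / P_max), so the most populated bin has F = 0 and
    every other bin has F > 0 (assumed convention for this sketch).
    """
    if len(rmsd_values) != len(rg_values) or not rmsd_values:
        raise ValueError("need two non-empty series of equal length")
    counts = Counter(
        (math.floor(r / bin_width), math.floor(g / bin_width))
        for r, g in zip(rmsd_values, rg_values)
    )
    peak = max(counts.values())
    return {bin_index: -math.log(n / peak) for bin_index, n in counts.items()}
```

A system with one dominant conformation gives a single low-F basin, whereas several comparably populated conformations give several basins of similar depth.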

Fig. 2

The conformational states of TadA7.10 and TadA8e in the MD simulations. (A) Free energy landscapes of TadAs projected onto RMSD of the protein backbone atoms and radius of gyration (Rg), where kB is the Boltzmann constant and T is the simulation temperature (310 K). (B) Time-dependent distances and corresponding distributions of the Cα atom of TadA E59 at the active site to the C4’ atom of the substrate A26. (C) The initial and equilibrium conformations of TadA. The distances from Cα of TadA E59 to C4’ of the substrate A26 are shown as blue dotted lines

Similar to dimeric TadA8e, monomeric TadA8e in TadA8e‒Cas9n also has one deep well of -14.1 kBT, which is comparable to that of dimeric TadA8e (~ -14.8 kBT), indicating that the conformational stability of TadA8e is similar in the monomeric and dimeric forms (Fig. S3). In contrast, monomeric TadA7.10 in TadA7.10‒Cas9n has two deep wells of about -12.1 kBT, which is higher (i.e., shallower) than that (-14.5 kBT) in the dimeric form, suggesting that the second TadA7.10 stabilises the first TadA7.10 fused to Cas9n. Again, the monomeric TadA7.10, which lacks the eight directed-evolution mutations, is more flexible than TadA8e.

To explore the effects of the TadA structural stability on substrate recruitment, we calculated the distances between the residue E59 in the active sites of TadAs and base A26 of the DNA substrate (Fig. 2B). In TadA8e‒Cas9n, TadA8e:TadA8e‒Cas9n and TadA7.10:TadA7.10‒Cas9n, the average distances from the Cα atom of E59 to the C4’ atom of A26 were very similar at ~ 14.1 Å, ~ 15.7 Å and ~ 14.9 Å, respectively (Fig. 2B-C). However, the corresponding distance in TadA7.10‒Cas9n was as large as 19.2 Å, significantly larger than that in TadA7.10:TadA7.10‒Cas9n (Fig. 2B-C). This suggests that the structural stability of TadAs is important for their binding to the DNA substrate, and that the dimeric form is required to stabilise the fused TadA7.10 for its tighter binding to the DNA substrate.
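The averaged distance used here can be sketched as below. This is an illustrative helper only: `mean_distance` is a hypothetical name, and the inputs are assumed to be per-frame (x, y, z) coordinates, in Å, of the two atoms of interest (for example the Cα atom of E59 and the C4’ atom of A26) that were extracted from the trajectory beforehand.

```python
import math
from statistics import mean, stdev

def mean_distance(traj_a, traj_b):
    """Mean and sample SD of the per-frame distance between two atoms.

    traj_a, traj_b: equal-length lists of (x, y, z) tuples, one per frame.
    """
    if len(traj_a) != len(traj_b) or not traj_a:
        raise ValueError("need two non-empty trajectories of equal length")
    distances = [math.dist(a, b) for a, b in zip(traj_a, traj_b)]
    spread = stdev(distances) if len(distances) > 1 else 0.0
    return mean(distances), spread
```

Restricting the input to frames from the equilibrated part of the trajectory keeps the average from being skewed by the initial relaxation.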

TadA8e has a higher binding affinity for DNA than TadA7.10

To test the above hypothesis, we used the g_mmpbsa program [40] to calculate the energies for the TadA binding to the substrate DNA in the four systems. It is well known that the relative dielectric constant of the solute is a key parameter in the MM/PBSA calculation of g_mmpbsa [40, 43]. Considering that the investigated ABEs are protein-DNA systems, to rationalise the comparison, we used four different solute dielectric constants (i.e., 2, 4, 6 and 8) in the calculations, where the minimum 2 is commonly used for low-polarised proteins [43, 52], and the maximum 8 is the DNA dielectric constant [53]. The calculated binding energies of TadA to the ssDNA substrate (i.e., non-target strand, NTS) of the four simulated systems are shown in Fig. S4.

Given that the MM/PBSA calculation is well suited to ranking binding affinities, which depends on relative rather than absolute values [43], we set the absolute energy value of TadA‒Cas9n as a unit of 1.0 and then converted the energies of the other systems in Fig. S4 into ratios of this unit. Using the dielectric constant 6 as an example, for either TadA7.10 or TadA8e the energy ratios indicate that the binding affinities of TadA:TadA‒Cas9n and TadA‒Cas9n are comparable (Fig. 3A), suggesting that dimerization does not significantly affect substrate binding. However, when comparing TadA7.10 with TadA8e, we found that the binding energies of TadA8e are 1.9–8.9 times those of TadA7.10 at all four dielectric constants (Fig. S4), strongly supporting that TadA8e has a higher binding affinity for the DNA substrate than TadA7.10. Thus, TadA8e appears to be able to bind the DNA substrate in a monomeric form. Again, the eight directed-evolution mutations also enhance the DNA-binding capability of TadA8e, thereby enabling it to bind firmly to the substrate.
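The normalisation described above amounts to dividing every binding energy by the magnitude of the reference energy. A minimal sketch follows. The function name `energy_ratios` is hypothetical, and the numbers in the example are placeholders chosen for illustration, not values from Fig. S4.

```python
def energy_ratios(binding_energies, reference):
    """Scale each binding energy by |E_reference| so the reference has magnitude 1.

    binding_energies: dict mapping system name -> binding energy.
    reference: key of the system used as the unit.
    """
    unit = abs(binding_energies[reference])
    if unit == 0:
        raise ValueError("reference energy must be non-zero")
    return {name: energy / unit for name, energy in binding_energies.items()}

# Placeholder energies (kcal/mol), NOT values from the paper.
example = {"TadA-Cas9n": -50.0, "TadA:TadA-Cas9n": -55.0}
ratios = energy_ratios(example, "TadA-Cas9n")
```

Because only the ratio is retained, the sign of each energy is preserved while the reference system is fixed at a magnitude of 1.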

Fig. 3

The binding energies and surface electrostatic potentials of the TadA-DNA complexes. (A) The binding energy ratios of TadA to DNA substrate, using the absolute values of TadA-Cas9n as the unit of 1. Data are presented as the mean ± SD. (B) The energy terms of TadA with the DNA substrate using the absolute values of the VDW energy of TadA7.10:TadA7.10 as the unit of 1. Data are presented as mean ± SD. (C) The surface electrostatic potentials of TadA calculated by APBS, on a scale from −3 to 3 kBT/ec. Red represents negative electrostatic potential, blue represents positive electrostatic potential, and white is neutral

To determine the main forces driving the TadA8e binding to DNA, we further analysed the components of the binding energies. As shown in Fig. 3B and S5A, the binding is mainly driven by the electrostatic attractions. The improvement in the binding energy of TadA8e is attributed to the enhanced electrostatic attractions. In order to elucidate the structural basis for the difference in electrostatic interactions between TadA8e and TadA7.10, we calculated their surface electrostatic potentials using the APBS program [28].

As illustrated in Fig. 3C and S5B, the surface potentials indicate that the DNA binding region of TadA8e exhibits a higher positive charge density than that of TadA7.10. In TadA7.10, the loop comprising residues 157‒167 is far from the active site, whereas in TadA8e it is closer to the active site. This results in the segment R153‒K161 upstream of this loop forming a continuous positively charged surface with the region around the active site, thereby increasing the positively charged area in the DNA binding region. Compared to TadA7.10, the electrostatic potential in the DNA binding region of TadA8e is more attractive to DNA, thereby increasing the binding affinity.

Directed-evolution mutations increase positive charge density of DNA binding region

Given that TadA8e was obtained by mutating eight residues of TadA7.10, it can be reasonably assumed that the higher DNA-binding affinity of TadA8e is a consequence of these mutations. To quantitatively characterise the contributions of the mutations to the DNA binding, we calculated the binding energy difference (ΔΔG) for each amino acid between TadA7.10 and TadA8e. As an illustrative example, the results for a dielectric constant of 6 are presented in Fig. 4A and Fig. S6A. For the first TadA fused to Cas9n, T111R, D119N and D167N are the mutations with the most significant energy contributions (ΔΔG < -20 kcal/mol). Once again, electrostatic interactions are the main component of the energy contributions. The mutations from D to N at residues 119 and 167 eliminate the electrostatic repulsions to DNA, and the positively charged residue R111 could directly increase the electrostatic attractions to DNA (Table 1).
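The per-residue comparison can be sketched as a simple difference of two per-residue energy maps followed by a threshold. The sign convention below (ΔΔG = TadA8e minus TadA7.10, so negative values favour TadA8e) is an assumption chosen to be consistent with the ΔΔG < -20 kcal/mol criterion. The function names are hypothetical, and the numbers used in the test are placeholders rather than values from Table 1.

```python
def ddg_per_residue(contrib_8e, contrib_710):
    """ΔΔG per position, assumed here as ΔG(TadA8e) - ΔG(TadA7.10).

    Both arguments map residue position -> per-residue binding energy
    contribution (kcal/mol). Only positions present in both maps are compared.
    """
    return {
        pos: contrib_8e[pos] - contrib_710[pos]
        for pos in contrib_8e
        if pos in contrib_710
    }

def key_positions(ddg, cutoff=-20.0):
    """Positions whose ΔΔG is below the cutoff, in ascending residue order."""
    return sorted(pos for pos, value in ddg.items() if value < cutoff)
```

Positions returned by `key_positions` are those where the mutated residue contributes substantially more favourable binding energy than its TadA7.10 counterpart.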

Fig. 4

Identification of key mutations for the DNA binding. (A) ΔΔG per residue between TadA7.10 and TadA8e in the TadA:TadA‒Cas9n systems. (B) The surface electrostatic potentials of the mutations. Mutation residues are shown as yellow sticks. Red represents negative electrostatic potential, blue represents positive electrostatic potential, and white is neutral. The distances from the Cα atom of residue 119 to the Cα atom of residue 167 are shown as green dotted lines. (C) Time-dependent distances and corresponding distributions from the Cα atom of residue 119 to the Cα atom of residue 167

Table 1 Energy contributions of the eight directed-evolution mutations to the NTS binding (kcal⋅mol−1)

Similarly, for the second TadA, which dimerizes with the first, T111R and D119N also have the largest contributions (Fig. 4A). Somewhat differently, N119 in TadA8e contributes the most, probably due to the trans-dimerization of two TadA8e proteins, with the second N119 remaining close to N119 of the first TadA8e (Fig. S6B). Consequently, N119 in the second TadA8e is also in close proximity to the DNA substrate, and the mutation from D to N is beneficial in eliminating the repulsive electrostatic interactions. In contrast, R111 in the second TadA8e is slightly farther from the DNA substrate, and its energy contribution is slightly lower than that of R111 in the first TadA (Fig. S6B). It is likely that these mutations in both TadA8e proteins could enhance the DNA binding.

We next examined the effects of the mutations on the surface distributions of the electrostatic potentials of TadA8e. As shown in Fig. 4B, R111 in TadA8e increases the positively charged area within the DNA binding region, highlighting its key role in DNA binding. In TadA7.10, electrostatic repulsion between D119 and D167 results in the peptide of R153–K161 moving away from the active site, thereby disrupting the continuous surface of positive potentials (Fig. 4B). In TadA8e, D119 and D167 have been mutated to N, and the repulsions between them are eliminated, allowing N167 to approach N119. This then causes the peptide of R153–K161 and R111 to form a continuous surface of positive potentials, expanding the positively charged area in the DNA-binding region of TadA8e.

To further confirm that the distance between residues 119 and 167 in TadA7.10 is greater than that in TadA8e, we calculated these distances in TadA7.10 and TadA8e, respectively. As anticipated, the distances between residues 119 and 167 in the two simulated systems of TadA7.10 were 27.8 Å and 30.3 Å, respectively, which are considerably larger than those in the corresponding systems of TadA8e (14.6 Å and 17.6 Å, respectively) (Fig. 4C), indicating that these two residues lie much closer together in TadA8e.

Experimental verifications of key mutations on binding and protein stability

To verify the key roles of T111R, D119N and D167N in enhancing the DNA-binding affinity, we constructed a TadA7.10 mutant with R111/N119/N167 and designated it as TadA7.10-3mut. Subsequently, the TadA7.10 and TadA7.10-3mut proteins were expressed and purified, and microscale thermophoresis (MST) measurements were performed to determine their DNA-binding affinities. Given that TadA acts only on ssDNA [23], to test whether the NTS in the structure 6VPC is single-stranded, we first predicted its secondary structure via the RNAStructure website (http://rna.urmc.rochester.edu/RNAstructureWeb/) [54], and found that the NTS may form a base-paired structure (Fig. S7A). Therefore, five bases at either end of the NTS were mutated to obtain a 19-nt ssDNA (TTCTCTTCCACTTTCTTTT) as the substrate for the MST measurements. The fluorescence intensities of all capillaries differed by no more than 10%, ranging from 1200 to 1320 (Fig. S7B). Data points outside this range were excluded.

As shown in Fig. 5A, the MST results indicated that the equilibrium dissociation constants (Kd) of TadA7.10 and TadA7.10-3mut were 106.1 and 38.5 nM, respectively. This suggests that the DNA-binding affinity of TadA7.10-3mut is approximately 2.8 times that of TadA7.10, which provides strong evidence that the residues R111/N119/N167 in TadA8e could increase its DNA-binding affinity. To further assess the DNA-binding effects of R111, N119 and N167 on editing activity, we performed an in vivo reversion mutation experiment. Four ABE8e variants were constructed as follows: ABE8e-R111T, ABE8e-N119D, ABE8e-N167D and ABE8e-N119D-N167D. Then, we used the generated mutants to target three gene loci in HEK293T cells, after which Sanger sequencing and EditR analysis were performed [55].
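The fold change in affinity quoted for the MST data follows directly from the two Kd values, and the same ratio can be converted into a binding free energy difference via ΔΔG = RT ln(Kd,variant / Kd,reference). The sketch below is a worked check only. The temperature of 298.15 K is an assumption made for the example (the measurement temperature is not stated in this section), and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def affinity_fold_change(kd_reference_nM, kd_variant_nM):
    """How many times tighter the variant binds than the reference."""
    return kd_reference_nM / kd_variant_nM

def ddg_from_kd(kd_reference_nM, kd_variant_nM, temperature_K=298.15):
    """ΔΔG = RT ln(Kd_variant / Kd_reference) in kcal/mol.

    Negative values mean the variant binds more tightly.
    The default temperature is an assumed value, not taken from the paper.
    """
    return R_KCAL * temperature_K * math.log(kd_variant_nM / kd_reference_nM)

fold = affinity_fold_change(106.1, 38.5)  # TadA7.10 vs TadA7.10-3mut
ddg = ddg_from_kd(106.1, 38.5)
```

Under the assumed temperature, a roughly 2.8-fold lower Kd corresponds to a ΔΔG of about -0.6 kcal/mol.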

Fig. 5

Experimental verification of the key mutations. (A) MST measurements for TadA7.10 and TadA7.10-3mut (R111/N119/N167). The Kd values of TadA7.10 and TadA7.10-3mut are 106.1 and 38.5 nM, respectively. Data are presented as mean ± SD of three independent experiments. (B) Base editing efficiencies for ABE8e, ABE8e-R111T, ABE8e-N119D, ABE8e-N167D, ABE8e-R111T-N119D and ABE8e-D147Y at three genomic sites in HEK293T cells. The target As and Ts are shown in red, with a subscripted number indicating their position relative to the PAM (the NGG PAM is counted as +21 to +23), and the PAM sequence is shown in blue. Editing efficiencies were analyzed by Sanger sequencing and EditR calculation. For all plots, dots represent individual biological replicates and bars represent the mean ± SD of three independent biological replicates

The reversion mutation results indicated a significant reduction in the activity of ABE8e-R111T (Fig. 5B), consistent with a previous report that its deamination rate is comparable to that of ABE7.10 [23]. Meanwhile, the activity of ABE8e-N167D was almost unchanged, and that of ABE8e-N119D decreased slightly (Fig. 5B). However, when both N119 and N167 were mutated back to the residues in ABE7.10, the activity of ABE8e-N119D-N167D decreased dramatically (Fig. 5B). These results suggest a coupled effect between these two residues, in agreement with our calculations. We also tested the effect of D147Y and found that the activity of ABE8e-D147Y was only minimally affected (Fig. 5B). Although the mutation from Y147 to D is energetically unfavourable, the single mutation appeared to have a limited effect on the activity, suggesting a potential coupling role with other mutations.

The computational results presented in Fig. 2 indicate that the mutations affect the stability of TadA8e and ABE8e. To investigate this further, thermal shift assays were conducted using nanoscale differential scanning fluorimetry (nanoDSF) [44] to measure the melting temperatures (Tm) of the purified TadA7.10, TadA8e, miniABEmax and ABE8e proteins. The UV absorption peaks of size exclusion chromatography indicated that the molecular weights of the TadA proteins were in the range of 29.0–44.0 kDa. Given that the molecular weight of the TadA monomer is approximately 18.7 kDa, this suggests that the TadA proteins form dimers (Fig. 6A-B). Thermal shift assays showed that the Tm of TadA8e is ~ 83.0 °C, while that of TadA7.10 is ~ 71.1 °C. Similarly, the Tm of ABE8e is ~ 69.0 °C, while that of miniABEmax is ~ 59.9 °C (Fig. 6C). Thus, relative to TadA7.10 and miniABEmax, the Tm of TadA8e and ABE8e is increased by ~ 12 °C and ~ 9 °C, respectively. This provides experimental evidence that TadA8e and ABE8e have higher thermostability than TadA7.10 and miniABEmax. We thus unexpectedly found that the eight directed-evolution mutations also significantly improve the thermal stability of TadA8e and ABE8e. This implies that the mutations also optimised the physicochemical properties of the proteins, which are likewise important for the enzymatic activity of ABE8e.
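The two pieces of arithmetic in this paragraph (the Tm shifts and the inference of a dimer from the apparent molecular weight) can be checked with a short sketch. The helper names are hypothetical, and rounding the mass ratio to the nearest integer is a simplifying assumption: apparent masses from size exclusion chromatography also depend on molecular shape.

```python
def delta_tm(tm_variant_C, tm_reference_C):
    """Shift in melting temperature (°C) of a variant relative to a reference."""
    return tm_variant_C - tm_reference_C

def nearest_oligomer(apparent_mw_kDa, monomer_mw_kDa):
    """Closest integer number of subunits for an apparent molecular weight."""
    if monomer_mw_kDa <= 0:
        raise ValueError("monomer molecular weight must be positive")
    return max(1, round(apparent_mw_kDa / monomer_mw_kDa))

tada_shift = delta_tm(83.0, 71.1)  # TadA8e vs TadA7.10, about 12 °C
abe_shift = delta_tm(69.0, 59.9)   # ABE8e vs miniABEmax, about 9 °C
```

With a monomer mass of 18.7 kDa, both ends of the 29.0–44.0 kDa range round to two subunits, which is consistent with the dimer interpretation given in the text.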

Fig. 6

Thermal stability of TadA7.10, TadA8e, miniABEmax and ABE8e. (A) The UV absorption peaks of size exclusion chromatography for TadA7.10 and TadA8e. (B) The SDS-PAGE of TadA7.10, TadA8e and TadA7.10-3mut purified from E. coli Rosetta (DE3) cells. (C) First-order derivatives of the nano-differential scanning fluorimetry (nanoDSF) curves for TadA and ABE. The curves show that TadA7.10:TadA7.10 has a Tm of 71.1 °C, TadA8e:TadA8e has a Tm of 83.0 °C, miniABEmax has a Tm of 59.9 °C and ABE8e has a Tm of 69.0 °C
