The two most common classes of base editors are modular fusion proteins composed of a deaminase and a DNA-binding protein that perform targeted and precise C·G-to-T·A (cytosine base editors (CBEs)) or A·T-to-G·C (adenine base editors (ABEs)) base conversions with high efficiency1,2. In addition to correcting disease-causing genetic mutations, base editors have shown promise as an alternative to nucleases for multiplex gene knockout applications. CBEs can precisely edit arginine, glutamine or tryptophan codons to generate premature stop codons; because this process does not involve the formation of DNA double-strand breaks, base editing for multiplex gene knockouts minimizes the risk of genomic translocations, cell toxicity and DNA chromothripsis3,4. However, unintended indels and impure base editing byproducts (for example, C-to-G and C-to-A) are still frequently observed when using CBEs. An impure edit could be detrimental because it may act as a missense mutation in a gene otherwise targeted for knockout. In this regard, an ideal CBE for gene knockout should exhibit high efficiency, low indel formation, low off-target editing, an expanded editing window to reach more potential bases and, importantly, high product purity for safety purposes.
Because canonical base editors are composed of different proteins fused together end-to-end, each protein’s orientation may not be optimal for balancing the steps of base editing, including nontarget strand association/dissociation by the deaminase, target base deamination, base protection/exposure and, ultimately, processing by endogenous cellular DNA repair enzymes. Previous engineering studies have demonstrated that either the use of an inlaid deaminase domain or a circularly permuted Cas protein could affect the on-target or off-target editing efficiencies, product purity or editing windows of base editors5,6,7,8,9. However, the combination of these parameters has not been optimized using any one approach. To obtain a more ideal base editor, we hypothesized that we could treat the deaminase and Cas9 protein together as one complex composed of different domains. Therefore, by shuffling the orientation of domains from both the deaminase and Cas9 together, which combines the concepts of circularly permuted proteins and inlaid deaminases, we hoped to identify a CBE that maximizes the beneficial properties of CBEs for gene knockout.
We first designed four base editor orientations in which a deaminase was internally embedded within a circularly permuted Cas9(D10A) protein10 (Fig. 1a) based on Cas9 positions previously found to be amenable for circular permutation or for inlaying a small peptide10,11. These architectures are hereby designated as Q base editors (QBE1 to QBE4) to reflect the reconstitution process of circularizing domains from the Cas9 protein and inserting a deaminase internally to generate a new start codon (circle with an internal cut resembling the letter ‘Q’). To characterize the properties of different QBE architectures, six different cytidine deaminases (rAPOBEC1 (ref. 1), hA3A, mini-Sdd3, mini-Sdd6, Sdd7 and mini-Sdd9 (ref. 12)) were evaluated.
a, Schematic representations of the BE4max, QBE1, QBE2, QBE3 (QBEmax) and QBE4 editors. Numbers under and above nCas9(D10A) represent amino acid positions in reference to wild-type SpCas9. b, Frequencies of C-to-T or C-to-R conversions (left y axis) and indels (right y axis) induced by mini-Sdd9-BE4max and mini-Sdd9-QBEmax; values and error bars represent the means and s.e.m. for three independent biological replicates. c, Gray scale heat map showing average cytosine base editing frequencies by mini-Sdd9-BE4max and mini-Sdd9-QBEmax at each protospacer position across 17 endogenous sites tested. Numbers below the heat map indicate protospacer positions. d, AlphaFold3-predicted structures of mini-Sdd9-BE4max and mini-Sdd9-QBEmax bound to a sgRNA and DNA target (site 1). Blue, Cas9 domains; orange, sgRNA; purple, dsDNA target; pink, mini-Sdd9. e–g, Average editing frequencies of C-to-T conversions (e), indels (f) and ratio of base edit-to-indel (g) induced by the mini-Sdd9-BE4max or mini-Sdd9-QBEmax editors across 17 endogenous sites; each dot represents the mean for three independent biological replicates for a specified target site, the violin plot shows the base editing frequency distribution with medians and quartiles, and significances are indicated between the mini-Sdd9-BE4max and mini-Sdd9-QBEmax by exact P value using two-tailed Student’s t-test, n = 17. h, Percent of edited reads with C-to-T conversions among edited events at each C1 to C16 base position, cumulated across 17 endogenous sites; values and error bars represent the means and s.e.m. for three independent biological replicates. CMV pro, enhanced cytomegalovirus promoter; DEA, deaminase; NLS, nuclear localization signals; bGH, bovine growth hormone polyadenylation signal.
Plasmids encoding corresponding CBEs were transfected in HEK293T cells and compared with the canonical BE4max architecture13 at two endogenous genomic sites. Deep sequencing revealed that QBE3-based editors showed comparable or higher editing frequencies for four of the six deaminases evaluated; in contrast, QBE1-based, QBE2-based and QBE4-based editors exhibited lower editing frequencies (Supplementary Fig. 1a,b). Compared to the BE4max counterparts, the QBE3-based editors showed substantially lower indels, with an average indel reduction of 60.4% for site 1 and 62.6% for site 3 (Supplementary Fig. 1c,d). We next measured the edit-to-indel ratios and found that six of the seven editors at site 1, and five of the seven editors at site 3 showed substantially higher edit/indel ratios (Supplementary Fig. 1e,f). We then analyzed the editing window and product purities of QBE editors. We observed higher editing efficiencies at PAM-proximal Cs for QBE3 editors and greatly improved product purities compared to BE4max editors (Supplementary Fig. 2a,b). Based on these, we hereby refer to QBE3 as QBEmax (Fig. 1a).
From initial evaluations, we found that mini-Sdd9-based QBE editors exhibited superior editing properties in terms of editing activity, indel formation and product purity. Using mini-Sdd9, we designed seven additional QBEs (QBE5–QBE11; Supplementary Fig. 3a). We found that only one editor, QBE6, demonstrated similar performance to QBEmax in terms of editing efficiency and purity; however, its editing window appeared narrow, so we hereby designate it as QBEn (Supplementary Fig. 3b,c). Because of mini-Sdd9-QBEmax’s overall superior performance and relatively wide editing window, which is desirable for expanding the targeting scope of gene knockout applications with base editors, we selected it for further study.
It was reported that fusing the deaminase to the N terminus of a circularly permuted Cas9 (referred to as CP-BE)5 could broaden the editing window, and inlaying the deaminase within the Cas9 protein (referred to as inlaid-BE)6,7,8,9 could affect the editing efficiency or off-target efficiency. We next compared mini-Sdd9-QBEmax with CP-BEs and inlaid-BEs with Cas9 permutation or deaminases inlaid at positions used in the QBE1–QBE11 editors. We found that all CP-BEs and inlaid-BEs induced more impure products than QBEmax. Notably, QBEmax (which uses positions 1,031 and 1,244 for modular assembly) outperforms individual CP-1031 and inlaid-1244 CBEs when comparing the combination of editing frequencies, indel formation and product purities (Extended Data Fig. 1), suggesting that the modularly designed QBEmax architecture enhances desired properties from each functional domain.
To further profile the editing properties of mini-Sdd9-QBEmax, we compared mini-Sdd9-QBEmax with mini-Sdd9-BE4max across 17 endogenous genomic sites in HEK293T cells. We found that, in contrast to mini-Sdd9-BE4max, which biases editing toward the PAM-distal region, mini-Sdd9-QBEmax showed a wider editing window extending as far as C16 (Fig. 1b and Supplementary Fig. 4). Aggregate analyses revealed a ‘forward-shifted’ and wider editing window for mini-Sdd9-QBEmax, with target Cs between positions 4 and 14 being favored (Fig. 1c). We used AlphaFold3 (ref. 14) to predict a mini-Sdd9-QBEmax–sgRNA–target DNA ternary structure and compared it with that of the corresponding BE4max architecture (Fig. 1d). In the predicted structures, the deaminase in mini-Sdd9-QBEmax is more closely associated with PAM-proximal Cs, whereas in BE4max it is more closely associated with PAM-distal Cs, which is consistent with the experimental results from genomic edits.
The average editing frequencies across 17 genomic sites for mini-Sdd9-QBEmax and mini-Sdd9-BE4max were 52.4 ± 2.4% and 54.5 ± 2.2%, respectively (Fig. 1e). Mini-Sdd9-QBEmax induced lower indels at 16 of 17 sites tested, with average indel frequencies decreasing by 56.5% from 2.8 ± 0.3% to 1.2 ± 0.2% (Fig. 1f and Supplementary Fig. 4), which also substantially increased average edit-to-indel ratios (Fig. 1g). We next evaluated cytosine base editing product purity, calculated as the proportion of edited ‘C’ converted to ‘T’ as opposed to ‘G’ or ‘A’, for each position within the protospacer, aggregating all sites together. Importantly, mini-Sdd9-QBEmax exhibited superior product purities (99.4% ± 0.4%) at all positions within a C1–C16 editing window compared to that of mini-Sdd9-BE4max (95.5% ± 2.8%; Fig. 1b,h and Supplementary Fig. 4). To test the versatility of QBEmax, we evaluated mini-Sdd9-QBEmax in additional mammalian cell lines, including A549, HeLa and HCT116. QBEmax exhibited higher or comparable editing efficiencies, improved product purities and decreased indels in all cell lines tested (Extended Data Fig. 2). These results show that QBEmax achieves efficient and precise base edits across a flexible editing window with minimal indels and byproducts.
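The product purity definition above can be written out as a short calculation. The following is a minimal sketch, assuming per-position read counts for each edit outcome are already available; the function name `product_purity` and the read counts are hypothetical and are not taken from the study's analysis pipeline.

```python
def product_purity(c_to_t: float, c_to_g: float, c_to_a: float) -> float:
    """Fraction of edited reads at one cytosine that carry the intended C-to-T edit.

    Inputs are read counts (or frequencies) for each outcome at a single
    protospacer position; unedited reads are excluded from the denominator.
    """
    edited = c_to_t + c_to_g + c_to_a
    if edited == 0:
        raise ValueError("no edited reads at this position")
    return c_to_t / edited


# Hypothetical read counts (C-to-T, C-to-G, C-to-A) for two protospacer positions.
counts = {"C5": (480, 2, 1), "C7": (510, 0, 3)}
purity_by_position = {pos: product_purity(*c) for pos, c in counts.items()}

for pos, purity in purity_by_position.items():
    print(f"{pos}: {100 * purity:.1f}% C-to-T among edited reads")
```

Aggregating across sites, as described in the text, would amount to applying the same ratio to counts pooled per position.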
Chimeric antigen receptor-T cell (CAR-T) therapy has demonstrated success as a cancer immunotherapy for hematological malignancies. Many clinical trials and research studies have found that multiplex knockout of genes related to immune rejection and graft-versus-host disease (GvHD) would further benefit CAR-T therapy in terms of both durability and potency. Encouraged by the performance of mini-Sdd9-QBEmax, we next sought to perform multiplex gene knockout to simultaneously edit genes that could compromise the efficacy of CAR-Ts. Five genes were selected, PD-1 (ref. 15), CISH (ref. 16), Fas (ref. 17), TGFBR2 (truncating off the endodomain; refs. 18,19,20) and TRAC (ref. 21), all of which previously demonstrated potential in improving CAR-T performance when knocked out or downregulated.
We first identified all possible SpCas9 protospacers with an NGG PAM and target C located within codons encoding tryptophan (W), arginine (R) or glutamine (Q), so that a cytosine base edit would generate a stop codon (TAA, TAG or TGA). We obtained 46, 21, 16, 30 and 4 protospacers for PD-1, CISH, Fas, TGFBR2 and TRAC, respectively (Supplementary Table 1). We next filtered for potential off-target sites (mismatches ≤ 3) and ultimately selected 16 targets for PD-1, 9 for CISH, 10 for Fas, 10 for TGFBR2 and 3 for TRAC. Lastly, we included one additional target for CISH, PD-1 and TRAC, which disrupts a splice site to perform gene knockout as reported previously21,22 (Fig. 2a).
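The guide-selection logic described above can be illustrated with a minimal sketch. It makes several assumptions that the text does not state: the input is an uppercase, in-frame coding sequence (real designs work on genomic sequence and must handle exon boundaries), the target C is required to fall at protospacer positions 4 to 14 (the window reported as favored for mini-Sdd9-QBEmax), and off-target filtering is omitted. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq[::-1].translate(COMPLEMENT)


def stop_creating_cytosines(cds: str):
    """Find single C-to-T edits that turn an in-frame codon into a stop codon.

    Returns (index, strand) pairs. '+' means the C is on the coding strand
    (CAA, CAG, CGA); '-' means the C is on the opposite strand, which reads
    as G-to-A on the coding strand (TGG).
    """
    hits = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon in STOP_CODONS:
            continue
        for k, base in enumerate(codon):
            if base == "C":
                edited, strand = codon[:k] + "T" + codon[k + 1:], "+"
            elif base == "G":
                edited, strand = codon[:k] + "A" + codon[k + 1:], "-"
            else:
                continue
            if edited in STOP_CODONS:
                hits.append((i + k, strand))
    return hits


def find_guides(cds: str, window=(4, 14)):
    """List 20-nt protospacers with an NGG PAM that place a stop-creating C in the window.

    Returns (index, strand, protospacer_position, protospacer) tuples.
    """
    n = len(cds)
    guides = []
    for idx, strand in stop_creating_cytosines(cds):
        # Work on whichever strand carries the target C, read 5' to 3'.
        template = cds if strand == "+" else revcomp(cds)
        pos = idx if strand == "+" else n - 1 - idx
        for p in range(window[0], window[1] + 1):
            start = pos - (p - 1)
            if start < 0 or start + 23 > n:
                continue
            if template[start + 21:start + 23] == "GG":  # N-G-G PAM
                guides.append((idx, strand, p, template[start:start + 20]))
    return guides
```

For example, `stop_creating_cytosines("TGG")` reports both guanines of a tryptophan codon as opposite-strand targets, since editing either yields TAG or TGA.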
a, Schematic representations showing PD-1, CISH, Fas, TGFBR2 and TRAC genes. Light blue boxes indicate exons of genes, and short red lines represent the position of selected protospacers. b–d, Average editing frequencies of desired editing efficiencies (b), indels (c) and ratio of base edits to indels (d) induced by mini-Sdd9-BE4max and mini-Sdd9-QBEmax editors for PD-1 (n = 17), CISH (n = 10), Fas (n = 10), TGFBR2 (n = 10) and TRAC (n = 4) genes; each dot represents the mean for three independent biological replicates for a specified target site and the violin plot shows base editing frequency distribution with medians and quartiles. e, Percent of edited reads with C-to-T or C-to-R conversions at target genes indicated with values and error bars representing the means and s.e.m. for three independent biological replicates across all target sites for PD-1 (n = 17), CISH (n = 10), Fas (n = 10), TGFBR2 (n = 10) and TRAC (n = 4) genes; significances are indicated between BE4max and QBEmax by exact P value using two-tailed Student’s t-test. f, Schematic representations of potential editing outcomes induced by C-to-T, C-to-G and C-to-A conversions of tryptophan (W), arginine (R) and glutamine (Q). g–i, Desired editing efficiencies (g), indels (h) and percent of edited reads with C-to-T or C-to-R conversions at target genes indicated (i) induced by mini-Sdd9-BE4max and mini-Sdd9-QBEmax editors during multiplexed base editing of five genes; values and error bars represent the means and s.e.m., respectively, for four independent biological replicates. j, Single-cell colony analysis of multiplex base editing distributions in unsorted and sorted cell populations. k, Schematic representation of the experimental design for the R-loop assay. l, Frequencies of C-to-T conversions at the dSaCas9-induced R-loop sites; values and error bars represent the means and s.e.m. for four independent biological replicates. 
m, Number of C-to-U RNA variants induced by the editors indicated; values and error bars represent the means and s.e.m. for two (Cas9(D10A)) or four (BE4max and QBEmax) independent biological replicates, significances are indicated by exact P value using one-way ANOVA Tukey’s multiple comparisons.
HEK293T cells were transfected with mini-Sdd9-BE4max or mini-Sdd9-QBEmax together with each sgRNA plasmid. We then analyzed desired editing efficiencies (calculated as percent C-to-T for stop codon creation), indel frequencies, desired edit-to-indel ratios and product purities. We found that mini-Sdd9-QBEmax achieved comparable or slightly higher average desired editing at the target base for all sites aggregated for each of the five genes (Fig. 2b). Average indel frequencies for all sites aggregated decreased by 76%, 75%, 59% and 71% for PD-1, CISH, Fas and TGFBR2, respectively (Fig. 2c). The average indel frequency at TRAC was 0.8 ± 0.15% for mini-Sdd9-BE4max and 1.0 ± 0.34% for mini-Sdd9-QBEmax, owing to one outlier at TRAC-site 3, at which mini-Sdd9-BE4max and mini-Sdd9-QBEmax exhibited 1.4 ± 0.18% and 2.9 ± 0.22%, respectively. Cumulatively, the desired edit-to-indel ratios induced by mini-Sdd9-QBEmax were 3.26-, 3.99-, 2.02-, 9.91- and 1.99-fold higher than those of mini-Sdd9-BE4max (Fig. 2d). Importantly, average product purities at the target cytosine base for stop codon creation by mini-Sdd9-QBEmax and mini-Sdd9-BE4max were 99.7% versus 95.7% for PD-1, 99.7% versus 97.7% for CISH, 99.7% versus 96.5% for Fas, 99.8% versus 98.0% for TGFBR2 and 99.5% versus 95.8% for TRAC (Fig. 2e). This increase in product purity minimizes the formation of missense mutations from imprecise C-to-G or C-to-A edits (Fig. 2f). When analyzed individually, mini-Sdd9-QBEmax induced lower indels at 47 of the 51 target sites and higher product purities at 45 of the 51 target sites (Extended Data Figs. 3–7).
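The summary metrics in this comparison (percent indel reduction and fold change in the desired edit-to-indel ratio) follow from simple arithmetic. The sketch below shows one way to compute them; the function names and the input values are hypothetical, and whether the study averages ratios per site or takes a ratio of averages is not specified here.

```python
def percent_reduction(baseline: float, treated: float) -> float:
    """Percent decrease of `treated` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - treated) / baseline


def edit_to_indel_ratio(edit_pct: float, indel_pct: float) -> float:
    """Desired editing frequency divided by indel frequency."""
    if indel_pct <= 0:
        raise ValueError("indel frequency must be positive")
    return edit_pct / indel_pct


# Hypothetical frequencies (percent of reads) for one target site.
reference_editor = {"edit": 50.0, "indel": 2.0}
new_editor = {"edit": 50.0, "indel": 0.5}

indel_reduction = percent_reduction(reference_editor["indel"], new_editor["indel"])
fold_change = (
    edit_to_indel_ratio(new_editor["edit"], new_editor["indel"])
    / edit_to_indel_ratio(reference_editor["edit"], reference_editor["indel"])
)

print(f"indel reduction: {indel_reduction:.1f}%")
print(f"edit-to-indel ratio fold change: {fold_change:.2f}")
```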
For each gene, we identified one ideal guide and next edited PD-1, CISH, Fas, TGFBR2 and TRAC simultaneously for multiplex gene knockout. We transfected plasmids for each of the five sgRNAs together with QBEmax or BE4max editors into HEK293T cells. We observed that mini-Sdd9-QBEmax exhibited comparable or higher editing compared to mini-Sdd9-BE4max across all five sites in HEK293T cells in the absence of any selection pressure (Fig. 2g). Notably, mini-Sdd9-QBEmax achieved lower indel formation (Fig. 2h) and exhibited superior product purity (Fig. 2i) at all five genes. To validate that all five base edits occurred in a single cell, we sequenced 48 and 112 QBEmax-transfected single-cell colonies arising from unsorted or sorted cell populations, respectively. We found that 21 (43.8%) and 100 (89.3%) cell colonies exhibited all five genes edited, respectively, demonstrating successful multiplex base editing by QBEmax in a single cell (Fig. 2j). To further evaluate the potential of QBEmax, we co-electroporated QBEmax and all five sgRNA plasmids into an immortalized Jurkat T cell line. In these T cells, QBEmax also exhibited superior editing efficiencies, improved product purities and decreased indels, which is similar to its performance in HEK293T cells and further supports the versatility of QBEmax for safe and robust base editing in clinical applications (Extended Data Fig. 8).
Because DNA off-targets are a major concern for base editing therapeutic applications, we next evaluated Cas-independent DNA off-target effects of mini-Sdd9-QBEmax using the orthogonal R-loop assay23,24,25. We cotransfected a dead-SaCas9 (dSaCas9) and sgRNA to induce the formation of an orthogonal R-loop simultaneously with the multiplexed gene knockout strategy (Fig. 2k). We evaluated five orthogonal sites and deep sequencing at each orthogonal R-loop showed that mini-Sdd9-QBEmax induced lower Cas-independent off-target editing at all five R-loop sites compared to that of mini-Sdd9-BE4max (Fig. 2l). We next evaluated the RNA off-target effects of mini-Sdd9-QBEmax. We conducted whole transcriptome sequencing and analyzed the number of C-to-U variants in QBEmax, BE4max and nCas9 (D10A) treated samples together with a sgRNA plasmid targeting CISH. We found that QBEmax induced substantially lower RNA off-target edits on transcriptome-wide RNA transcripts compared to that of the BE4max without compromising DNA on-target editing (Fig. 2m and Supplementary Fig. 5). The robust desired editing efficiencies, minimized indels, high product purities and decreased DNA and RNA off-target effects portray QBEmax as an ideal base editor for multiplex gene knockout applications.
We next sought to probe the molecular basis of the desired properties of mini-Sdd9-QBEmax. We performed molecular dynamics (MD) simulation analyses based on the AlphaFold3-predicted ternary structures of mini-Sdd9-QBEmax or BE4max with a sgRNA and target DNA (Fig. 1d). With these models, all-atom MD simulations of approximately 300 ns were performed (Supplementary Fig. 6a,b). We first investigated the conformational stability of these two systems by projecting their free energy landscapes onto corresponding root mean square deviation (RMSD) and radius of gyration (Rg) components. We found that both the RMSD and Rg of mini-Sdd9-QBEmax were lower than those of mini-Sdd9-BE4max, and only a single stable energy state was observed (Fig. 3a,b and Supplementary Fig. 6b). This suggests that the QBEmax architecture better treats the deaminase and Cas protein as one complex in which each domain is compactly oriented. At the minimum energy state, while the deaminase was predicted to be associated with the nontarget strand in both systems (Supplementary Fig. 6c,d), the RMSD of the mini-Sdd9-QBEmax system was lower throughout the 300 ns MD process (Supplementary Fig. 6b). When analyzing individual amino acids, we observed that the root mean square fluctuation (RMSF) of the linkers connecting the deaminase to the Cas protein was lower in the QBEmax system than in the BE4max architecture (Fig. 3c,d). We speculate that the more compact QBEmax architecture limits the deaminase from sporadically swinging in space, thereby contributing to lower Cas-independent DNA off-target editing, lower indel formation and higher product purity.
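For readers unfamiliar with the trajectory metrics used here, the following minimal sketch shows how RMSD, Rg, RMSF and a two-dimensional free energy landscape are commonly defined. It is not the analysis code used in the study. It assumes frames are lists of (x, y, z) tuples that have already been superimposed on the reference, uses an unweighted Rg (MD packages usually mass-weight it), and estimates free energy as kT·ln(N_max/N) over histogram bins with kT set to 2.494 kJ/mol (about 300 K, an assumed temperature). All names and the bin width are hypothetical.

```python
import math
from collections import Counter


def rmsd(frame, ref):
    """Root mean square deviation between two pre-aligned coordinate sets."""
    sq = sum((a - b) ** 2 for p, q in zip(frame, ref) for a, b in zip(p, q))
    return math.sqrt(sq / len(frame))


def radius_of_gyration(frame):
    """Unweighted radius of gyration about the geometric center."""
    n = len(frame)
    center = [sum(p[k] for p in frame) / n for k in range(3)]
    sq = sum((p[k] - center[k]) ** 2 for p in frame for k in range(3))
    return math.sqrt(sq / n)


def rmsf(traj):
    """Per-atom root mean square fluctuation about each atom's mean position."""
    n_frames, n_atoms = len(traj), len(traj[0])
    values = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in traj) / n_frames for k in range(3)]
        sq = sum((f[i][k] - mean[k]) ** 2 for f in traj for k in range(3))
        values.append(math.sqrt(sq / n_frames))
    return values


def free_energy_landscape(rmsd_vals, rg_vals, bin_width=0.1, kT=2.494):
    """Relative free energy per (RMSD, Rg) bin; the most populated bin is 0."""
    counts = Counter(
        (math.floor(r / bin_width), math.floor(g / bin_width))
        for r, g in zip(rmsd_vals, rg_vals)
    )
    top = max(counts.values())
    return {b: kT * math.log(top / c) for b, c in counts.items()}
```

Under these definitions, a trajectory that samples one dominant basin yields a landscape with a single low-energy region, which is the kind of pattern the text describes as a single stable energy state.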
a,b, The free energy landscape against RMSD and Rg for mini-Sdd9-BE4max (a) and mini-Sdd9-QBEmax (b) during a 300 ns MD simulation. c,d, RMSF plot for mini-Sdd9-BE4max (c) and mini-Sdd9-QBEmax (d) in the MD simulation, systems equilibrated after 150 ns; schematic representations of editors are shown above the plots. e, SASA analysis of Cs within the editing window of site 1 in predicted mini-Sdd9-BE4max and mini-Sdd9-QBEmax ternary structures; each replicate represents the SASA by a 1.0 nm probe during a 1 ns time scale; n = 150 ns following system equilibration, boxes and lines represent the interquartile range (IQR) and median, respectively, and whiskers represent 1.5× IQR. f, Snapshots showing exposed Cs in the editing window. Blue, Cas9 and UGI; yellow, linker; green, mini-Sdd9 deaminase.
During cytosine base editing, excision of the intermediate uracil by endogenous uracil DNA glycosylase (UNG) drives the formation of indels and imprecise C-to-G or C-to-A edits1,26. We envisioned that an ideal base editor adopts a compact and protective conformation around the exposed R-loop so that the intermediate uracil base is not excised before cellular mismatch repair resolves it into a permanent C-to-T conversion. To evaluate R-loop exposure, we performed solvent accessibility analyses using a 1.0 nm probe and found that the solvent-accessible surface area (SASA) was increased for C3 and substantially increased for C7 and C8 in the mini-Sdd9-BE4max compared to the mini-Sdd9-QBEmax architecture (Fig. 3e,f), suggesting that these bases are more accessible to UNG in BE4max, which would ultimately lead to indels and byproducts. We also speculated that the distance of the UGI to the ssDNA target bases may affect product purity. We evaluated the position and distance of the two UGI domains relative to the ssDNA target bases in the QBEmax, BE4max or individual CP-1031 and inlaid-1244 CBEs based on molecular dynamics modeling data. We observed that both UGI domains in QBEmax were indeed positioned relatively closer to the target bases in the ssDNA R-loop region, which suggests that a shorter UGI-to-target distance is associated with higher product purity and lower indel formation, as others have also previously identified5 (Supplementary Fig. 6e).
Based on these results, we propose a model for QBEmax base editing. In cytosine base editing using a canonical BE4max architecture, indel formation and impure C-to-G or C-to-A base edits arise from uracil excision and abasic site formation. Because QBEmax exhibits a more compact architecture, limits deaminase swinging and shields the Cas9-induced R-loop, base editing intermediates are protected from cellular UNG excision until Cas9 detaches from the target DNA and mismatch repair follows. Therefore, a protective and compact base editor conformation reduces unintended effects driven by DNA repair processes and further promotes desired base editing events (Extended Data Fig. 9).
Taken together, we designed and identified a base editor architecture, QBEmax, which achieves high-efficiency on-target editing while decreasing indel formation, exhibits high product purities and minimizes DNA off-targets. QBEmax is a promising base editor architecture for achieving more efficient and precise base edits in multiplex therapeutic applications such as CAR-T immunotherapies. Efficient delivery of QBEmax in vivo will further expand its use and enable further analysis of its base editing properties. Advances in protein structure prediction and MD simulation help shed light on the molecular basis of genome editors and should greatly aid the future development of new editing technologies.