MATERIALS AND METHODS

Plasmids

All vectors were created by classical restriction enzyme cloning. Oligonucleotides as well as synthetic, double-stranded DNA fragments (gBlocks) were obtained from Integrated DNA Technologies. A list of all constructs used or created in this study is shown in table S3, and annotated vector sequences (GenBank files) are provided in data file S1. Vectors expressing Cas9, Cas9 fused to GFP (Cas9-GFP), wild-type AcrIIA4, different AcrIIA4-LOV2 hybrids, or a U6 promoter–driven sgRNA bearing the improved F+E scaffold (38) have been previously reported by us (29, 37) (see Addgene no. 113033-113039). The mCherry-AcrIIA4 vector was created by fusing an mCherry coding sequence to the N terminus of wild-type AcrIIA4 using overlap extension polymerase chain reaction (PCR). A construct expressing Cas9 fused to wild-type AcrIIA4 via a 40-residue glycine-serine (GS) linker was created by cloning a synthetic DNA fragment encoding the GS linker–AcrIIA4 fragment into vector CMV-SpyCas9 (Addgene no. 103033) via Eco RI/Hind III. The AcrIIA4 fragment in the resulting construct was subsequently replaced by PCR fragments encoding AcrIIA4-LOV2 fusions or AcrIIA4 point mutants via Bam HI/Hind III. To generate the AcrIIA4-LOV2 PCR fragments, our previously reported AcrIIA4-LOV2 vectors were used as template (29). The AcrIIA4 point mutants were created by first amplifying a vector encoding wild-type AcrIIA4 (Addgene no. 113037) with 5′-phosphorylated primers introducing the point mutation(s). The resulting vectors were then used as template to generate PCR fragments encoding AcrIIA4 mutants. sgRNA expression vectors were created by inserting target complementary sequences into vector pAAV–RSV–GFP–U6–sgRNA scaffold (Addgene no. 113039) by oligo cloning via Bbs I.

BPK4410, a human expression plasmid for SpCas9 Cluster 1 (HypaCas9), was a gift from J. Doudna and K. Joung (Addgene plasmid no. 101178; http://n2t.net/addgene:101178; RRID:Addgene_101178).
xCas9 3.7 was a gift from D. Liu (Addgene plasmid no. 108379; http://n2t.net/addgene:108379; RRID:Addgene_108379). p3s-Sniper-Cas9 was a gift from J. Lee (Addgene plasmid no. 113912; http://n2t.net/addgene:113912; RRID:Addgene_113912). In all cloning procedures, PCRs were performed using Q5 Hot Start High-Fidelity DNA Polymerase [New England Biolabs (NEB)], followed by agarose gel electrophoresis to analyze PCR products. Bands of the expected size were cut out from the gel, and the DNA was purified using the QIAquick Gel Extraction Kit (Qiagen). Restriction digests and ligations were performed with the corresponding enzymes from NEB according to the manufacturer’s protocols. Following ligation, plasmids were transformed into chemically competent Top10 cells, and plasmids were extracted and purified using the QIAamp DNA Mini or Plasmid Plus Midi Kit (all from Qiagen).

Cell culture

Before use, all cell lines were authenticated and tested negative for mycoplasma contamination via a commercial service (Multiplexion, Heidelberg). Cells were maintained at 5% CO2 and at 37°C in a humidified incubator and passaged every 2 to 4 days, i.e., when reaching 70 to 90% confluency. HEK 293T and HeLa cells were cultivated in 1× Dulbecco’s modified Eagle’s medium supplemented with 2 mM l-glutamine, penicillin (100 U/ml), streptomycin (100 μg/ml) (all Thermo Fisher Scientific), and 10% (v/v) fetal calf serum (Biochrom AG). The Huh-7 medium was additionally supplemented with 1 mM nonessential amino acids (Thermo Fisher Scientific).

AAV lysate production

Low-passage HEK 293T cells were used for the production of AAV-containing cell lysates. Cells were seeded into six-well plates (CytoOne) at a density of 350,000 cells per well.
The following day, cells were triple-transfected with (i) an AAV helper plasmid carrying AAV2 rep and cap genes, (ii) an adenoviral plasmid providing helper functions for AAV production, and (iii) the AAV vector plasmid using 1.33 μg of each construct and 8 μl of TurboFect Transfection Reagent (Thermo Fisher Scientific) per well. The AAV vector plasmid encoded either (i) a U6 promoter–driven sgRNA targeting the AAVS1 locus as well as an RSV promoter–driven GFP (used as transduction reporter), (ii) Cas9, or (iii) the AcrIIA4 variant (LOV2-AcrIIA4 Insertion 3 in table S2; Addgene no. 113036). Three days after transfection, cells were collected in 300 μl of phosphate-buffered saline (PBS) and subjected to five freeze-thaw cycles by alternating between snap-freezing in liquid nitrogen and thawing at 37°C in a water bath. Cell debris was removed by centrifugation at 18,000g for 10 min, and the supernatant containing AAV particles was stored at 4°C until use.

T7 endonuclease assay, TIDE sequencing, and targeted amplicon sequencing

Table S4 shows the genomic ON-target/OFF-target sites relevant to this study. For transfection-based T7 assays, HEK 293T cells were seeded at a density of 12,500 cells per well and a culture volume of 100 μl per well into 96-well plates (Eppendorf). The following day, cells were transfected with jetPRIME (Polyplus-transfection) using 0.3 μl of jetPRIME reagent per well, as detailed below. For experiments shown in Fig. 1E and figs. S2 and S3, cells were cotransfected with (i) 66 ng of Cas9 expression construct, (ii) 66 ng of sgRNA constructs, and (iii) different doses of Acr construct as indicated in the figures. To keep the total amount of transfected DNA constant between all samples, DNA was topped up to 200 ng per well using an irrelevant vector. For experiments shown in Fig. 2 and figs.
S4, S5, S6, and S7C, cells were cotransfected with (i) 66 ng of Cas9 or Cas-Acr vector, (ii) 66 ng of sgRNA expression construct, and (iii) 66 ng of an irrelevant stuffer DNA.

For transduction-based T7 assays, HEK 293T cells were seeded at a density of 3500 cells per well, and HeLa and Huh-7 cells were seeded at a density of 3000 cells per well into 96-well plates. The following day, cells were cotransduced with 33 μl of Cas9 AAV lysate, 33 μl of sgRNA AAV lysate, and the indicated volume of AcrIIA4 AAV lysate. The transduction volume was always topped up with PBS to a total volume of 100 μl. The transduction was repeated 24 hours after the first transduction. Seventy-two hours after transfection or after (first) transduction, the medium was aspirated and cells were lysed using DirectPCR lysis reagent (VIAGEN Biotech) supplemented with proteinase K (Sigma-Aldrich).

For T7 assays and TIDE sequencing, the genomic target locus and relevant off-target loci were PCR-amplified with primers flanking the corresponding ON-target/OFF-target sites (table S5) using Q5 Hot Start High-Fidelity DNA Polymerase (NEB). For TIDE sequencing analysis, PCR amplicons were purified from 1% agarose gels using the QIAquick Gel Extraction Kit (Qiagen) followed by Sanger sequencing (Eurofins, Germany). Percentages of modified sequences were quantified using the TIDE web tool (https://tide.deskgen.com/). For T7 assays, 5 μl of PCR amplicon was diluted 1:4 in buffer 2 (NEB), heated to 95°C, and slowly cooled to room temperature to allow heteroduplex formation using a nexus GSX1 Mastercycler (Eppendorf) and the following temperature steps: 95°C/5 min, 95° to 85°C at −2°C per second, 85° to 25°C at −0.1°C per second. Then, 0.5 μl of T7 endonuclease (NEB) was added; samples were mixed and incubated for 15 min at 37°C. Next, gel loading dye (NEB) supplemented with 1% GelRed (Biotium) was added, and samples were loaded onto 2% tris-borate-EDTA agarose gels.
Voltage (100 V) was applied for 40 min to resolve DNA fragments. The Gel iX20 system equipped with a 2.8-megapixel/14-bit scientific-grade charge-coupled device camera (INTAS) was used for gel documentation. To calculate the InDel percentages from the gel images, T7 bands were quantified using the ImageJ (http://imagej.nih.gov/ij/) gel analysis tool. Peak areas were measured, and percentages of insertions and deletions [InDel (%)] were calculated using the formula

\[
\mathrm{InDel}\,(\%) = 100 \times \left(1 - (1 - \text{fraction cleaved})^{1/2}\right)
\]

where fraction cleaved = ∑(cleavage product bands)/∑(cleavage product bands + PCR input band). Full-length T7 assay gel images are shown in fig. S12.

For targeted amplicon sequencing, a first-step PCR was performed to amplify the genomic ON-target/OFF-target loci with primers carrying 5′ Illumina Nextera sequencing adaptors (forward: 5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-locus-specific sequence-3′; reverse: 5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-locus-specific sequence-3′) (table S5). The second-step PCR for introducing barcodes, sequencing on an Illumina MiSeq machine, and downstream bioinformatics for quality control and calling of CRISPR-induced InDels were performed via the CRISPR-Cas9 commercial sequencing service (Microsynth) using their in-house pipelines.

Fluorescence microscopy and image analysis

Cells were seeded into eight-well Glass Bottom μ-Slides (ibidi) at a density of 9000 cells per well for HeLa and 10,000 cells per well for HEK 293T and a volume of 300 μl of medium per well. The following day, cells were cotransfected with (i) 25 ng of Cas9-GFP, 25 ng of sgRNA AAVS1 construct, and 25 ng of stuffer DNA (pBluescript) or (ii) 25 ng of Cas9-GFP, 25 ng of mCherry-AcrIIA4, and 25 ng of sgRNA AAVS1 construct using 0.2 μl of jetPRIME per well.
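The InDel quantification given for the T7 assay above can be illustrated with a short calculation. The following Python sketch is not the authors' analysis script; the function name and the example peak areas are hypothetical, and it only restates the published formula under the assumption that band intensities are background-corrected peak areas in arbitrary units.

```python
def indel_percent(cleaved_bands, uncut_band):
    """Estimate InDel (%) from T7 endonuclease assay band intensities.

    cleaved_bands: peak areas of the cleavage product bands
    uncut_band:    peak area of the uncleaved PCR input band
    """
    cleaved = sum(cleaved_bands)
    total = cleaved + uncut_band
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    fraction_cleaved = cleaved / total
    # InDel (%) = 100 x (1 - (1 - fraction cleaved)^(1/2))
    return 100.0 * (1.0 - (1.0 - fraction_cleaved) ** 0.5)


# Hypothetical peak areas (arbitrary units), for illustration only
example = indel_percent(cleaved_bands=[150.0, 150.0], uncut_band=700.0)
print(f"InDel: {example:.1f}%")
```

With these illustrative values, the fraction cleaved is 0.3. The square root corrects for the fact that heteroduplexes form by random reannealing of edited and unedited strands.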
Imaging was performed at 12, 18, 24, 48, and 72 hours after transfection using a Leica SP8 confocal laser scanning microscope equipped with automated CO2 and temperature control; an ultraviolet, argon, and solid-state laser; as well as an HCX PL APO 40× oil objective (numerical aperture = 0.7). Identical imaging settings were applied to all samples, as detailed below. GFP fluorescence was recorded using the 488-nm laser line for excitation, and the detection wavelength was set to 493 to 578 nm. mCherry fluorescence was recorded using the 552-nm laser line for excitation, and the detection wavelength was set to 578 to 789 nm. Laser power was 0.25%, and gain was set to 800 V. For each field of view, a 40-μm Z-stack (40 slices) was recorded, and five fields of view were recorded per sample and time point. A single-plane bright-field image was recorded in parallel. A previously reported HeLa reference cell line expressing known GFP and mCherry molecule numbers per cell (39) was subjected to the identical imaging conditions.

For image analysis, cells were manually segmented in the bright-field channel using the freehand selection tool in ImageJ, and the area of each cell was measured. The segments were then applied to measure mean fluorescence in z-projections of the GFP and mCherry stacks. The number of fluorescent molecules per cell was then calculated using the following formula

\[
\mathrm{FM}(\text{sample}) = \frac{A(\text{sample}) \cdot I(\text{sample})}{A(\text{ref}) \cdot I(\text{ref})} \cdot \mathrm{FM}(\text{ref})
\]

where FM(sample) and FM(ref) represent the number of fluorescent molecules, A(sample) and A(ref) represent the cell area, and I(sample) and I(ref) represent the fluorescence intensity, after background subtraction, of a particular cell in the sample or reference (ref) cell line, respectively.

Mathematical modeling and parameter estimation

To quantitatively describe gene editing dynamics by Cas9 or Cas-Acr variants, an ODE model was developed.
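As a worked illustration of the molecule-number calibration defined in the image analysis paragraph above, the following Python sketch evaluates FM(sample) from area and intensity measurements. The function name and all numerical values are hypothetical and serve only to show how the ratio to the reference cell line is formed. Intensities are assumed to be background-subtracted already.

```python
def molecules_per_cell(area_sample, intensity_sample,
                       area_ref, intensity_ref, molecules_ref):
    """Scale the known molecule number of the reference cell line by the
    ratio of integrated fluorescence (area x mean intensity)."""
    if area_ref <= 0 or intensity_ref <= 0:
        raise ValueError("reference area and intensity must be positive")
    integrated_sample = area_sample * intensity_sample
    integrated_ref = area_ref * intensity_ref
    # FM(sample) = A(sample)*I(sample) / (A(ref)*I(ref)) * FM(ref)
    return integrated_sample / integrated_ref * molecules_ref


# Hypothetical measurements: areas in um^2, intensities in arbitrary units
fm = molecules_per_cell(area_sample=400.0, intensity_sample=50.0,
                        area_ref=500.0, intensity_ref=20.0,
                        molecules_ref=1.0e6)
print(f"Estimated fluorescent molecules per cell: {fm:.3g}")
```

Because sample and reference cells are imaged under identical settings, the unknown proportionality between intensity and molecule number cancels in the ratio.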
The model describes the transient expression of sgRNAs, Cas9 or Cas-Acr mRNAs, and Cas9 or Cas-Acr proteins; binding of sgRNAs to Cas9 or Cas-Acr variants; activation and inhibition of Cas-Acr variants; and gene editing by active complexes of Cas9 or Cas-Acr variants and sgRNAs. A model without Cas-Acr species was used for initial simulations (table S1). This model consisted of nine equations containing a total of 11 parameters. Three types of models were defined for simultaneous model fitting to experimental data: (i) a model describing turnover of plasmids, mRNAs, and proteins, consisting of 7 equations; (ii) a Cas9 model consisting of 39 equations; and (iii) Cas-Acr models containing 47 equations (table S6 and data file S2). A total of 32 parameters were estimated by model fitting to 75 data points.

In the following, the model assumptions and the steps taken to iteratively refine the model are described. The experimental dataset comprised measurements related to protein turnover and gene editing. However, several reactions in between were experimentally inaccessible. For this reason, we tried to limit the problem of parameter unidentifiability by parsimoniously defining model parameters. Because the plasmids for expressing sgRNAs, Cas9, or Cas-Acr variants were of the same order of magnitude in size, the same degradation rate was assumed for all plasmids. Furthermore, the model assumes the same degradation rate kdeg,C for Cas9 and all Cas-Acr species (Cas-Acr, Cas-Acrinh, Cas-Acr:sgRNA, and Cas-Acrinh:sgRNA), independent of their activation state and sgRNA binding. Similarly, degradation of different sgRNAs was described by one parameter, kdeg,gRNA. If Cas-Acr is degraded while in complex with sgRNA, the sgRNA is assumed to remain within the cell.
Thereby, the model retains flexibility regarding a potential sgRNA-rescuing effect of Cas9, consistent with the observation that otherwise very short-lived sgRNA is protected from degradation after binding to Cas9 (40).

We assumed that binding of sgRNA to Cas9 or Cas-Acr variants was fast compared to other processes such as translation or gene editing. A quasi-steady state was enforced by fixing the binding parameter kgRNA,on to a large value and effectively only estimating the dissociation constant Kd,gRNA = kgRNA,off/kgRNA,on. Initially, we fitted the model with equal Kd,gRNA values for the sgRNAs targeting four different genes. However, estimating Kd,gRNA individually for each sgRNA resulted in a substantially improved model fit, as indicated by a difference in the Akaike information criterion of ΔAIC = 53. This implies that affinities to Cas9 or Cas-Acr variants varied between sgRNAs.

Similarly, we first assumed the same parameter for the maximal editing efficiency for experiments in all targeted genes. In the model, this parameter served as initial value Dtot for the fraction of unedited genes. If all target sites in transfected cells can be edited, this parameter equals the percentage of transfected cells expressing sgRNA and Cas9 or Cas-Acr variants. Estimating this parameter individually for the four edited genes, Dtot,i with i = 1...4 for AAVS1, EMX1, RUNX1, and HEK, improved the model fit considerably (ΔAIC = 42).

For Cas-Acr variants, a common activation parameter but individual inactivation parameters were defined. Inhibitor strengths for Cas-Acr variants with mutated AcrIIA4 (Ins. 5, N39A, and D14A/G38A) were estimated relative to Cas-Acr wt with unmodified AcrIIA4. To this end, inhibition parameters for Cas-Acr variants were defined as the product of the common parameter kinh,CascAID and parameters γj with j = 1...4 for Cas-Acr wt, Ins.
5, N39A, and D14A/G38A, and γ1 ≡ 1 for Cas-Acr wt (table S6).

It was observed previously that mismatches between gRNAs and target sites affect the unbinding rather than the binding kinetics of Cas9-gRNA complexes (40). For this reason, individual parameters koff,target,i were estimated for the unbinding of active Cas9:sgRNA or Cas-Acr:sgRNA complexes from target sites, whereas a common binding parameter kon,target was defined for all genes. To explain differences between ON-target and OFF-target editing, factors ϕOFF target,i relating the unbinding rates of Cas9:sgRNA or Cas-Acr:sgRNA complexes at ON- and OFF-targets were estimated separately for the four genes.

Besides the model variables Di, the concentrations of plasmids for expression of gRNA (PgRNA), coexpression of Cas9 and Acr (PCas9 + Acr), or expression of Cas9 or Cas-Acr variants alone (PC) were the only model species with nonzero initial values. Because plasmid concentrations in cells were not experimentally accessible, they were treated in the model as dimensionless quantities, and initial concentrations of PgRNA and PCas9 + Acr were fixed to 1. As described in table S6, the initial concentration of the plasmid for Cas9 expression in the turnover model, PC0, was furthermore linked to the initial plasmid concentration in the Cas9 and Cas-Acr models because the same plasmid amounts were used in the experiments described by these models.

Residuals between model observables and experimental measurements were weighted by the SEM if replicates for data points were available. Protein expression measurements were determined from averages of n = 50 cells. Quadruplicates were measured for AAVS1 editing efficiencies in case of Cas9 and all Cas-Acr variants at 72 hours. Triplicates were measured for EMX1, RUNX1, and HEK editing efficiencies. Single measurements were available for AAVS1 editing efficiencies at 24 and 48 hours. In these cases, residuals were weighted using an error model.
To this end, the linear error model ε = m1y + m2 was fitted to all SEM values and editing efficiencies y measured with replicates, resulting in parameter estimates m1 = 0.054 and m2 = 0.681.

Model simulations were performed with custom scripts in MATLAB (The MathWorks, Natick, MA, USA). For parameter estimations, the MATLAB toolbox PottersWheel (www.potterswheel.de) was used (41). A total of 500 multistart local optimizations were conducted followed by profile likelihood estimation to determine parameter confidence intervals. Parameter estimates, parameter bounds, and parameter confidence intervals are listed in table S7. For simulating the model parts documented in table S6 using the parameter estimates listed in table S7, MATLAB files are available as supplementary data (data file S2).
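The linear error model used for data points without replicates can be illustrated with a short sketch. The Python code below is not part of the authors' MATLAB/PottersWheel workflow. It only evaluates ε = m1y + m2 with the reported estimates and shows one common way to form a weighted residual. The function names, the example values, the assumption that y is an editing efficiency in percent, and the convention of dividing the residual by the error estimate are all assumptions made for illustration.

```python
# Reported estimates of the linear error model (epsilon = m1 * y + m2)
M1 = 0.054
M2 = 0.681


def estimated_error(y):
    """Error estimate for a single measurement y (assumed to be in percent)."""
    return M1 * y + M2


def weighted_residual(model_value, measured, sem=None):
    """Residual scaled by the SEM if replicates exist, otherwise by the
    error-model estimate (assumed weighting convention)."""
    sigma = sem if sem is not None else estimated_error(measured)
    return (model_value - measured) / sigma


# Hypothetical single measurement: 20% editing, model predicts 22%
print(f"error estimate: {estimated_error(20.0):.3f}")
print(f"weighted residual: {weighted_residual(22.0, 20.0):.3f}")
```

The intercept m2 keeps the weights finite for editing efficiencies near zero, while the slope m1 lets the assumed uncertainty grow with the measured value.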
The limited target specificity of CRISPR-Cas nucleases poses a challenge with respect to their application in research and therapy. Here, we present a simple and original strategy to enhance the specificity of CRISPR-Cas9 genome editing by coupling Cas9 to artificial inhibitory domains. Applying a combination of mathematical modeling and experiments, we first determined how CRISPR-Cas9 activity profiles relate to Cas9 specificity. We then used artificially weakened anti-CRISPR (Acr) proteins either coexpressed with or directly fused to Cas9 to fine-tune its activity toward selected levels, thereby achieving an effective kinetic insulation of ON- and OFF-target editing events. We demonstrate highly specific genome editing in mammalian cells using diverse single-guide RNAs prone to potent OFF-targeting. Last, we show that our strategy is compatible with different modes of delivery, including transient transfection and adeno-associated viral vectors. Together, we provide a highly versatile approach to reduce CRISPR-Cas OFF-target effects via kinetic insulation.