We used HEK293T cells (Clontech, Mountain View, CA) carrying a genomically integrated dscGFP gene driven by the TRE3G promoter (consisting of seven repeats of the Tet response element) (Kempton et al., 2020). This cell line was clonally sorted and expanded and showed no background GFP fluorescence. Cells were cultured in DMEM + GlutaMAX (Thermo Fisher, Waltham, MA) containing 100 U/mL of penicillin and streptomycin (Thermo Fisher) and 10% Fetal Bovine Serum (Clontech). Cells were grown at 37°C with 5% CO2 and passaged using 0.05% Trypsin-EDTA solution (Thermo Fisher) or TrypLE Express Enzyme (Thermo Fisher).
Cells were transfected with constructs carrying 1) the nuclease-deactivated (D832A) dCas12a (from Lachnospiraceae bacterium, human codon-optimized) (Zetsche et al., 2015) fused either to the VP64-p65-Rta (VPR) activator (Chavez et al., 2015) and mCherry, or to mini-VPR (Vora et al., 2018) and mCherry; 2) a CRISPR array-expressing plasmid. For Figures 1 and 3, a CRISPR array construct consisting of firefly luciferase immediately followed by a CRISPR array and an SV40 pA terminator, expressed under the CAG promoter element, was used (Supplementary File 1). For the activation of seven endogenous genes (Fig. 4), firefly luciferase was replaced with BFP and a Malat1 Triplex sequence (Campa et al., 2019) followed by the L. bacterium Cas12a leader sequence (Supplementary File 1).
Cells were seeded one day before transfection at a density of 5×10⁴ cells per well in a 24-well plate. Cells were transfected using TransIT-LT1 transfection reagent (Mirus Bio, Madison, WI) according to the manufacturer’s recommendation (250 ng dCas12a-VPR-mCherry plasmid; 250 ng CRISPR array plasmid; 1.5 μl transfection reagent per well).
Two days after transfection, cells were dissociated using 0.05% Trypsin-EDTA or TrypLE (Thermo Fisher), passed through a 40 μm filter-capped test tube (Corning, Corning, NY), and analyzed using a CytoFLEX S flow cytometer (Beckman Coulter, Brea, CA), a BD Influx FACS machine (BD Biosciences, Franklin Lakes, NJ), or a BD FACSMelody (BD Biosciences). For each experiment, 10,000 events were recorded. During flow cytometry analysis, we gated for cells expressing the Cas12a construct (mCherry+) but not the CRISPR array construct. This is because array processing by Cas12a severs the upstream reporter gene from the poly-A tail, which can disturb reporter gene expression and thereby the analysis. For Figs. 1 and 3, three replicates were performed for each sample.
RT-qPCR to quantify endogenous gene activation (Fig. 4)
Cells were transfected as described above. For cell harvesting, all cells in each well were dissociated and included in the analysis; they were thus not sorted based on uptake of Cas12a or CRISPR array plasmids. Two biological replicates were performed. Total RNA was extracted with the RNeasy Plus Mini Kit (Qiagen, Germany), according to the manufacturer’s instructions. Reverse transcription was performed using the iScript cDNA Synthesis kit (Bio-Rad, Hercules, CA). Quantitative PCR reactions were run on a LightCycler thermal cycler (Bio-Rad) with iTaq Universal SYBR Green Supermix (Bio-Rad). ΔΔCt values for the target genes were divided by those of RPL13A to obtain relative expression. crRNA spacers and RT-qPCR primers are listed in Table S3.
Assembly of CRISPR arrays
CRISPR arrays were assembled using an oligonucleotide duplexing and ligation method that we developed. First, arrays were designed computationally using SnapGene software (v. 5.1-5.2; Insightful Science, San Diego, CA). The arrays were designed to include two flanking sequences containing a 20-bp overlap with the opened backbone plasmid, as required for a subsequent In-Fusion reaction. This double-stranded CRISPR array sequence was then computationally divided into ≤60-nt DNA sequences with unique 4-nt 5’ overhangs, which were ordered from Integrated DNA Technologies (IDT, Coralville, IA) in LabReady formulation (100 μM in IDTE buffer, pH 8.0) with standard desalting purification. For each ligation vial, an oligonucleotide mix was first made containing 1 μl of each oligonucleotide. Up to 16 single-stranded oligonucleotides (i.e. corresponding to 8 oligo duplexes) were ligated per reaction vial. For longer CRISPR arrays, the reaction was divided into multiple vials, each vial containing ≤8 oligonucleotide duplexes (e.g. if the array consists of 12 oligonucleotide duplexes, the reaction was performed in two vials with 6 duplexes in each).
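The division of duplexes across ligation vials can be illustrated with a minimal Python sketch. It assumes that duplexes are spread as evenly as possible over the smallest number of vials that respects the limit of 8 duplexes per vial; the text states only the limit and the 12-duplex example, so the even-split rule and the function name are illustrative assumptions.

```python
import math

# From the text: at most 8 oligo duplexes (16 single-stranded oligos) per vial.
MAX_DUPLEXES_PER_VIAL = 8


def split_duplexes_into_vials(n_duplexes, max_per_vial=MAX_DUPLEXES_PER_VIAL):
    """Return the number of duplexes assigned to each ligation vial.

    Assumption (not stated in the text): duplexes are distributed as evenly
    as possible over the minimum number of vials.
    """
    if n_duplexes < 1:
        raise ValueError("need at least one duplex")
    n_vials = math.ceil(n_duplexes / max_per_vial)
    base, extra = divmod(n_duplexes, n_vials)
    # The first `extra` vials receive one additional duplex.
    return [base + 1 if i < extra else base for i in range(n_vials)]


if __name__ == "__main__":
    # The example given in the text: 12 duplexes in two vials of 6.
    print(split_duplexes_into_vials(12))
```

Under this rule, an array of 8 or fewer duplexes stays in a single vial, which corresponds to the single-vial branch of the protocol.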
Phosphorylation and duplexing
Then run a phosphorylation-duplexing reaction on a thermocycler
Then, add 1 reaction volume (5 μl) of 1x T7 buffer (2.5 μl 2x T7 buffer + 2.5 μl water). Add 1 μl T7 DNA ligase (New England Biolabs, MA, USA) (Important: Use T7 ligase rather than T4 ligase, as T7 ligase lacks the ability to ligate blunt ends). Incubate at 25°C for 3 hours. Then, dilute the sample 1/5 by adding 40 μl water. Run the sample on a 2% agarose gel. A ladder pattern should be visible. Excise the band corresponding to the ligated product. Depending on whether the entire CRISPR array was assembled in a single vial, or divided into several vials, do either of the following:
If the entire array was assembled in a single vial
Gel-purify the excised band using the Macherey-Nagel NucleoSpin Gel & PCR Clean-up kit (Macherey-Nagel, Germany). Insert the purified array into the opened plasmid backbone using In-Fusion cloning (Takara Bio, Japan).
If the array was divided into >1 vial
Pool all excised bands belonging to the same array into a single vial. Gel-purify the pooled bands using the Macherey-Nagel NucleoSpin Gel & PCR Clean-up kit. Elute in 15 μl water. Then, add 1 volume (15 μl) of 2x T7 buffer and 1 μl T7 DNA ligase. Incubate at 25°C for 3 hours. Then, run the ligated product on a 2% agarose gel. A faint band corresponding to the full-length CRISPR array should be visible. Excise and gel-purify this band. Insert it into the backbone vector using In-Fusion.
Design of short CRISPR arrays (2 crRNAs) for testing effect of GC content of upstream spacer (Fig. 1)
The 51 nonsense spacer sequences (Fig. 1) were adapted from a negative-control sgRNA library generated by Gilbert et al. (Gilbert et al., 2014). These sequences correspond to scrambled Cas9 spacer sequences, and we adjusted them slightly for length (20 nt) and GC content. All nonsense spacers are listed in Table S1.
Computation of GC content in sliding window (Figs. 1H-J, 2B)
For each of the nonsense spacer sequences, we computed the GC content in a sliding 5-nt window (first nucleotides 1-5, then nucleotides 2-6, etc.). For each such window, we then calculated the average and standard error of all 51 spacers. As the sliding window approached the 3’ end of the spacers, we reduced the size of the sliding window to 4, then 3, then 2 nucleotides, in order to increase resolution at the very 3’ end. This was performed also for naturally occurring spacers (Fig. 2B) and CRISPR separators (Fig. 2B). The spacers we analyzed varied in length from 25-36 nt. For this analysis, we truncated the 5’ ends of spacers longer than 25 nt. This way, we could align and analyze the 25 nucleotides at the most 3’ end of every spacer, even though it meant that we would lose information at the 5’ end of longer spacers. For the separator sequences, we first aligned them using the T-Coffee alignment tool (see below), which did not truncate any of the separator sequences.
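The sliding-window calculation can be sketched in Python as follows. This is a minimal illustration, not the original analysis code. It assumes that the window keeps advancing one nucleotide at a time and simply shortens (5, then 4, 3 and 2 nt) once it reaches the 3’ end, and that the standard error is the sample standard deviation divided by the square root of the number of spacers. All function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def gc_fraction(seq):
    """Fraction of G or C bases in a sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def truncate_to_3prime(seq, length=25):
    """Keep only the 3'-most `length` nucleotides (trims the 5' end)."""
    return seq[-length:]


def sliding_gc(seq, window=5, min_window=2):
    """GC fraction per window start; windows shrink near the 3' end."""
    profile = []
    for start in range(len(seq) - min_window + 1):
        # Slicing past the end shortens the window to 4, 3, then 2 nt.
        profile.append(gc_fraction(seq[start:start + window]))
    return profile


def mean_and_sem(profiles):
    """Per-window (mean, SEM) across profiles of equal length."""
    summary = []
    for values in zip(*profiles):
        sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
        summary.append((mean(values), sem))
    return summary


if __name__ == "__main__":
    # Toy sequences for illustration only; not taken from Table S1 or S2.
    toy_spacers = ["GGGGGAAAAATTTTTCCCCC", "ATATATATATGCGCGCGCGC"]
    profiles = [sliding_gc(s) for s in toy_spacers]
    for position, (m, sem) in enumerate(mean_and_sem(profiles), start=1):
        print(position, round(m, 3), round(sem, 3))
```

For a 20-nt spacer this yields 19 values: 16 full 5-nt windows followed by one window each of 4, 3 and 2 nt.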
Calculation of the predictive power of spacer GC content
To calculate the predictive power of knowing the GC content of 3 bases in the nonsense spacer (Fig. 1K), we divided each 20-nt spacer into 18 overlapping 3-nt windows and calculated the GC content for each window. For each such window (e.g. nucleotides 1-3), we plotted GC content versus percent GFP+ cells for all 51 arrays. We then performed a linear regression (GraphPad Prism v. 9.0; GraphPad Software, San Diego, CA) and used the resulting R² value for Fig. 1K.
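The per-window regression can be approximated with the short Python sketch below. The original analysis was performed in GraphPad Prism; this code only illustrates an equivalent calculation. It relies on the fact that, for a simple least-squares line, R² equals the squared Pearson correlation. The function names and the toy data are hypothetical.

```python
def window_gc(seq, start, size=3):
    """GC fraction of the `size`-nt window beginning at index `start`."""
    chunk = seq[start:start + size].upper()
    return sum(base in "GC" for base in chunk) / size


def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return float("nan")  # no variation, so R^2 is undefined
    return sxy ** 2 / (sxx * syy)


def r_squared_per_window(spacers, gfp_percent, size=3):
    """One R^2 value per window position (18 windows for 20-nt spacers)."""
    n_windows = len(spacers[0]) - size + 1
    return [
        r_squared([window_gc(s, i, size) for s in spacers], gfp_percent)
        for i in range(n_windows)
    ]


if __name__ == "__main__":
    # Toy data for illustration only; not measurements from the study.
    toy_spacers = ["GGG" + "A" * 17, "A" * 20, "GGA" + "A" * 17]
    toy_gfp = [10.0, 40.0, 20.0]
    print(r_squared_per_window(toy_spacers, toy_gfp))
```

Windows in which every spacer has the same GC content return NaN, because a regression slope cannot be estimated without variation in the predictor.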
Multiple sequence alignment of naturally occurring CRISPR sequences (Figs. 2, S2)
To find bacterial CRISPR-Cas12a operons, we used CRISPR-Cas++ (Couvin et al., 2018) with two search strategies: 1) Using the CRISPRCasdb-Blast tool with default settings (accessed September 2020), we input the Cas12a repeat sequences from Zetsche et al. (Zetsche et al., 2015) and extracted all spacers and repeats, making sure that the spacer sequences were all oriented in the 5’-to-3’ direction. 2) Using the CRISPRCasdb function, we searched for Cas12a loci in all organisms using default settings. Alignment of separator sequences and post-processed repeats was performed using the multiple sequence alignment tools of SnapGene (v. 5.1-5.2). The separator sequences were aligned using the T-Coffee algorithm. The sequences used can be found in Table S2.
The type V-A Cas12a protein can process its CRISPR array, a feature useful for multiplexed gene editing and regulation. However, CRISPR arrays often exhibit unpredictable performance due to interference between multiple crRNAs. Here, we report that Cas12a array performance is hypersensitive to the GC content of crRNA spacers, as high-GC spacers can impair activity of the downstream crRNA. We analyzed naturally occurring CRISPR arrays and observed that repeats always contain an AT-rich fragment that separates crRNAs; we term this fragment a CRISPR separator. Inspired by this observation, we designed short, AT-rich synthetic separators (synSeparators) that successfully removed the disruptive effects between crRNAs. We demonstrate enhanced simultaneous activation of seven endogenous genes in human cells using an array containing the synSeparator. These results elucidate a previously unknown feature of natural CRISPR arrays and demonstrate how nature-inspired engineering solutions can improve multi-gene control in mammalian cells.