Materials and Methods

CRISPR interference and priming varies with individual spacer sequences

Bacterial strains

Strains, plasmids and oligonucleotides used in this study are listed in Supplementary Tables S1–S4. E. coli K12 BW25113 was used as the background for all constructed strains (43,44). Individual gene deletion strains for hns, cas3, cse1 and cas1 were obtained from the Keio collection (44). The cas3, cse1 or cas1 deletions were moved from BW25113 Δcas3::kan, BW25113 Δcse1::kan or BW25113 Δcas1::kan into BW25113 Δhns or E. coli X019 using P1 phage transduction. Kanamycin resistance cassettes were removed from all strains using pCP20 (45), with the exception of the cas3::kan cassette in strain X030 (Supplementary Table S1). The X019 strain, in which spacers 1.2–1.12 and 2.2–2.6 were deleted, was created using lambda Red recombinase, as previously described (43). DNA fragments containing a kanamycin resistance cassette flanked by FRT sites with 50-bp homology to CRISPR 1- or CRISPR 2-adjacent sequences were created using primers XCY302–303 and XCY304–305, respectively (Supplementary Table S3).

Plasmid construction

All protospacers used in this study were ligated into pACYCDuet-1 or pCDF-1b between BglII and XhoI or NcoI and NotI sites as indicated (Supplementary Tables S2 and S3). For recombinant Cascade expression with different spacer sequences, CRISPRs bearing spacers 1.1, 1.6, 1.9 or 2.1 were generated using pWUR547 (21) as template with primers indicated in Supplementary Tables S2 and S3 through 'round-the-horn' (RTH) cloning. For recombinant Cascade expression and purification, the N-terminal Streptactin-tag in Cse2 of the pWUR480 (14) plasmid was changed to a His6-tag using HL005 and HL006 through RTH cloning to produce pDGS010. For recombinant Cas3 expression and purification, the cas3 gene was amplified from the E. coli K12 genome using XCY001 and XCY002 and cloned into pSV272, encoding an N-terminal His6-MBP (maltose-binding protein) tag.
For the Cas1 and Cas2 expression plasmid pX288, a PCR product of Cas1-Cas2 under control of the native tac promoter was cloned into pACYCDuet-1 using XCY255 and XCY256.

For the pACYC-GFP-tac plasmid, a GFP gene controlled by a tac promoter (46) was PCR amplified from psfGFP using XCY413 and XCY417 and cloned into pACYCDuet-1 (Supplementary Tables S2 and S3). Our preliminary results indicated that use of the native tac promoter for GFP expression slowed E. coli growth, changing the ratio of GFP+ to GFP- cells through a non-CRISPR related mechanism. To solve this problem, the native tac promoter was changed to a weaker tac-derived promoter (47). The resulting plasmid (pACYC-GFP-pro3) did not increase the doubling time of E. coli compared to E. coli bearing empty pACYCDuet-1. Competition assays between E. coli X019 bearing either empty pACYCDuet-1 or pACYC-GFP-pro3 further indicated that GFP expression does not affect the ratio of GFP- and GFP+ cells, as ratios between the two strains remained constant after 24 h growth without selection. To decrease the half-life of GFP, the protease-sensitive SsrA peptide tag AANDENYALAA (48) was added to the C-terminus of GFP using XCY449 and XCY452 with pACYC-GFP-pro3 as template through RTH cloning. This final plasmid is referred to as pACYC-GFP throughout the text.

The CRISPR-like plasmid pRep-Sp8-Rep was created by cloning a PCR amplicon containing the first two repeats and one spacer (spacer #8) from E. coli PIM5 (28) into the PciI site of plasmid pGFP-Kan (36) (see Supplementary Tables S2 and S3 for plasmids and primers). This plasmid then served as a template to create derivative plasmids pSp8-Rep (containing the distal repeat) and pRep-Sp8 (containing the proximal repeat). Plasmids were PCR amplified using Phusion DNA polymerase, 5′ phosphorylated using T4 polynucleotide kinase, end-to-end ligated, and transformed into E. coli XL1 Blue.
pRep-Sp8 then served as a template for a series of 16 derivative plasmids (Supplementary Table S2), using primers listed in Supplementary Table S3. Plasmid sequences were confirmed by Sanger sequencing at GATC Biotech (Konstanz, Germany).

pACYC-Cas3-C85Venus-Cse1-N155Venus (referred to as pACYC-BiFC in the paper) was created by Gibson Assembly using primers XCY479–XCY486 (Supplementary Table S3) (49). Cas3 contains a C-terminal fusion of the C-terminal Venus fragment and Cse1 contains a C-terminal fusion of the N-terminal fragment of Venus. The expression vector was assembled from four separate PCR products amplified using either pACYC-Cas3-Cse1 or the mVenus-pBAD vector (a gift from Michael Davidson; Addgene plasmid #54845) as template.

Plasmid-loss and spacer acquisition experiments

Plasmids were introduced into E. coli BW25113-derived strains via heat shock, and single colonies were used to inoculate initial cultures. All strains were grown for 24 h (sub-cultured at 12 h) in 2 ml LB in 15 ml tubes at 37°C with shaking at 200 rpm. For passaging, 20 μl of culture was sub-cultured into 2 ml LB. When indicated, further periods of incubation were performed under the same conditions. E. coli cultures were diluted 250 000-fold and 10 μl of the final dilution was plated on LB plates (1.5% agar) without antibiotic. After 6 h, 35–50 colonies on these plates were replicated onto LB plates supplemented with chloramphenicol to check for plasmid loss. For each sample, 16 colonies on the no-antibiotic plates were picked randomly to analyze spacer acquisition by colony PCR using Taq DNA polymerase. Newly acquired spacers in CRISPR 1 or CRISPR 2 were detected by PCR using primers XCY076–077 or XCY152–153, respectively (Supplementary Table S3). PCR products were visualized on 2% agarose gels stained with SYBR Safe (Thermo Fisher Scientific). All experiments were performed for three individual cultures.
Plasmid loss and spacer acquisition rates reflect the average of these three biological replicates, and errors are the standard deviation between replicates.

Plasmid loss experiments to assess autopriming were performed in E. coli strain PIM5 (28), which is a derivative of BW25113 Δhns. Plasmids were transformed into PIM5 by electroporation, and strains were grown for 48 h in non-selective liquid media. Plasmid loss was assessed on non-selective plates by scoring colony fluorescence resulting from the presence of the GFP plasmid. Non-fluorescent colonies were analyzed by colony PCR for the integration of new spacers in CRISPR 1 and 2, and sequenced to confirm the strand bias that is typical for priming in E. coli strains with Type I-E CRISPR–Cas systems (11).

Protein expression and purification

Cascade lacking Cse1 (Cse2–Cas6e) was expressed in BL21(DE3) cells using pDGS010, pWUR404 and the appropriate CRISPR expression plasmid (spacer 1.1: pX238, spacer 1.6: pX230, spacer 1.9: pX503, spacer 2.1: pX569, Supplementary Table S2) in 1 l LB media supplemented with ampicillin, chloramphenicol and streptomycin. Cultures were grown to 0.5 OD600 at 37°C, and induced overnight at 16°C with 0.5 mM IPTG. His6-tagged Cse2–Cas6e was purified using HisPur Ni-NTA affinity resin (Thermo Fisher Scientific). The eluent was concentrated to ∼1 ml, then purified by size exclusion chromatography using a Superdex 200 column (GE Life Sciences) in a buffer containing 20 mM Tris (pH 7.5), 100 mM NaCl, 5% glycerol and 1 mM TCEP. Cse1 was expressed in BL21(DE3) using the EcCse1-pSV272 expression vector, and purified as previously described (30). Cas3 was expressed in BL21(DE3) using the Cas3-pSV272 expression vector (Supplementary Table S2) and purified as described previously with the following modifications (25). Throughout the Cas3 purification process, 1 mM TCEP was added to all buffers. To maintain the activity of Cas3, the purification process was completed in one day.
Briefly, after lysis and affinity purification using HisPur Ni-NTA resin, His6-MBP-Cas3 was purified on a Superdex 200 column. The purified His6-MBP-Cas3 protein was cleaved by tobacco etch virus protease for 3 h at 4°C. The cleaved sample was flowed through a Ni-NTA column, concentrated to 1 ml, and finally purified on a Superdex 200 column.

DNA binding and cleavage assays

All binding assays were performed in binding buffer: 20 mM Tris (pH 7.5), 100 mM NaCl, and 5% glycerol. All cleavage assays were performed in reaction buffer: 10 mM HEPES (pH 7.5), 100 mM KCl, 5% (v/v) glycerol, 2 mM ATP, 100 μM CoCl2, and 10 mM MgCl2. Concentrations indicated for Cascade in Figures 1E and 6A–C and Supplementary Figures S2A and S7B–D are for the Cse2–Cas6e complex, as Cse1 was held at a constant concentration to ensure complete formation of the Cascade complex (30). Cse2–Cas6e at indicated concentrations and 1000 nM Cse1 were pre-incubated for 20 min at 37°C to form the Cascade complex. Samples were cooled on ice for 1 min prior to initiating binding or cleavage reactions. For Cascade–DNA binding, Cascade was incubated with 2 nM target plasmid, and samples were incubated at 37°C for 30 min prior to electrophoresis on a 0.8% agarose gel stained with SYBR Safe, run at 15 V at 4°C for 18 h. For Cas3 cleavage, Cascade was incubated with 2 nM target plasmid at 37°C for 15 min. Cas3 was added at the indicated concentration to initiate plasmid digestion. Reactions were incubated at 37°C for 30 min and terminated by the addition of 20 mM EDTA. Proteins were removed by phenol extraction. Reactions were analyzed by electrophoresis on a 1% agarose gel stained with SYBR Safe.

Figure 6. Priming PAM blocks Cascade-mediated Cas3 cleavage but not Cas3-Cascade association. (A and B) Electrophoretic mobility shift assay for Cascade binding to (A) spacer 1.1 and (B) 2.1 targets with AAG or AGA PAMs.
Cse2–Cas6e concentration is varied, and concentrations are labeled for each sample. Cse1 concentration was held constant at 1 μM to ensure complete formation of the Cascade complex. (C) Cascade-mediated Cas3 cleavage of spacer 2.1 targets with AAG or AGA PAMs. Plasmid DNA is labeled as follows: OC – open circle; L – linear; nSC – negatively supercoiled; D – degraded. (D–F) Confocal micrographs for BiFC experiments detecting interactions between Cse1 and Cas3. (D) E. coli BW25113 Δcse1Δcas3 grown with pACYC-BiFC and empty pCDF-1b plasmid. (E) E. coli BW25113 Δcse1Δcas3 grown with pACYC-BiFC and pCDF containing spacer 2.1 target with an AAG PAM. (F) E. coli BW25113 Δcse1Δcas3 grown with pACYC-BiFC and pCDF containing spacer 2.1 target with an AGA PAM.

Generation of PAM and seed libraries

To avoid sequence bias, initial libraries were constructed in DH5α by RTH cloning using pACYC-GFP as template (see Supplementary Table S3 for primers). Primer locations were designed to avoid complementarity between the overhanging degenerate sequence and the template. However, this design did not yield an unbiased library for the original spacer 1.1 seed library, so an alternative method was used to create the spacer 1.1 seed library and the spacer 2.1 6MM seed library. A 24-bp protospacer (position 9 to position 32) of spacer 1.1 (XCY573 and XCY574) or spacer 2.1 (XCY577 and XCY578) (Supplementary Table S3) was ligated into pACYC-GFP to create pX735 and pX737, and these plasmids were used as templates for RTH cloning of the libraries. All primers were phosphorylated using polynucleotide kinase prior to PCR. Primers were used to PCR amplify the pACYC-GFP, pX735 or pX737 backbone and PCR products were purified. PCR products were ligated and transformed into E. coli DH5α. For each library, over 30 000 transformants were isolated. All colonies were resuspended in LB, the bacteria were pelleted, and plasmids were extracted using a Promega Wizard Plus SV Miniprep DNA Purification kit.
This procedure yielded the five original libraries: PAM of spacer 1.1, seed of spacer 1.1, PAM of spacer 2.1, and two seed libraries of spacer 2.1. All libraries were prepared in triplicate from three separate ligations and DH5α transformations. High-throughput plasmid loss, priming assays and sequencing were performed for all three biological replicates.

High-throughput plasmid loss and priming assays

All original libraries were transformed into X019, X019 Δcse1 and X019 Δcas1 and plated onto LB plates with chloramphenicol, yielding around 30 000 colonies. All colonies were resuspended using 1 ml LB. After adjusting the concentration of the resuspended bacteria to OD600 of ∼5.0, 20 μl of the culture was used to inoculate 2 ml LB without antibiotic for five growth cycles, with sub-culturing every 6 or 12 h. Next, 40 μl of each culture was used to inoculate 4 ml LB supplemented with chloramphenicol. These cultures were grown at 37°C with shaking at 200 rpm for 12 h and plasmids were extracted, yielding an additional 12 plasmid libraries. These 12 plasmid libraries and the four original plasmid libraries were transformed into X019 and cultured for two cycles, sub-cultured at 6 or 12 h. The cultures were diluted 100-fold and analyzed on a BD FACSAria III flow cytometer. For each culture, 100 000 GFP+ and GFP- cells were sorted. The average percentage of GFP- cells for three biological replicates is reported in Figure 2D, and errors reflect the standard deviation between the three replicates. Spacer acquisition was analyzed for the genomic DNA of the sorted GFP- cells by PCR amplification using XCY076 and XCY077 for CRISPR 1 and XCY152 and XCY153 for CRISPR 2. PCR products were analyzed on a 2% agarose gel stained with SYBR Safe, and the intensity of PCR bands was measured using ImageQuant TL (GE Life Sciences). Spacer acquisition rates were measured as the intensity of extended CRISPR PCR products relative to the intensity of total PCR product.
The relative intensities of CRISPR 1 and CRISPR 2 were averaged to determine the relative spacer acquisition for each sample. Spacer acquisition rates reported in Figure 2E are the average from three separate biological replicates and error bars reflect the standard deviation between the three replicates.

Figure 2. High-throughput screen for CRISPR activity of PAM and seed mutants. (A–C) Experimental design for high-throughput screen. (A) PAM and seed library construction. PAM libraries contained completely degenerate sequences at the -3, -2 and -1 positions of the target, resulting in 64 possible sequences. Seed libraries contained two potential sequences at each of positions 1–5 and 7–8 of the target, resulting in 128 possible sequences. (B) The original libraries were transformed into E. coli X019, X019 Δcas1 and X019 Δcse1 and libraries were prepared for each strain after an extended growth period in non-selective media. These libraries were used for barcoded PCR as experimental samples for high-throughput sequencing analysis. (C) All libraries were transformed into E. coli X019 and grown for 2 cycles of 6–12 h. Cells were sorted by FACS to measure rates of plasmid loss and the genomic DNA of GFP- cells was used for PCR of CRISPRs to determine rates of spacer acquisition. (D) Plasmid loss rates for libraries created using this high-throughput experimental design, as measured by percent of GFP- cells for ∼100 000 cells tested. (E) Spacer acquisition rates for the libraries.

MiSeq Illumina sequencing

The PAM and seed libraries extracted from DH5α, X019, X019 Δcse1, and X019 Δcas1 were amplified by PCR with Phusion DNA polymerase, using a pair of primers containing unique 6-nt barcodes to differentiate between libraries and replicates (Supplementary Table S4). The 100–120 bp PCR fragments were analyzed by 2% agarose gel electrophoresis and absorbance at 260 nm.
Based on the gel analysis and absorbance reading, equal quantities were mixed and pooled. The mixed samples were run on a 2% agarose gel, and the band was excised and purified using a Promega Wizard Gel and PCR Clean-up kit. Samples were analyzed on an Agilent 2100 Bioanalyzer and a Qubit Fluorometer (Thermo Fisher Scientific) to determine DNA size and concentration. Samples were prepared for Illumina sequencing using the TruSeq Nano DNA Sample Preparation kit (v3) for 1 × 150 bp (single-end) reads. To increase the diversity of sequences, samples were spiked with ∼30% of a PhiX Control v3 adapter-ligated library. Samples were sequenced on an Illumina MiSeq at the Iowa State University DNA Facility.

Plasmid libraries were sequenced in three separate MiSeq runs (Supplementary Table S5). MiSeq run 1 contained three replicates each for the PAM and seed libraries for spacer 2.1. MiSeq run 2 contained three replicates for the PAM library for spacer 1.1 and three replicates for an incomplete seed library for spacer 1.1, which had 42 sequences with fewer than 100 reads in all libraries. Analysis of this library is included in Supplemental Data File 1, although it is omitted from the main text. MiSeq run 3 contained three replicates each for the redesigned spacer 1.1 seed library and the spacer 2.1 6MM library.

Analysis of MiSeq data

Sequences from MiSeq output files were demultiplexed and sorted into separate files for each library and replicate based on the presence of specific pairs of barcodes at both ends of the read using a bash script (Supplementary Table S4). Reads corresponding to the target (forward reads) and non-target (reverse reads) strand of the protospacer were sorted separately. To determine read counts for all possible sequences in each library, the resulting files were searched for the 64 or 128 possible PAM/protospacer sequences for each PAM or seed library, respectively.
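As a minimal sketch of the sequence search described above, the Python below expands a degenerate pattern into its variants (64 for a fully degenerate three-position PAM, 128 for seven two-base seed positions) and counts the reads that contain each variant between fixed flanking sequences. The source does not specify how the search was implemented, so this is only an illustration: the flanking sequences, the seed pattern YRSYRSY and the example reads are hypothetical placeholders, not sequences from the study.

```python
from itertools import product

# IUPAC codes needed here; Y, R and S follow the degenerate labels used for the seed libraries.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "CT", "R": "AG", "S": "GC", "N": "ACGT"}


def expand_degenerate(pattern):
    """Return every concrete sequence encoded by a degenerate pattern."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in pattern))]


def count_search_sequences(reads, left_flank, pattern, right_flank):
    """Count reads containing left_flank + variant + right_flank for each variant."""
    search = {left_flank + v + right_flank: v for v in expand_degenerate(pattern)}
    counts = {v: 0 for v in search.values()}
    for read in reads:
        for sequence, variant in search.items():
            if sequence in read:
                counts[variant] += 1
    return counts


pam_variants = expand_degenerate("NNN")       # three fully degenerate positions
seed_variants = expand_degenerate("YRSYRSY")  # hypothetical seven two-base positions

# Hypothetical reads and flanks, used only to exercise the counting function.
example_reads = ["TTGACAAGCCGTA", "TTGACAGACCGTA", "TTGACAAGCCGTA"]
pam_counts = count_search_sequences(example_reads, "GAC", "NNN", "CCG")
```

Including flanking sequence in each search string keeps a three-nucleotide PAM from matching at unrelated positions in a read; how much flanking context is needed depends on the amplicon and is left as an assumption here.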
An output file was generated for each replicate of each library containing the counts for each PAM/protospacer search sequence in the forward and reverse direction (compiled in Supplemental Data File 1). Forward read counts for highly depleted sequences in the X019 Δcas1 and X019 libraries were systematically higher than reverse read counts for the same sequences. This phenomenon does not appear to be a result of the demultiplexing strategy, as demultiplexing using alternative methods (the fastx-multx command from the ea-utils package (50) or sabre) produced very similar results. Overall trends in sequence depletion are the same between forward and reverse reads, although the absolute counts differ. Therefore, forward and reverse read counts were summed and treated as total read counts for each sequence. Read counts between samples were normalized by calculating a scaling factor based on the sample with the highest number of sequences. For seed libraries, sequences with anomalously high read counts (>2-fold greater than the DH5α reference library following normalization) in the X019 or X019 Δcas1 libraries were omitted when calculating the scaling factor. Normalized read counts from three biological replicates for each library were averaged and standard deviations were determined. To determine the relative number of counts for each sequence in the experimental libraries, average read counts for X019 Δcse1, X019 Δcas1, and X019 libraries were divided by the average read count for the DH5α reference library. Standard deviations were propagated and are reported as errors for relative counts in Figures 3B–C and 4A–B and Supplementary Figures S4 and S6.

Figure 3. One or two mismatches in the seed can be tolerated for CRISPR activity. (A) Seed libraries tested in this study. The crRNA seed sequence and corresponding region of the non-target strand of the protospacer are shown.
Degenerate DNA labels: Y – cytosine or thymine; R – adenine or guanine; S – guanine or cytosine. (B and C) Counts of sequences with one or two mismatches in the seed sequence for X019 Δcse1, X019 Δcas1, and X019 relative to the reference DH5α library. (B) Spacer 1.1 seed libraries. (C) Spacer 2.1 seed libraries. Mismatch position(s) are labeled for each set of data.

Figure 4. Direct interference and priming can be promoted by a large set of PAM sequences. (A and B) Scatter plots for relative counts of the 64 PAM/protospacer sequences for libraries extracted from E. coli X019 versus E. coli X019 Δcas1 for (A) spacer 1.1 targets and (B) spacer 2.1 targets. Counts are relative to the E. coli DH5α reference library. (C and D) PAM sequences colored by groups as defined in (A and B) for (C) spacer 1.1 targets and (D) spacer 2.1 targets. Red: Group A, blue: Group B, purple: Group C, black: Group D. (E) Plasmid loss and spacer acquisition rates for spacer 1.1 and 2.1 targets with AAA, AAC, ATA or AGA PAMs after 24 h growth.

Fluorescence microscopy

BiFC experiments were performed in E. coli X030 (Supplementary Table S1) carrying pACYC-BiFC and empty pCDF-1b, pCDF-1b bearing the spacer 2.1 AAG target, or pCDF-1b bearing the spacer 2.1 AGA target. Single colonies were grown at 37°C in LB containing chloramphenicol (34 μg/ml) until OD600 reached 0.05. Cultures were shifted to 18°C for 6 h to ensure that loss of the pCDF-1b plasmid bearing the spacer 2.1 AAG target would occur slowly, allowing fluorescence to be observed.
Cells were adjusted to OD600 0.5 and re-suspended in phosphate buffer (pH 7.2). Then 5 μl of the cells was applied to poly-L-lysine-coated microscope slides and analyzed using a Leica SP5 X MP confocal/multiphoton microscope system with an inverted microscope front end, a 40× oil immersion objective, an argon laser as the excitation source (514 nm) and detection at 530–600 nm.

Analysis of naïve PAMs and simulation of adaptation

A data set generated by Yosef et al. was used to analyze the frequency of PAMs of targets for spacers acquired through naïve adaptation (29). Spacer sequences and genomic or plasmid locations were reported in the original paper. PAM sequences were extracted from the genomic (NCBI Reference Sequence NC_012947.1) or plasmid sequences (reported in (10)) using the BEDtools getfasta tool (51).

In scenario 1, pX288 (pACYC-Cas1–2) and empty pCDF-1b were co-transformed into the X019 Δcse1 E. coli strain. In scenario 2, pX288 and a priming plasmid (pCDF-1b bearing a spacer 2.1 target with an AGA PAM) were co-transformed into the X019 strain. In scenario 3, pX288 and empty pCDF-1b were co-transformed into the X019 strain. Single colonies were grown to saturation (OD600 of 3.5) for two cycles in LB supplemented with chloramphenicol to maintain pX288. The 5′-end of CRISPR 1 was PCR amplified from genomic DNA isolated from each culture using XCY076 and XCY077 (Supplementary Table S3) and visualized on a 2% agarose gel stained with SYBR Safe to test for spacer acquisition. The relative amount of each band corresponding to a different number of acquired spacers was determined by densitometry using ImageQuant TL software.
Cultures were performed in triplicate and the average amount of product is plotted in Figure 7B, with error bars reflecting the standard deviation between replicates.

Figure 7. Naïve adaptation triggers a rapid priming response. (A) Analysis of PAM sequences for spacers acquired in the Yosef et al. naïve adaptation study (29). In that study, spacers were acquired from the E. coli genomic DNA or a plasmid borne by the host. Percentages of reads for spacers derived from sequences with AAG PAMs or other functional or nonfunctional PAMs identified in our study are plotted. The total distribution of each type of PAM in each source DNA is also plotted. (B) Quantified PCR product resulting from newly acquired spacers from three adaptation scenarios following two cycles of growth. Scenario 3 products with significant differences (P < 0.005 based on unpaired two-tailed t-test, n = 3 cultures) compared to scenario 1 are marked with an asterisk. (C and D) Model for adaptation during initial encounter of invader DNA. Naïve adaptation, requiring only the adaptation machinery (orange), allows for integration of spacers against the previously unencountered virus. Spacers may be against targets with PAMs that promote (C) interference or (D) priming by the interference machinery (blue). Cascade bearing newly acquired spacers can bind targets with (C) interference or (D) priming PAMs and recruit Cas3. PAM licensing at this step elicits a (C) target degradation or (D) priming response, although rare occurrences of the alternative mechanism are also possible for each type of target.
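The read-count normalization and relative-count calculation described under Analysis of MiSeq data can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' script: the scaling factor is assumed to be the ratio of the largest sample total to each sample's total, and errors are propagated with the standard first-order formula for a quotient of independent quantities. The function names and example numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def scale_to_largest(samples):
    """Scale each sample's counts so its total matches the largest sample total.

    samples maps a sample name to a dict of sequence -> read count.
    Assumption: the scaling factor is largest_total / sample_total.
    """
    totals = {name: sum(counts.values()) for name, counts in samples.items()}
    largest = max(totals.values())
    return {
        name: {seq: n * largest / totals[name] for seq, n in counts.items()}
        for name, counts in samples.items()
    }


def mean_sd(replicates):
    """Mean and sample standard deviation across biological replicates."""
    return mean(replicates), stdev(replicates)


def relative_count(experimental, reference):
    """Ratio of two (mean, sd) pairs with first-order propagated error."""
    (m_e, s_e), (m_r, s_r) = experimental, reference
    ratio = m_e / m_r
    # Written without dividing by m_e so fully depleted sequences (m_e == 0) are handled.
    sd = sqrt((s_e / m_r) ** 2 + (m_e * s_r / m_r ** 2) ** 2)
    return ratio, sd


# Hypothetical counts for two samples and one sequence across three replicates.
scaled = scale_to_largest({"a": {"x": 10, "y": 10}, "b": {"x": 30, "y": 10}})
ratio, ratio_sd = relative_count(mean_sd([50, 60, 70]), mean_sd([90, 100, 110]))
```

A relative count near 1 indicates that a sequence is as abundant in the experimental library as in the reference library, while values near 0 indicate depletion. The omission of anomalously high seed-library counts before computing the scaling factor is not modelled in this sketch.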



CRISPR–Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated) systems allow bacteria to adapt to infection by acquiring ‘spacer’ sequences from invader DNA into genomic CRISPR loci. Cas proteins use RNAs derived from these loci to target cognate sequences for destruction through CRISPR interference. Mutations in the protospacer adjacent motif (PAM) and seed regions block interference but promote rapid ‘primed’ adaptation. Here, we use multiple spacer sequences to reexamine the PAM and seed sequence requirements for interference and priming in the Escherichia coli Type I-E CRISPR–Cas system. Surprisingly, CRISPR interference is far more tolerant of mutations in the seed and the PAM than previously reported, and this mutational tolerance, as well as priming activity, is highly dependent on spacer sequence. We identify a large number of functional PAMs that can promote interference, priming or both activities, depending on the associated spacer sequence. Functional PAMs are preferentially acquired during unprimed ‘naïve’ adaptation, leading to a rapid priming response following infection. Our results provide numerous insights into the importance of both spacer and target sequences for interference and priming, and reveal that priming is a major pathway for adaptation during initial infection.
