Plasmid protection assay
All plasmid protection assays were done in Escherichia coli (strain: NEB Turbo). As described previously12, SpyCas9 was expressed via the arabinose-inducible promoter pBAD on a CloDF13-based plasmid marked with a spectinomycin resistance cassette. The SpyCas9 construct, called pSpyCas9crA, was designed to eliminate a target vector with a kanamycin resistance cassette. This target vector also expressed a gene of interest (e.g., an acr) via the doxycycline-inducible pLtetO-1 promoter (Supplemental Table 4). We induced expression from the target vector via de-repression of the TetR transcription factor with doxycycline (we generically named this vector pZE21_tetR; Supplemental Table 4). IPTG was used in samples with the target vector to ensure high levels of TetR expression (which was driven by the lac promoter) and thus inducible control of our gene of interest. Unless noted in Supplemental Table 5, all genes, including each alanine mutant depicted in Figure 6A, were synthesized by SynBio Technologies and cloned directly into pZE21_tetR for functional testing.
Cultures of each sample were grown overnight at 37°C with shaking at 220 rpm in lysogeny broth (LB; 10 g/L casein peptone, 10 g/L NaCl, 5 g/L ultra-filtered yeast powder) containing spectinomycin 50 µg/ml, kanamycin 50 µg/ml, and 0.5mM IPTG. These growth conditions kept both SpyCas9 and the gene of interest in uninduced states. The next morning, overnight cultures were diluted 1:50 into LB broth containing spectinomycin (at 50 µg/ml), kanamycin (at 50 µg/ml), 0.5mM IPTG, and doxycycline 100 ng/ml to induce the gene of interest. Cultures were grown at 37°C on a roller drum to mid-log phase (for approximately 1.5 hours to OD600 of 0.3-0.6). Once cells reached mid-log phase, they were diluted to OD600 value of 0.01 into two media types: (a) LB containing spectinomycin 50 µg/ml, 0.5mM IPTG, and doxycycline 100 ng/ml, and (b) LB containing spectinomycin 50 µg/ml, 0.5mM IPTG, doxycycline 100 ng/ml, and 0.2% (L) arabinose. These media induced either the gene of interest alone, or both the gene of interest and SpyCas9, respectively. Each sample was grown in triplicate in a 96 well plate in a BioTek Cytation 3 plate reader. After 6 hours of growth at 37°C with shaking at 220 rpm, each sample was diluted ten-fold and plated on two types of media: (a) LB spectinomycin 50 µg/ml + 0.5mM IPTG or (b) LB spectinomycin 50 µg/ml, kanamycin 50 µg/ml, 0.5mM IPTG. Plates were incubated at 37°C overnight. Then, colonies were counted to determine the fraction of colony forming units (cfus) that maintained kanamycin resistance (and thus the target vector). All figures depicting these data show the log-transformed proportion of KanR/total cfu, with or without SpyCas9 induction. The growth curves in Supplemental Figure 1 match the experiment depicted in Figure 1C for the uninduced SpyCas9 samples. For the uninduced orf_1 control samples, doxycycline was omitted from media throughout the experiment. 
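As an illustration of the plotted quantity, the log-transformed proportion of KanR/total cfu can be sketched as below. This is a minimal sketch, not the analysis code used in this study: the function name is hypothetical, a base-10 logarithm is assumed (the base is not stated above), and the cfu values are assumed to have already been corrected for any difference in plating dilution between the two plate types.

```python
import math


def log10_kan_fraction(kan_cfu: float, total_cfu: float) -> float:
    """Return log10(KanR cfu / total cfu).

    kan_cfu   -- cfu from LB + spectinomycin + kanamycin plates
    total_cfu -- cfu from LB + spectinomycin plates
    Both values are assumed to be on the same scale (e.g. cfu per ml).
    """
    if total_cfu <= 0:
        raise ValueError("total cfu must be positive")
    if kan_cfu <= 0:
        # The logarithm is undefined when no KanR colonies are recovered;
        # how such samples are handled is not specified in the text.
        raise ValueError("KanR cfu must be positive to take a logarithm")
    return math.log10(kan_cfu / total_cfu)
```

A sample in which 1% of colony forming units retain kanamycin resistance would give a value of about -2 on this scale.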
Growth rates referenced in the text and in Supplemental Figure 1 were calculated using the slope of the OD600 growth curves during log phase, following a natural log transformation.
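The growth-rate calculation can be sketched as follows. This is a hedged example rather than the authors' script: it assumes the caller has already selected the time points that fall within log phase, it uses an ordinary least-squares fit for the slope (the fitting method is not specified above), and the function names are hypothetical.

```python
import math
from typing import Sequence


def growth_rate_per_hour(times_h: Sequence[float], od600: Sequence[float]) -> float:
    """Least-squares slope of ln(OD600) versus time (units: per hour)."""
    if len(times_h) != len(od600) or len(times_h) < 2:
        raise ValueError("need at least two paired time/OD600 readings")
    log_od = [math.log(value) for value in od600]  # natural log transformation
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(log_od) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, log_od))
    return sxy / sxx


def doubling_time_h(rate_per_hour: float) -> float:
    """Convert an exponential growth rate into a doubling time in hours."""
    return math.log(2) / rate_per_hour
```

Because the fit is done on ln(OD600), the slope is the specific growth rate, and dividing ln(2) by it gives the doubling time.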
To test AcrIIA22 function against a panel of Cas9 and Cas12 orthologs in Figure 3C, we used a slightly modified, three-plasmid setup. As before, spyCas9, nmCas9, fnCas12 and lbCas12 were encoded in a CloDF13-based plasmid with a spectinomycin resistance cassette. Expression of the Cas effector was controlled by promoter J23100 and a theophylline riboswitch. The accompanying gRNAs were encoded in a separate set of plasmids called pDual4, under an arabinose expression system, on a p15A-based backbone with a chloramphenicol resistance cassette (Supplemental Table 4). The gRNAs in the different pDual4 constructs were programmed to target the kanamycin-marked target plasmid in the same manner as pSpyCas9crA. All assays were done in Escherichia coli (strain: NEB Turbo) following the same plasmid protection assay described previously. However, in this case, we induced expression of the different Cas effectors and gRNAs by adding 2 mM theophylline and 0.2% (L) arabinose, respectively, to the media.
Impact of AcrIIA22 on GFP expression
We swapped spyCas9 for egfp in our CloDF13-based plasmid and co-expressed AcrIIA22 to determine if AcrIIA22 impacted expression from this construct. If AcrIIA22 influenced CloDF13’s copy number or the transcription of spyCas9, we anticipated that it would also impact GFP levels in this construct (pCloDF13_GFP; Supplemental Table 4). To perform this experiment, we co-transformed pCloDF13_GFP and pZE21_tetR encoding acrIIA22 into E. coli Turbo. Single colonies were picked into 4 ml of LB containing spectinomycin at 50 µg/ml (‘spec50’), kanamycin at 50 µg/ml (‘kan50’), and 0.5 mM IPTG and grown overnight at 37°C with shaking at 220 rpm. The next morning, the overnight culture was diluted 1:50 into LB spec50 + kan50 + 0.5 mM IPTG, both with and without doxycycline (to induce acrIIA22), and grown at 37°C for about 1.5 hours to mid-log phase (OD600 0.2-0.6). The OD600 was measured, and all samples were diluted to OD600 of 0.01 in two media types: (a) LB spec50 + kan50 + 0.5 mM IPTG + 0.2% arabinose (inducing gfp only) or (b) LB spec50 + kan50 + 0.5 mM IPTG + 0.2% arabinose + 100 ng/ml doxycycline (inducing gfp and acrIIA22). A volume of 200 µl of each sample was then transferred to a 96-well plate in triplicate and GFP fluorescence was measured every 15 minutes for 24 hours (GFP was excited using 485 nm light and emission was detected at 528 nm). In parallel, we included control samples that lacked the kanamycin-marked plasmid and varied whether doxycycline was added or not (at 100 ng/ml). In these control samples, we noticed that doxycycline slightly diminished GFP expression (it is possible that sub-lethal levels of the antibiotic may still depress translation). Thus, we normalized GFP fluorescence measurements in our experiment with AcrIIA22 to account for this effect in all samples containing doxycycline. These normalized fluorescence measurements are shown in Supplemental Figure 2B.
Western blots to determine AcrIIA22’s impact on SpyCas9 expression
We grew overnight cultures of E. coli Turbo that carried pSpyCas9crNT and pZE21_tetR encoding a gene of interest (Supplemental Tables 4, 5) in LB spec50 + kan50 + 0.5 mM IPTG. The next morning, we diluted these cultures 1:100 in 4 ml of either (a) LB spec50 + kan50 + 0.5 mM IPTG or (b) LB spec50 + kan50 + 0.5 mM IPTG + 100 ng/ml doxycycline (to induce the gene of interest). We included samples that expressed either acrIIA22 or gfp as a gene of interest. In all SpyCas9 constructs, we used a crRNA that did not target our plasmid backbone (pSpyCas9crNT) to ensure that acrIIA22 expression remained high and its potential impact on SpyCas9 expression levels would be most evident. All samples were grown for two hours at 37°C to reach mid-log phase (OD600 0.3 to 0.5) and transferred into media that contained 0.2% arabinose to induce SpyCas9. At transfer, volumes were normalized by OD600 value to ensure that an equal number of cells were used (diluted to a final OD600 of 0.05 in the arabinose-containing medium). This second medium either contained or lacked 100 ng/ml doxycycline to control expression of acrIIA22 or gfp, as with the initial media. Throughout this experiment, we included a control strain that lacked pZE21_tetR and only expressed SpyCas9. Kanamycin and doxycycline were omitted from its growth media. For this control strain, we also toggled the addition of arabinose in the second growth medium to ensure that positive and negative controls for SpyCas9 expression were included in our experiment. After three hours and six hours of SpyCas9 induction, OD600 readings were again taken and these values were used to harvest an equal number of cells per sample (at three hours, OD600 values were between 0.76 and 0.93 and volumes of 0.75 ml to 0.9 ml were harvested; at six hours, 0.4 ml was uniformly harvested because all absorbance readings were approximately 1.6).
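The OD600-based normalization of harvest volumes amounts to holding the product of OD600 and volume constant across samples. A minimal sketch is given below; the function name and the target value are hypothetical, and the example target of roughly 0.68 OD·ml is inferred here from the volume and OD600 ranges reported above rather than stated in the protocol.

```python
def harvest_volume_ml(od600: float, target_od_ml: float) -> float:
    """Volume (ml) to harvest so that od600 * volume equals target_od_ml.

    target_od_ml is the desired amount of cells, expressed in OD600 x ml
    units; it is a free parameter chosen by the experimenter.
    """
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    if target_od_ml <= 0:
        raise ValueError("target must be positive")
    return target_od_ml / od600


# Hypothetical target chosen to resemble the three-hour time point.
for reading in (0.76, 0.85, 0.93):
    print(f"OD600 {reading:.2f}: harvest {harvest_volume_ml(reading, 0.684):.2f} ml")
```

Denser cultures yield proportionally smaller harvest volumes, so each lysate is prepared from a similar number of cells.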
All samples were centrifuged at 4100g to pellet cells, resuspended in 100 µl of denaturing lysis buffer (12.5 mM Tris-HCl, pH 6.8; 4% SDS), and passed through a 25 gauge needle several times to disrupt the lysate. Samples were then boiled at 100°C for 10 minutes, spun at 13,000 rpm at 4°C for 15 minutes and the supernatants removed and frozen at -20°C. The next day, 12 µl of lysate was mixed with 4 µl of 4x sample buffer (200 mM Tris-HCl, 8% SDS, 40% glycerol, 200 mM DTT, and 0.05% bromophenol blue) and boiled at 100°C for 10 minutes. Then, 10 µl of sample was loaded onto a BioRad Mini-Protean “any KD Stain Free TGX” gel (cat. #4569035) and run at 150V for 62 minutes. To verify that equivalent amounts of each sample were run, gels were visualized on a BioRad chemidoc for total protein content. Protein was then transferred to a 0.2 µm nitrocellulose membrane using the Bio-Rad Trans-Blot Turbo system (25 V, 1.3 A for 10 min). We then washed membranes in PBS/0.1% Triton-X before incubating them with a mixture of the following two primary antibodies, diluted in LI-COR Odyssey Blocking Solution (cat. #927–40000): (i) monoclonal anti-SpyCas9, Diagenode cat. #C15200229-50, diluted 1:5,000; (ii) polyclonal anti-GAPDH, GeneTex cat. #GTX100118, diluted 1:5,000. The GAPDH antibody served as a loading control and a second check to ensure equal protein levels were run. Membranes were left shaking overnight at 4°C, protected from light. Then, membranes were washed four times in PBS/0.1% Triton-X (ten-minute washes) before they were incubated for 30 minutes at room temperature with a mixture of secondary antibodies conjugated to infrared dyes. Both antibodies were diluted 1:15,000 in LI-COR Odyssey Blocking Solution. To detect SpyCas9, the following secondary antibody was used: IR800 donkey, anti-mouse IgG, LI-COR cat# 926-32212. To detect GAPDH, IR680 goat, anti-rabbit IgG, LI-COR cat# 926-68071 was used. Blots were imaged on a LI-COR Odyssey CLx after three additional washes.
Phage plaquing assay
We grew overnight cultures of E. coli Turbo carrying pSpyCas9crMu and pZE21_tetR encoding a gene of interest (Supplemental Tables 4, 5) at 37°C in LB spec50 + kan50 + 0.5 mM IPTG. Genes of interest were either acrIIA4, gfp, or acrIIA22. The pSpyCas9 construct targeted phage Mu and was previously demonstrated to confer strong anti-phage immunity in this system12. A control strain expressing pZE21-tetR-gfp and SpyCas9_crNT (which encoded a CRISPR RNA that does not target phage Mu) was grown similarly. The next morning, all cultures were diluted 50-fold into LB spec50 + kan50 + 0.5 mM IPTG + 5 mM MgCl2 and grown at 37°C for three hours. Then, doxycycline was added to a final concentration of 100 ng/ml to induce the gene of interest. Two hours later, SpyCas9 was induced by adding arabinose to a final concentration of 0.2% w/v. Two hours after that, cultures were used in soft-agar overlays on one of two media types, discordant for arabinose, to either maintain SpyCas9 expression or let it fade as arabinose was diluted in top agar and consumed by the host bacteria (per Supplemental Figure S2). Top and bottom agar media were made with LB spec50 + kan50 + 0.5 mM IPTG + 5 mM MgCl2. In cases where SpyCas9 expression was maintained, arabinose was also added at a final concentration of 0.02% to both agar types. Top agar was made using 0.5% Difco agar and bottom agar used a 1% agar concentration. For the plaquing assay, 100 µl of bacterial culture was mixed with 3 ml of top agar, allowed to solidify, and ten-fold serial dilutions of phage Mu were spotted on top using 2.5 µl droplets. After the droplets dried, plates were overturned and incubated at 37°C overnight before plaques were imaged the following day.
Identification of AcrIIA22 homologs and hypervariable genomic islands
We searched for AcrIIA22 homologs in three databases: NCBI nr, IMG/VR, and a set of assembled contigs from 9,428 diverse human microbiome samples18. Accession numbers for the NCBI homologs are indicated on the phylogenetic tree in Figure 3A. We retrieved AcrIIA22 homologs via five rounds of an iterative PSI-BLAST search against NCBI nr performed on October 2nd, 2017. In each round of searching, at least 90% of the query protein (the original AcrIIA22 hit) was covered, 88% of the subject protein was covered, and the minimum amino acid identity of an alignment was 23% (minimum 47% positive residues; e-value ≤ 0.001). Only one unique AcrIIA22 homolog was identified in IMG/VR (from several different phage genomes) via a blastp search against the July, 2018 IMG/VR proteins database (using default parameters). This homolog was also found in other databases and its amino acid sequence is identical to that of AcrIIA22b (Figure 3A).
Most unique AcrIIA22 homologs were identified in the assembly data of over 9,400 human microbiomes performed by Pasolli and colleagues18. These data are grouped into multiple datasets: (i) the raw assembly data, and (ii) a set of unique species genome bins (SGBs), which were generated by first assigning species-level phylogenetic labels to each assembly and then selecting one representative genome assembly per species. We identified AcrIIA22 homologs using several queries against both databases. First, we performed a tblastn search against the SGB database using the AcrIIA22 sequence as a query, retrieving 141 hits from 137 contigs. A manual inspection of the genome neighborhoods for these hits revealed that most homologs originated from a short, hypervariable genomic island; some homologs were encoded by prophages. No phage-finding software was used to identify prophages; they were apparent from a manual inspection of the gene annotations that neighbored acrIIA22 homologs (see the section entitled “Annotation and phylogenetic assignment of metagenomic assemblies” for details).
To find additional examples of AcrIIA22 homologs and of these genomic islands, we then queried the full raw assembly dataset. To do so without biasing for acrIIA22-encoding sequences, we used the purF gene that flanked acrIIA22-encoding genomic islands as our initial query sequence. Specifically, we used the purF gene from contig number 1 in Supplemental Table 3; its sequence is also in Supplemental Table 5. To consider only the recent evolutionary history of this locus, we required all hits have ≥98% nucleotide identity and required all hits to be larger than 15 kilobases in length to ensure sufficient syntenic information. From these contigs, we further filtered for those that had ≥98% nucleotide identity to radC, the gene which flanked the other end of acrIIA22-encoding genomic islands. Again, we used the variant from contig number 1 in Supplemental Table 3; its sequence is also in Supplemental Table 5. In total, this search yielded 258 contig sequences; nucleotide sequences and annotations for these contigs are provided in Supplemental Dataset 5. We then searched for acrIIA22 homologs in these sequences using tblastn, again observing them in genomic islands and prophage genomes (which were assembled as part of the 258 contigs). In total, this search revealed 320 acrIIA22 homologs from 258 contigs. The 258 genomic islands from these sequences were retrieved manually by extracting all nucleotides between the purF and radC genes. These extracted sequences were then clustered at 100% nucleotide identity with the sequence analysis suite Geneious Prime 2020 v1.1 to identify 128 unique genomic islands.
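The contig filtering and island de-duplication steps can be sketched as follows. This is a simplified illustration, not the pipeline used here: the `ContigHit` record and both function names are hypothetical, identities are assumed to be available as percentages from a prior nucleotide search, and clustering at 100% nucleotide identity is approximated by exact string matching (Geneious may treat length differences or reverse complements differently).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ContigHit:
    contig_id: str
    length_bp: int
    purf_identity: float            # % nucleotide identity to the purF query
    radc_identity: Optional[float]  # % identity to the radC query, None if absent


def passes_filters(hit: ContigHit,
                   min_identity: float = 98.0,
                   min_length_bp: int = 15_000) -> bool:
    """Keep contigs >15 kb with >=98% identity to both flanking genes."""
    return (
        hit.length_bp > min_length_bp
        and hit.purf_identity >= min_identity
        and hit.radc_identity is not None
        and hit.radc_identity >= min_identity
    )


def unique_islands(islands: Dict[str, str]) -> Dict[str, List[str]]:
    """Group contig IDs by identical island sequence (case-insensitive)."""
    groups: Dict[str, List[str]] = {}
    for contig_id, sequence in islands.items():
        groups.setdefault(sequence.upper(), []).append(contig_id)
    return groups
```

Each key of the dictionary returned by `unique_islands` corresponds to one unique island, and the associated list records which contigs share it.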
Supplemental Table 3.
All sequences used in this study. Sequence names and databases are indicated. All sequences and annotations are also available as supplemental data. Sequences retrieved from Pasolli et al. refer to the following study: (Pasolli et al., 2019).
Supplemental Table 4.
Plasmids used in this study. Supplemental Table S5 indicates genes expressed from pZE21_tetR.
Supplemental Table 5.
Gene sequences used in this study.
Altogether, our two searches yielded 461 AcrIIA22 sequences from these metagenomic databases that spanned 410 contig sequences. The 461 AcrIIA22 homologs broke down into two groups: 410 clustered with genomic island-like sequences whereas 51 clustered with prophage-like homologs. In nature, the relative prevalence of AcrIIA22 in genomic islands or prophages may not be accurately reflected by these numbers because we never directly searched for prophage-encoded homologs. We then combined these 461 AcrIIA22 sequences with those from NCBI and IMG/VR and clustered the group on 100% amino acid identity to reveal 30 unique proteins. To achieve this, we used the software cd-hit49 with the following parameters: -d 0 -g 1 -aS 1.0 -c 1.0. These 30 sequences were numbered to match one of their parent contigs (as indicated in Supplemental Table 3) and used to create the phylogenetic tree depicted in Figure 3A. For AcrIIA22 homologs found outside NCBI, the nucleotide sequences and annotations of their parent contigs can be found in Supplemental Datasets 1 and 2. For NCBI sequences, accession numbers are shown in Figure 3A. The gene sequences used in functional assays (Figure 3B) have been reprinted in Supplemental Table 5, for convenience.
Annotation and phylogenetic assignment of metagenomic assemblies
Contig sequences from IMG/VR, the Pasolli metagenomic assemblies, and some NCBI entries lacked annotations, making it difficult to make inferences about acrIIA22’s genomic neighborhood. To facilitate these insights, we annotated all contigs as follows. We used the gene-finder MetaGeneMark50 to predict open reading frames (ORFs) using default parameters. We then used their amino acid sequences in a profile HMM search with HMMER351 against TIGRFAM52 and Pfam53 profile HMM databases. The highest scoring profile was used to annotate each ORF. We annotated these contigs to facilitate genomic neighborhood analyses for acrIIA22; these are not intended to provide highly accurate functional predictions of their genes. Thus, we erred on the side of promiscuously assigning gene function; our annotations should therefore be treated with appropriate caution. A visual inspection of these annotated contigs made apparent several examples of acrIIA22-encoding prophages (we noticed 35-40 kilobase insertions in some contigs that were otherwise nearly identical to those without prophages). We were confident that these insertions were prophages because they contained mostly co-linear genes with key phage functions annotated. As a simple means to sample this phage diversity, we manually extracted nine examples of these prophage sequences (their raw sequences and annotated genomes can be found in Supplemental Datasets 3 and 4). Annotations were imported into the sequence analysis suite Geneious Prime 2020 v1.1 for manual inspection of genome neighborhoods.
We used the genome taxonomy database (GTDB) convention for all sequences discussed in this manuscript54. In part, this was because all acrIIA22 homologs are found in clostridial genomes, which are notoriously polyphyletic in NCBI taxonomies (for instance, species in the NCBI genus Clostridium appear in 121 GTDB genera and 29 GTDB families)55. All SGBs that we retrieved from the Pasolli assemblies were assigned taxonomy as part of that work and were called Clostridium sp. CAG-217. Similarly, NCBI assemblies that encoded the acrIIA22 homologs most closely related to our original hit were assigned to the GTDB genus CAG-21754,55. The raw assembly data from the Pasolli database was not assigned a taxonomic label but was nearly identical in nucleotide composition to the CAG-217 contigs (Figure 2, Supplemental Figure 4, Supplemental Datasets 1 and 2). Therefore, we also refer to these sequences as originating in CAG-217 genomes but take care to indicate which assemblies have been assigned a rigorous taxonomy and which have had their taxonomy inferred in this fashion (Supplemental Table 3).
Comparing genes in genomic islands to phage genomes
We first examined the annotated genes within each of the 128 unique genomic islands. Manual inspection revealed 54 unique gene arrangements that differed in gene content and orientation. We then selected one representative from each arrangement and extracted amino acid sequences from each encoded gene (n=506). Next, we collapsed these 506 proteins into orthologous groups by clustering at 65% amino acid identity using cd-hit with the following parameters: -d 0 -g 1 -aS 0.95 -c 0.65. These cluster counts were used to generate the histogram depicted in Supplemental Figure 4C. To determine which protein families may also be phage-encoded, we queried the longest representative from each cluster with at least two sequences against the database of nine CAG-217 phages described in the section entitled “Annotation and phylogenetic assignment of metagenomic assemblies”. We used tblastn with default parameters to perform this search, which revealed that some proteins in the CAG-217 genomic islands have homologs in prophage genomes that are out-of-frame with respect to the MetaGeneMark annotations depicted in Supplemental Figure 4A.
Phylogenetic tree of AcrIIA22 homologs
The 30 unique AcrIIA22 homologs we retrieved were used to create the phylogeny depicted in Figure 3A. These sequences were aligned using the sequence alignment tool in the sequence analysis suite Geneious Prime 2020 v1.1. This alignment is provided as Supplemental Dataset 6. From this alignment, the phylogenetic tree in Figure 3A was generated using PhyML with the LG substitution model56 and 100 bootstraps. Coloration and tip annotations were then added in Adobe Illustrator.
Identification of CRISPR-Cas systems and Acrs in CAG-217 assemblies
To determine the type and distribution of CRISPR-Cas systems and Acrs in CAG-217 genomes, we downloaded all assembly data for the 779 SGBs assigned to CAG-217 in Pasolli et al.18 (bin 4303). We then predicted CRISPR-Cas systems for all 779 assemblies in bulk using the command line version of the CRISPR-Cas prediction suite, cctyper57. Specifically, we used version 1.2.1 of cctyper with the following options: --prodigal meta --keeptmp. To identify type II-A Acrs, we first downloaded representative sequences for each of the 21 experimentally confirmed type II-A Acrs from the unified resource for tracking anti-CRISPRs58. We then used tblastn to query these proteins against the 779 CAG-217 genome bins and considered any hit with an e-value better than 0.001 (which included all hits with >30% identity across 50% of the query). To check if these Acrs were present in acrIIA22-encoding phages, we performed an identical tblastn search, but this time using the set of nine acrIIA22-encoding prophages as a database.
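The e-value threshold applied to the tblastn results can be illustrated with a short parser. This sketch assumes the search output was saved in the standard 12-column BLAST tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the output format actually used is not stated above, and the function name is hypothetical.

```python
from typing import List, Tuple


def significant_hits(tabular_text: str,
                     max_evalue: float = 1e-3) -> List[Tuple[str, str, float, float]]:
    """Return (query, subject, percent identity, e-value) for hits below max_evalue."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            raise ValueError(f"expected 12 tab-separated columns, got {len(fields)}")
        evalue = float(fields[10])  # column 11 holds the e-value in this layout
        if evalue < max_evalue:
            hits.append((fields[0], fields[1], float(fields[2]), evalue))
    return hits
```

Hits are kept only when the e-value is strictly below the threshold, matching the "better than 0.001" criterion.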
Recombinant protein overexpression and purification
The AcrIIA22 protein and its mutants were codon optimized for E. coli (Genscript or SynBio Technologies) and the gene constructs were cloned into the pET15HE or pET15b plasmid12 to contain an N-terminal, thrombin-cleavable 6XHistidine (His6) tag. These plasmids differ by only a few bases just upstream of the N-terminal thrombin tag. For purified, twin-strep tagged proteins, constructs were cloned into a modified pET15b that lacks the N-terminal tag but instead has a C-terminal twin-strep tag (Supplemental Table 4). Constructs were transformed and overexpressed in BL21 (DE3) RIL or BL21 (DE3) pLysS E. coli cells. A 10 mL overnight culture (grown in LB + 100 µg/mL ampicillin) was diluted 100-fold into the same media and grown at 37°C with shaking to an OD600 of 0.8 for His6-tagged constructs and 0.3 for twin-strep-tagged constructs. Expression was then induced with 0.5 mM IPTG. For His6-tagged constructs, the culture was shaken for an additional 3 hours at 37°C; twin-strep-tagged constructs were induced at 16°C for 22 hours. Cells were harvested by centrifugation and the pellet stored at -20°C.
Cell pellets for His6-tagged constructs were resuspended in 25 mM Tris, pH 7.5, 300 mM NaCl, 20 mM imidazole; twin-strep tagged constructs were resuspended in 100 mM Tris, pH 8.0, 150 mM NaCl, 1 mM EDTA. Cells were lysed by sonication on ice. The lysate was centrifuged in an SS34 rotor at 18,000 rpm for 25 minutes, followed by filtering through a 5 µm syringe filter for the His6-tagged constructs and a 0.45 µm syringe filter for the twin-strep-tagged constructs.
To purify His6-tagged constructs, the clarified lysate was bound using the batch method to Ni-NTA agarose resin (Qiagen) at 4°C for 1 hour. The resin was transferred to a gravity column (Biorad), washed with >50 column volumes of Lysis Buffer, and eluted with 25 mM Tris, pH 7.5, 300 mM NaCl, 200 mM imidazole. The protein was diluted with 2 column volumes of 25 mM Tris, pH 7.5 and purified on a HiTrapQ column (GE Healthcare) using a 20 mL gradient from 150 mM to 1 M NaCl in 25 mM Tris, pH 7.5. Peak fractions were pooled, concentrated, and buffer exchanged into 200 mM NaCl, 25 mM Tris, pH 7.5 using an Amicon Ultra centrifugal filter with a 3,000 molecular weight cutoff (Millipore, UFC900324), then cleaved in an overnight 4°C incubation with biotinylated thrombin (EMD Millipore). Streptavidin agarose slurry (Novagen) was incubated with cleaved protein at 4°C for 30 minutes to remove thrombin. The sample was then passed through a 0.22 µm centrifugal filter and loaded onto a HiLoad 16/60 Superdex 200 prep grade size exclusion column (Millipore Sigma) equilibrated in 25 mM Tris, pH 7.5, 200 mM NaCl. The peak fractions were pooled, concentrated, and confirmed for purity by SDS-PAGE before use in most assays. Figure 4B depicts size exclusion chromatography data generated for thrombin-cleaved AcrIIA22 variants generated using a Superdex75 16/60 (GE HealthCare) column with 25 mM Tris, pH 7.5, 200mM NaCl. To correlate nicking activity with protein content across fractions (Supplemental Figure 10B), we collected 13 fractions that span the entire elution peak as well as fractions without AcrIIA22 protein. The protein gel shown in Supplemental Figures 10A and 10B was loaded with 5ul of each concentrated fraction.
For two additional proteins, we also performed similar Ni-NTA-based purifications of His6-tagged constructs, with small deviations from the protocol described in the preceding paragraph. Recombinant AcrIIA4 was purified similarly to other His6-tagged Acr proteins but with the following deviations, as previously described12. IPTG was used at 0.2 mM and cells were harvested after 18 hours of induction at 18°C. Thrombin cleavage also occurred at 18°C. This untagged version was used to help generate Supplemental Figure 6. Peak fractions for all proteins were pooled, concentrated, flash frozen as single-use aliquots in liquid nitrogen, and stored at −80°C. SpyCas9 was expressed in E. coli from plasmid pMJ806 (addgene #39312) to contain a TEV-cleavable N-terminal 6XHis-MBP tag and was purified as described previously12 with sequential steps of purification consisting of Ni-NTA affinity chromatography, TEV cleavage, Heparin HiTrap chromatography, and SEC. The protein was stored in a buffer consisting of 200 mM NaCl, 25 mM Tris (pH 7.5), 5% glycerol, and 2 mM DTT. Again, peak fractions were pooled, concentrated, and flash frozen as single-use aliquots.
We also purified AcrIIA22 and AcrIIA4 constructs with a C-terminal twin-strep tag. The proteins were expressed and lysed as described above and purified according to the manufacturer’s guidelines (IBA Life Sciences, Inc.). Clarified lysates were passed over Strep-Tactin-Sepharose resin using a gravity filtration column. The flow through was passed over the resin a second time. The column was washed with a minimum of 20 column volumes of buffer W, followed by elution in buffer E (150 mM NaCl, 100 mM Tris, pH 8.0, 1 mM EDTA, 2.5 mM desthiobiotin). The eluted protein was purified over a HiTrapQ column (GE Healthcare) using a 40 mL gradient from 150 mM to 0.5 M NaCl in 25 mM Tris, pH 7.5. Peak fractions were pooled and then purified again via size exclusion chromatography with a Biorad Enrich SEC650 10×300mm column in 150 mM NaCl, 25 mM Tris, pH 7.5. These elution data are shown for AcrIIA22 and its variants in Figure 6B. Fractions were collected across the elution peak and confirmed for purity via silver stain (Supplemental Figure 10E), per manufacturer’s recommendations (Thermo Fisher Cat. No. 24612). For these proteins, we chose fraction number four to carry forward, as it eluted at approximately four times the monomer’s molecular weight, consistent with our proposed tetramer, which is depicted in Figure 4C. Protein was then concentrated and flash frozen as single-use aliquots for later use.
X-ray crystallography and structural analyses
An AcrIIA22 crystal was grown using 14mg/mL protein via the hanging drop method using 200mM ammonium nitrate, 40% (+/-)-2-methyl-2,4-pentanediol (MPD, Hampton Research), 10mM MgCl2 as a mother liquor. Diffraction data was collected at the Argonne National Laboratory Structural Biology Center synchrotron facility (Beamline 19BM). Data was processed with HKL2000 in space group P4332, then built and refined using COOT59 and PHENIX60. The completed 2.80Å structure was submitted to the Protein Data Bank with PDB Code 7JTA. The detailed PDB validation report is provided (Supplemental Dataset 7). We submitted this finished coordinate file to the PDBe PISA server (Protein Data Bank Europe, Protein Interfaces, Surfaces and Assemblies; http://pdbe.org/pisa/) which uses free energy and interface contacts to calculate likely multimeric assemblies27. The server calculated tetrameric, dimeric and monomeric structures to be thermodynamically stable in solution. The tetrameric assembly matches the molecular weight expected from the size exclusion column elution peak and is the most likely quaternary structure as calculated by the PISA server. The tetramer gains -41.8 kcal/mol free energy by solvation when formed and requires an external driving force of 3.1 kcal/mol to disassemble it according to PISA ΔG calculations.
The single-guide RNA (sgRNA) for use in in vitro experiments was generated as described previously12. We made the dsDNA template via one round of thermal cycling (98°C for 90 s, 55°C for 15 s, 72°C for 60 s) in 50 µl reactions. We used the Phusion PCR polymerase mix (NEB) containing 25 pmol each of the following two oligo sequences; the sequence that binds the protospacer on our pIDTsmart target vector is underlined:
The dsDNA templates were then purified using an Oligo Clean and Concentrator Kit (ZymoResearch) before quantification via the Nanodrop. Single-guide RNA (sgRNA) was transcribed from this double-stranded DNA (dsDNA) template by T7 RNA polymerase using Megashortscript Kit (Thermo Fisher #AM1354). Reactions were then treated with DNAse, extracted via phenol-chloroform addition and then chloroform addition, ethanol precipitated, resuspended in RNase free water, quantified by Nanodrop, analyzed for quality on 15% acrylamide/TBE/UREA gels, and frozen at −20°C.
Pulldown assay using twin-strep-tagged AcrIIA22 and AcrIIA4
The same buffer, consisting of 200 mM NaCl, 25 mM Tris (pH 7.5), was used for pulldowns and to dilute proteins. As a precursor to these assays, 130 pmol SpyCas9 and sgRNA were incubated together at room temperature for 15 minutes where indicated. SpyCas9, with or without pre-complexed sgRNA, was then incubated with 230 pmol AcrIIA4 or 320 pmol AcrIIA22 for 25 minutes at room temperature. Subsequently, 50 µl of a 10% slurry of Strep-Tactin Resin (IBA Lifesciences #2-1201-002), which was pre-equilibrated in binding buffer, was added to the binding reactions, and incubated at 4°C on a nutator for 45 minutes. Thereafter, all incubations and washes were carried out at 4°C or on ice. Four total washes of this resin were performed, which included one tube transfer. Washes proceeded via centrifugation at 2000 rpm for one minute, aspiration of the supernatant with a 25-gauge needle, and resuspension of the beads in 100 µl binding buffer. Strep-tagged proteins were eluted via suspension in 40 µl of 1x BXT buffer (100 mM Tris-Cl, 150 mM NaCl, 1 mM EDTA, 50 mM Biotin, pH 8.0) and incubated for 15 min at room temperature. After centrifugation, 30 µl of supernatant was removed and mixed with 4X reducing sample buffer (Thermo Fisher). Proteins were then separated by SDS PAGE on BOLT 4–12% gels in MES buffer (Invitrogen) and visualized by Coomassie staining.
SpyCas9 linear DNA cleavage assay
All SpyCas9 cleavage reactions using linear DNA were performed in cleavage buffer61 (20 mM Tris-HCl (pH 7.5), 5% glycerol, 100 mM KCl, 5 mM MgCl2, 1 mM DTT). In preparation for these reactions, all proteins were diluted in 30 mM NaCl / 25 mM Tris, pH 7.4 / 2.7 mM KCl, whereas all DNA and sgRNA reagents were diluted in nuclease-free water. Where indicated, SpyCas9 (0.36 µM) was incubated with sgRNA (0.36 µM) for 10 minutes at room temperature. Before use, sgRNA was melted at 95°C for five minutes and then slowly cooled at 0.1 °C/s to promote proper folding. SpyCas9 (either pre-complexed with sgRNA or not, as indicated in Supplemental Figure 7) was then incubated for 10 minutes at room temperature with AcrIIA4 (2.9 µM) or AcrIIA22 at each of the following concentrations: 23.2, 11.6, 5.8, and 2.9 µM. As substrate, the plasmid pIDTsmart was linearized by restriction digest and used at a final concentration of 3.6 nM. The reaction was initiated by the addition of this DNA substrate either in isolation or in combination with sgRNA (0.36 µM), as indicated in Supplemental Figure 7. Reactions were immediately moved to a 37°C incubator and stopped after fifteen minutes by adding 0.2% SDS/100 mM EDTA and incubating at 75°C for five minutes. Samples were then run on a 1.5% TAE agarose gel at 120 V for 40 minutes. Densitometry was used to calculate the proportion of DNA cleaved by SpyCas9; band intensities were quantified using the BioRad ImageLab software v5.0.
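The densitometry step above can be illustrated with a minimal sketch. It assumes that background-subtracted band intensities have already been exported from the imaging software, and that the cleaved proportion is taken as the summed intensity of the product bands divided by the total intensity in the lane. The text does not state the exact formula, so this is one plausible reading; the function name and the intensity values are hypothetical.

```python
def fraction_cleaved(uncleaved, products):
    """Return the proportion of DNA cleaved in one gel lane.

    uncleaved: background-subtracted intensity of the full-length band
    products:  intensities of the cleavage product bands in the same lane
    (Assumed formula: product signal divided by total lane signal.)
    """
    cleaved = sum(products)
    total = uncleaved + cleaved
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return cleaved / total


# Hypothetical intensities for a single lane (arbitrary units).
example = fraction_cleaved(1200.0, [500.0, 300.0])
print(round(example, 3))  # 0.4
```

Because stain signal scales with DNA mass, summing the two product bands recovers the mass of substrate that was cut, so the ratio approximates the fraction of substrate molecules cleaved.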
In vivo assay to assess impact of AcrIIA22 on plasmid topology
In all experiments, cultures were first grown overnight at 37°C with shaking at 220 rpm in LB with 0.5 mM IPTG and, if included, spectinomycin at 50 µg/mL and kanamycin at 50 µg/mL. For each sample with a SpyCas9-expressing plasmid (e.g. Figure 7A), overnight cultures were grown with spectinomycin and kanamycin and diluted 1:50 into LB with 0.5 mM IPTG, spectinomycin (at 50 µg/mL), and, where indicated, doxycycline (at 100 ng/mL, to induce acrs). Cultures were grown at 37°C with shaking at 220 rpm. If required, 0.2% (L)-arabinose was added after two hours of growth to induce spyCas9 expression. The next morning, cultures were centrifuged at 4100 × g and plasmids purified using a miniprep kit (Qiagen). We measured the concentration of dsDNA in each miniprep using the Qubit-4 fluorometer and the associated dsDNA high sensitivity assay kit (Invitrogen). For each sample with a SpyCas9-expressing plasmid, 150 ng of DNA was digested with the restriction enzyme HincII (NEB) per the manufacturer’s recommendations, except that digests were incubated overnight before being stopped by heating at 65°C for 20 minutes. HincII cuts once, only in the SpyCas9 plasmid, and thereby linearizes it. This let us visualize the SpyCas9 plasmid as a single band, which made bands from undigested acrIIA22-encoding plasmids easier to identify. It also served as an internal control for plasmid DNA that is unaffected by SpyCas9 targeting or AcrIIA22 expression (Supplemental Figure 2). Following restriction digest, 30 ng of sample was analyzed via gel electrophoresis using a 0.7% TAE-agarose gel run at 120 V for 30 minutes.
In samples that lacked a SpyCas9-expressing plasmid (e.g. Figure 5A), overnight cultures were grown with kanamycin and diluted into LB. Where required, 0.5 mM IPTG and doxycycline at 100 ng/mL were added to induce the gene of interest. The next morning, cultures were centrifuged at 4100 × g and plasmids purified using a miniprep kit (Qiagen). The concentration of dsDNA in each miniprep was measured using the Qubit-4 fluorometer and the associated dsDNA high sensitivity assay kit (Invitrogen). Then, 30 ng of purified plasmid was directly analyzed by gel electrophoresis using a 0.7% TAE-agarose gel run at 120 V for 30 minutes.
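The fixed DNA masses used in this assay (150 ng per digest, 30 ng per gel lane) imply a simple volume calculation from each fluorometer reading. The sketch below is illustrative only: the miniprep concentration and the digest volume are hypothetical values that do not come from the text, and the helper name is an assumption.

```python
def volume_for_mass(target_ng, conc_ng_per_ul):
    """Volume (in µl) of a DNA solution that contains target_ng of DNA."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ng / conc_ng_per_ul


# Hypothetical inputs, for illustration only.
qubit_ng_per_ul = 25.0    # assumed miniprep concentration from the fluorometer
digest_volume_ul = 20.0   # assumed total volume of the restriction digest

# Miniprep volume that delivers 150 ng into the digest.
miniprep_ul = volume_for_mass(150, qubit_ng_per_ul)

# After digestion the 150 ng is spread over the whole digest volume,
# so the volume needed to load 30 ng is computed from that diluted value.
digest_conc = 150 / digest_volume_ul
load_ul = volume_for_mass(30, digest_conc)

print(miniprep_ul, load_ul)  # 6.0 4.0
```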
In vitro AcrIIA22 plasmid nicking assay
Except for the divalent cation experiment, all reactions were performed using NEB buffer 3.1 (100 mM NaCl, 50 mM Tris-HCl, pH 7.9, 10 mM MgCl2, 100 µg/mL BSA). To determine cation preference, the same reaction buffer was re-created, but MgCl2 was omitted. All proteins were diluted in 130 mM NaCl, 25 mM Tris, pH 7.4, 2.7 mM KCl. DNA was diluted in nuclease-free water. In the cation preference experiment, 60 µM His6-AcrIIA22 and 6 nM of purified pIDTsmart plasmid DNA were used. All other reactions were set up with AcrIIA22 constructs and concentrations indicated in figure panels and captions. In the cation preference experiment (Supplemental Figure 11A), reactions were started by adding 10 mM of the indicated cation. All other reactions were initiated via the addition of 2 nM pIDTsmart plasmid DNA. In these cases, reactions were immediately transferred to a 37°C incubator. At 0.5, 1, 2, 4, 6, or 20-hour timepoints, a subset of the reaction was removed and run on a 1.5% TAE agarose gel at 120 V for 30 minutes. For the fractionation experiment depicted in Supplemental Figure 10B, 5 µl of each concentrated fraction was used in a 15 µl reaction volume and the reaction was incubated for 24 hours at 37°C. For the cation preference experiment, only the 2-hour timepoint was considered and the reaction was stopped via the addition of NEB loading buffer and 100 mM EDTA. In this case, DNA was visualized on a 1% TBE gel run for 60 minutes at 110 V. Densitometry was used to calculate the proportion of DNA in each topological form via band intensities quantified using the BioRad ImageLab software v5.0.
SpyCas9 cleavage kinetics assay
All cleavage reactions were performed in the cleavage buffer61 containing 20 mM Tris-HCl (pH 7.5), 5% glycerol, 100 mM KCl, 5 mM MgCl2, 1 mM DTT. In preparation for these reactions, all proteins were diluted in 30 mM NaCl / 25 mM Tris, pH 7.4 / 2.7 mM KCl, whereas all DNA and sgRNA reagents were diluted in nuclease-free water.
Purified pIDTsmart plasmid was pre-treated with either AcrIIA22, the nickase Nb.BssSI (NEB), or no enzyme. For the AcrIIA22 pre-treatment, 3.1 µg of plasmid was incubated with 230 µM AcrIIA22 and the plasmid nicked as described previously. Plasmid nicking with Nb.BssSI was carried out via manufacturer’s recommendations (NEB). Both reactions were incubated at 37 °C for 2 hours. To isolate the nicked plasmid, samples were then run on a 1.5% agarose gel for 2 hours and the open-circle form of the plasmid was excised and purified using the Zymo Research Gel DNA Recovery Kit. Untreated plasmid was also purified via gel extraction. Plasmid yield was quantified using a Nanodrop.
To determine SpyCas9’s substrate preference, we incubated each pre-treated plasmid substrate with SpyCas9 and assayed for the appearance of a linearized plasmid as indication of SpyCas9 digestion. In all cases, SpyCas9 was used at a final concentration of 0.32 µM. All reaction components except dsDNA were added on ice, following which SpyCas9 was complexed with equimolar levels of its sgRNA for ten minutes at room temperature. Before addition to the reaction, sgRNA was melted at 95°C for five minutes and then slowly cooled at 0.1 °C/s to promote proper folding. To begin the reaction, DNA substrate was added to the reaction mix at a final concentration of 2 nM and the samples moved immediately to 37 °C. At each timepoint, a subset of the reaction was removed, and digestion stopped with 0.2% SDS/100 mM EDTA and by incubating at 75°C for 5 minutes. Samples were run on a 1.5% TAE gel at 120V for 40 minutes and densitometry was used to calculate the proportion of DNA in each topological form via band intensities quantified with the BioRad ImageLab software v5.0.
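Setting up a reaction at a fixed molar substrate concentration requires converting a spectrophotometer mass reading (ng/µl) to molarity. The sketch below shows one way to do this, using a widely used average of 617.96 g/mol per base pair of dsDNA plus a small 36.04 g/mol correction. The plasmid length, stock concentration, and reaction volume are hypothetical placeholders that do not come from the text, and both function names are assumptions.

```python
BP_MASS_G_PER_MOL = 617.96        # average mass of one dsDNA base pair
END_CORRECTION_G_PER_MOL = 36.04  # small correction; negligible for plasmids


def dsdna_nanomolar(conc_ng_per_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/µl) to molarity (nM)."""
    mol_weight = length_bp * BP_MASS_G_PER_MOL + END_CORRECTION_G_PER_MOL
    # ng/µl equals 1e-3 g/L; dividing by g/mol gives mol/L; times 1e9 gives nM.
    return conc_ng_per_ul * 1e6 / mol_weight


def stock_volume_ul(stock_nm, final_nm, reaction_ul):
    """Volume of stock (µl) giving final_nm in a reaction of reaction_ul."""
    if stock_nm < final_nm:
        raise ValueError("stock is too dilute for the requested final concentration")
    return final_nm * reaction_ul / stock_nm


# Hypothetical values: a 2,000 bp plasmid at 50 ng/µl, in a 20 µl reaction.
stock_nm = dsdna_nanomolar(50.0, 2000)
volume_ul = stock_volume_ul(stock_nm, 2.0, 20.0)
print(f"stock = {stock_nm:.1f} nM; add {volume_ul:.2f} µl for 2 nM final")
```

The same conversion applies to the 3.6 nM linear substrate and the 6 nM plasmid used in the other assays, with the appropriate construct length substituted.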
Article Title: The novel anti-CRISPR AcrIIA22 relieves DNA torsion in target plasmids and impairs SpyCas9 activity
To overcome CRISPR-Cas defense systems, many phages and mobile genetic elements encode CRISPR-Cas inhibitors called anti-CRISPRs (Acrs). Nearly all characterized Acrs directly bind Cas proteins to inactivate CRISPR immunity. Here, using functional metagenomic selection, we describe AcrIIA22, an unconventional Acr found in hypervariable genomic regions of clostridial bacteria and their prophages from human gut microbiomes. AcrIIA22 does not bind strongly to SpyCas9 but nonetheless potently inhibits its activity against plasmids. To gain insight into its mechanism, we obtained an X-ray crystal structure of AcrIIA22, which revealed homology to PC4-like nucleic-acid binding proteins. Based on mutational analyses and functional assays, we deduced that acrIIA22 encodes a DNA nickase that relieves torsional stress in supercoiled plasmids. This may render them less susceptible to SpyCas9, which uses free energy from negative supercoils to form stable R-loops. Modifying DNA topology may provide an additional route to CRISPR-Cas resistance in phages and mobile genetic elements.