Plasmid preparation, protein expression and purification
CasΦ3 cDNA was synthesized and cloned with a C-terminal hexahistidine (His)-tag into the pET-21 vector (Genewiz) (Extended Data Table II). CasΦ3 mutants were generated with the In-Fusion cloning kit (Takara) using the primers specified in Extended Data Table II. To generate CasΦ3-ΔCT, a TEV cleavage site (ENLYFQG) was inserted after residue M726. His-tagged CasΦ3 was expressed from pET-21 in E. coli BL21 pRARE cells. E. coli cultures were grown at 37 °C in liquid Terrific Broth (TB) medium with 34 mg/l chloramphenicol and 100 mg/l ampicillin to an optical density at 600 nm of ~0.8. Protein overexpression was induced with 150 nM IPTG for 16 h at 16 °C. Cells were harvested by centrifugation and resuspended in lysis buffer (50 mM HEPES pH 7.5, 2 M NaCl, 5 mM MgCl2, 1 tablet of Complete Inhibitor cocktail EDTA Free (Roche) per 50 ml, 50 U/ml Benzonase, 1 mg/ml lysozyme). Lysis was completed by one freeze-thaw cycle and sonication. The cell extract was diluted to a final salt concentration of 500 mM and centrifuged at high speed (10,000 × g, 45 min) to separate the soluble fraction from the insoluble fraction and the cell debris. The soluble fraction was loaded onto a 5 ml HisTrap FF Crude column (Cytiva) equilibrated in buffer IMAC-A (20 mM HEPES pH 7.5, 500 mM NaCl, 20 mM imidazole), and bound proteins were eluted by a stepwise increase of the imidazole concentration with buffer IMAC-B (20 mM HEPES pH 7.5, 200 mM KCl, 500 mM imidazole). CasΦ3 proteins eluted at ~150 mM imidazole. In the case of CasΦ3-ΔCT, the C-terminal segment (residues 727-766) was cleaved by incubating the protein with 0.3 mg TEV protease in TEV buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 1 mM EDTA, 0.5 mM TCEP) for 16 h at 4 °C. Fractions containing CasΦ3 were pooled, concentrated and further purified by size exclusion chromatography (SEC) using a HiLoad 16/600 Superdex 200 column (Cytiva) equilibrated in SEC buffer (20 mM HEPES pH 7.5, 500 mM KCl, 0.5 mM TCEP).
Fractions containing pure protein were pooled, concentrated to 5-10 g/L, flash-frozen in liquid nitrogen and stored at −80 °C.
DNA oligonucleotides labeled with fluorescein (FAM) at the 5′ or 3′ end, as well as unlabeled DNA and RNA oligonucleotides, were purchased from Integrated DNA Technologies (IDT) (Extended Data Table II). dsDNA substrates were prepared by mixing ssDNA oligos to a final concentration of 80 μM in annealing buffer (20 mM HEPES pH 7.5, 200 mM KCl), denaturing at 95 °C for 10 min and gradually decreasing the temperature to 4 °C over 20 min in a thermal cycler (Applied Biosystems). Ribonucleoprotein complexes (RNPs) of CasΦ3 were formed by mixing equal volumes of 50 μM CasΦ3 and 50 μM mature CasΦ3 crRNA (IDT).
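As a quick check on the concentrations above, mixing equal volumes of two stocks halves each of them, so the RNP mixture contains 25 μM CasΦ3 and 25 μM crRNA (a 1:1 molar ratio). The Python sketch below illustrates this arithmetic only; the function name and the 10 μL volumes are hypothetical and are not part of the protocol.

```python
def mix_concentrations(stocks):
    """Return the final concentration of each component after pooling stocks.

    stocks maps a component name to (stock concentration in uM, volume in uL).
    Each component is assumed to be present in only one stock.
    """
    total_volume = sum(volume for _, volume in stocks.values())
    return {
        name: conc * volume / total_volume
        for name, (conc, volume) in stocks.items()
    }


# Equal volumes (10 uL each is an arbitrary, illustrative choice) of 50 uM stocks.
rnp = mix_concentrations({"CasPhi3": (50.0, 10.0), "crRNA": (50.0, 10.0)})
for name, conc in rnp.items():
    print(f"{name}: {conc:.1f} uM")
```

The same helper applies to any other pooling step, provided the volumes are given in a common unit.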
For specific dsDNA cleavage assays, FAM-labeled dsDNA substrates were incubated at 400 nM with 2 μM of CasΦ3 RNP in cleavage buffer (20 mM HEPES pH 7.5, 160 mM KCl, 10% glycerol, 5 mM MgCl2) for 2 h at 37 °C, or as otherwise stated in the figure legends. For ion dependency assays (Extended Data Fig. 3e), 5 mM MgCl2 was replaced with 5 mM ethylenediaminetetraacetic acid (EDTA), CaCl2, MnCl2, FeSO4, CoCl2, NiSO4, CuCl2 or ZnSO4. For DNA saturation experiments (Extended Data Fig. 3f), 1 μM of CasΦ3 RNP was incubated with 0.5-8 μM of labeled dsDNA for 2 h at 37 °C. For non-specific trans ssDNA cleavage assays (Extended Data Fig. 2b-c, 8b-c), 0.4 μM FAM-labeled non-specific ssDNA substrate (i.e., not complementary to the crRNA) was incubated with 2 μM CasΦ3 RNP as described above, along with 0.1 μM of unlabeled activator ssDNA or dsDNA (complementary to the crRNA), in cleavage buffer for 1 h at 37 °C. The reactions were stopped by adding an equal volume of stop buffer (8 M urea, 100 mM EDTA, pH 8) followed by incubation at 95 °C for 5 min. Cleavage products were resolved on 15% Novex TBE-Urea Gels (Invitrogen), run according to the manufacturer's instructions. Gels were imaged using an Odyssey FC Imaging System (Li-Cor). Densitometric analysis of the gel bands was performed using ImageJ. For the specific dsDNA cleavage assays, the cleavage efficiency was calculated as the intensity of the product bands divided by the total intensity. For the non-specific ssDNA degradation assays, it was calculated as the depletion of the signal of the uncleaved substrate.
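The two efficiency definitions above can be written out explicitly. The following Python sketch is illustrative only: the function names and band intensities are hypothetical, and expressing depletion relative to a no-enzyme control lane is an assumption, since the text does not state the reference used for the depletion measurement.

```python
def cleavage_efficiency(product_intensities, substrate_intensity):
    """Specific dsDNA cleavage: product signal divided by total lane signal."""
    products = sum(product_intensities)
    total = products + substrate_intensity
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return products / total


def substrate_depletion(substrate_intensity, control_intensity):
    """Non-specific ssDNA degradation: fractional loss of the uncleaved band.

    Assumption: depletion is measured against a control lane that holds
    the same amount of substrate without active enzyme.
    """
    if control_intensity <= 0:
        raise ValueError("control intensity must be positive")
    return 1.0 - substrate_intensity / control_intensity


# Hypothetical densitometry values in arbitrary units (e.g. exported from ImageJ).
cis = cleavage_efficiency([300.0, 100.0], 600.0)
trans = substrate_depletion(250.0, 1000.0)
print(f"specific cleavage: {cis:.0%}, ssDNA depletion: {trans:.0%}")
```

Both functions return a fraction between 0 and 1 when the intensities have been background-subtracted consistently.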
Sample preparation for Cryo-EM
For the preparation of the cryo-EM sample, Ni2+ was used as the catalytic ion instead of Mg2+ because of the higher yield obtained with this metal. CasΦ3 RNP was prepared as described above. 25 nmol of RNP and 37 nmol of unlabeled dsDNA substrate were incubated in 25 ml of MonoQ A buffer (20 mM HEPES pH 7.5, 200 mM KCl, 1 mM NiSO4, 0.5 mM TCEP) for 2 h at 20 °C to allow DNA cleavage. The reaction product was loaded onto a MonoQ column equilibrated with MonoQ A buffer, and the CasΦ3 R-loop complex was separated from the RNP and the unbound DNA substrate by salt gradient elution using MonoQ B buffer (20 mM HEPES pH 7.5, 2 M KCl, 1 mM NiSO4, 0.5 mM TCEP). The CasΦ3 R-loop complex eluted at 16-20% of MonoQ B buffer (~500 mM KCl). The R-loop complex was further purified from unbound DNA by SEC using a Superdex 200 Increase 10/300 GL column (Cytiva) equilibrated with MonoQ A buffer. The molecular weight of the complex and the sample homogeneity were estimated using a Refeyn One mass photometer (Refeyn), using 10-20 nM of protein diluted in MonoQ A buffer (Extended Data Fig. 3a). 2.5 μL of freshly purified CasΦ3 R-loop complex (absorbance at 260 nm of ~1.6) was applied to UltrAuFoil 300 mesh R0.6/1.0 holey grids (Quantifoil), glow-discharged for 60 s at 10 mA (Leica EM ACE200), and plunge-frozen in liquid ethane (pre-cooled with liquid nitrogen) using a Vitrobot Mark IV (FEI, Thermo Fisher Scientific) under the following conditions: blotting time 3 s, 100% humidity and 4 °C.
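The quantities in this section can be cross-checked with simple arithmetic. The Python sketch below is illustrative (the function names are hypothetical) and assumes a linear mix of MonoQ A buffer (200 mM KCl) and MonoQ B buffer (2 M KCl). It shows that 16-20% of buffer B corresponds to roughly 490-560 mM KCl, in line with the ~500 mM quoted above, and that the reaction contained about a 1.5-fold molar excess of DNA over RNP.

```python
def gradient_salt_mM(percent_b, salt_a_mM=200.0, salt_b_mM=2000.0):
    """KCl concentration at a given %B, assuming linear mixing of buffers A and B."""
    fraction_b = percent_b / 100.0
    return salt_a_mM + fraction_b * (salt_b_mM - salt_a_mM)


def concentration_uM(amount_nmol, volume_ml):
    """Molar concentration in uM (nmol per ml is numerically equal to uM)."""
    return amount_nmol / volume_ml


elution_low = gradient_salt_mM(16)
elution_high = gradient_salt_mM(20)
rnp_uM = concentration_uM(25, 25)   # 25 nmol RNP in 25 ml
dna_uM = concentration_uM(37, 25)   # 37 nmol dsDNA in 25 ml

print(f"elution window: {elution_low:.0f}-{elution_high:.0f} mM KCl")
print(f"RNP {rnp_uM:.2f} uM, DNA {dna_uM:.2f} uM, DNA:RNP = {dna_uM / rnp_uM:.2f}")
```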
CryoEM Data Collection and Processing
Movies were collected on a Titan Krios G3 Cryo-TEM equipped with a TFS Falcon III camera operated at 300 keV in counting mode. The exposure was 1.05 e/Å2 per frame over 40 frames, giving a total dose of 42 e/Å2. The calibrated pixel size was 0.832 Å/px. All movies were pre-processed using WARP 1.0.9 31 (Extended Data Fig. 3). Motion correction was performed with a temporal resolution of 20 for the global motion and 5 × 5 spatial resolution for the local motion. We considered motion in the 45–3 Å range weighted with a B-factor of −500 Å2. Only micrographs displaying less than 5 Å intraframe motion were used. CTF estimation was performed using 5 × 5 patches in the 35-4 Å range. We selected micrographs with a fitted defocus between 0.0 and 5.0 μm and a resolution better than 5 Å. For particle picking, the micrographs were masked, and particles were picked using a re-trained BoxNet deep convolutional neural network. This resulted in 3,504,102 particles from 4,393 micrographs. Particles were extracted with a box size of 256 × 256 pixels at a pixel size of 0.832 Å/px, then inverted and normalized before being imported into RELION 3.1 32 for 2D classification. The selected 2D classes were imported into cryoSPARC 3.1.0 33, where they were 3D classified into four initial classes. The volume with the largest number of particles was 3D autorefined to an initial 2.61 Å resolution map. The conformational heterogeneity of the particles used in this volume was inspected through a 3D variability analysis job, and the two most divergent volumes were used as input for heterogeneous refinement. The 3D variability of the particles in the best volume was analysed further, followed by heterogeneous refinement with four classes. The resulting four volumes were subjected to non-uniform refinement to obtain maps at 2.7−3.3 Å resolution. The two best maps (2.7 and 2.9 Å resolution) represent the different conformational states of the complex that are discussed in the text.
Sharpened and local resolution maps were calculated with PHENIX 34, and directional resolution anisotropy analysis was performed with the 3D-FSC server 35.
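Several of the acquisition parameters reported in this section follow from one another. The Python sketch below is a consistency check only and is not part of the processing pipeline; the variable names are hypothetical. It recomputes the total dose, the physical box size, the Nyquist limit and the average number of picked particles per micrograph from the values stated above.

```python
# Values taken from the data collection description above.
dose_per_frame = 1.05        # e/A^2 per frame
n_frames = 40
pixel_size = 0.832           # A per pixel (calibrated)
box_px = 256                 # extraction box edge in pixels
n_particles = 3_504_102
n_micrographs = 4_393

total_dose = dose_per_frame * n_frames          # e/A^2
box_angstrom = box_px * pixel_size              # physical box edge in A
nyquist = 2 * pixel_size                        # best resolvable spacing in A
particles_per_micrograph = n_particles / n_micrographs

print(f"total dose: {total_dose:.1f} e/A^2")
print(f"box edge: {box_angstrom:.1f} A, Nyquist limit: {nyquist:.3f} A")
print(f"picked particles per micrograph: {particles_per_micrograph:.0f}")
```

The reported map resolutions (2.61-3.3 Å) lie well above the Nyquist limit of about 1.66 Å, so the pixel size does not limit the reconstruction.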
Atomic model building and refinement
An initial model containing the complete DNA and RNA sequence and ~50% of the protein sequence was built ab initio using map-to-model as implemented in PHENIX 34. COOT 36 was used to connect, extend and correct the protein fragments to generate a model covering ~70% of the protein sequence. The rest of the model was autobuilt using Buccaneer as implemented in CCP-EM 37, and subsequently corrected in COOT. The final model was obtained after several rounds of refinement using phenix.real_space_refine and manual inspection and correction in COOT. The final model covers 92% of the protein sequence, mainly lacking a C-terminal segment predicted to be unstructured. Map and molecular model images were created using ChimeraX 38.
CRISPR-CasΦ is a novel family of miniaturized RNA-guided endonucleases from phages 1,2. These novel ribonucleoproteins (RNPs) provide a compact scaffold gathering all key activities of a genome editing tool 2. Here, we provide the first structural insight into the singular DNA targeting and cleavage mechanism of CasΦ by determining the cryoEM structure of CasΦ3 with the triple-strand R-loop generated after DNA cleavage. The structure reveals the unique machinery for unwinding the target to form the crRNA-DNA hybrid and for cleaving the target DNA. The protospacer adjacent motif (PAM) is recognised by the target strand (T-strand) and non-target strand (NT-strand) PAM interacting domains (TPID and NPID). Unwinding occurs after insertion of the conserved α1 helix, which disrupts the dsDNA and thus facilitates crRNA-DNA hybrid formation. The NT-strand is funnelled towards the RuvC catalytic site, while a long helix of TPID separates the displaced NT-strand from the crRNA-DNA hybrid, preventing DNA re-annealing. The crRNA-DNA hybrid is directed to the stop (STP) domain, which splits the hybrid and guides the T-strand towards the RuvC active site. The conserved RuvC insertion of the CasΦ family extends along the hybrid, interacting with the phosphate backbone of the crRNA. A cluster of hydrophobic residues anchors the RuvC insertion in a cavity of the STP domain. The assembly of the hybrid promotes the shortening of the RuvC insertion, thus pulling the STP domain towards the RuvC active site to activate catalysis. These findings illustrate why CasΦ unleashes non-specific cleavage activity, degrading ssDNA molecules after activation. Site-directed mutagenesis of key residues supports the proposed mechanisms of target DNA and non-specific ssDNA cleavage by CasΦ3. Our analysis provides new avenues to redesign the compact CRISPR-CasΦ nucleases for genome editing.