Materials and Methods

Structural basis of target DNA recognition by CRISPR-Cas12k for RNA-guided DNA transposition

Protein expression and purification

Gene fragments of TnsB, TnsC, and TniQ were ordered from Integrated DNA Technologies (IDT) and cloned into the bacterial expression plasmid pET28-MKH8SUMO (Addgene: #79526). The gene fragment for Cas12k was cloned into the bacterial expression plasmid pET-His6-StrepII-TEV LIC (Addgene: #29718). Cas12k, TnsB, and TniQ were expressed in Escherichia coli BL21(DE3), while TnsC was expressed in Rosetta™(DE3)pLysS (Novagen: #70956) containing a pLysS-tRNA plasmid. Cells were grown to OD600 = 0.6 in Terrific Broth (TB), and protein expression was induced by adding 0.3 mM IPTG followed by overnight incubation at 16°C. The cells were collected and resuspended in lysis buffer (50 mM Tris-HCl, pH 7.6, 500 mM NaCl, 5% glycerol) supplemented with 1 mM PMSF and 5 mM β-mercaptoethanol, and then disrupted by sonication. The cell lysate was clarified by centrifugation, and the supernatant was loaded onto Ni-NTA resin. After extensive washing with lysis buffer supplemented with 30 mM imidazole, target proteins were eluted with lysis buffer supplemented with 250 mM imidazole. The His-SUMO tag of TnsB, TnsC, and TniQ and the His-StrepII tag of Cas12k were removed by overnight digestion with TEV protease at 4°C. The protein was diluted with buffer containing 50 mM Tris-HCl, pH 7.6, 200 mM NaCl, and 5% glycerol, loaded onto a Heparin column (GE Healthcare), and eluted with a linear NaCl gradient (0.1 to 1 M). After concentration, the proteins were further purified by size exclusion chromatography (SEC) over a Superdex 200 Increase 10/300 GL column (Cytiva) in buffer containing 25 mM Tris-HCl, pH 7.6, 500 mM NaCl, 10% glycerol, and 1 mM DTT (0.5 mM EDTA was added to the buffer for TnsC). Fractions were concentrated and stored at −80°C.

To assemble the Cas12k–sgRNA binary complex, Cas12k protein was incubated with sgRNA (Table S1) at a ratio of 1:1.15 at 37°C for 30 min in buffer A (25 mM Tris-HCl, pH 7.6, 150 mM NaCl, 2 mM DTT, and 1 mM MgCl2). To reconstitute the Cas12k–sgRNA–target DNA ternary complex, Cas12k protein was incubated with sgRNA at 37°C for 30 min, followed by the addition of target DNA synthesized by IDT (Table S1), at a Cas12k:sgRNA:target DNA ratio of 1:1.15:1.3. After 30 min, the mixture was purified by SEC over a Superdex 200 column (Cytiva) equilibrated with buffer A.
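As a worked illustration of the assembly ratios above, the following Python sketch converts a Cas12k:sgRNA:target DNA ratio, read here as 1:1.15:1.3, into pipetting volumes. It assumes the ratios are molar. The Cas12k amount and all stock concentrations are hypothetical placeholders, not values from this protocol.

```python
def mixing_volumes(cas12k_pmol, ratios, stock_uM):
    """Return the volume (uL) of each component needed for a molar ratio.

    cas12k_pmol: amount of Cas12k in the mix, in pmol (hypothetical input)
    ratios:      molar amount of each component relative to Cas12k
    stock_uM:    stock concentration of each component in uM (hypothetical)
    """
    volumes = {}
    for name, ratio in ratios.items():
        pmol_needed = cas12k_pmol * ratio
        # 1 uM equals 1 pmol/uL, so pmol / uM gives uL directly.
        volumes[name] = pmol_needed / stock_uM[name]
    return volumes


# Ratio taken from the text; amounts and stocks are illustrative assumptions.
ratios = {"Cas12k": 1.0, "sgRNA": 1.15, "target_DNA": 1.3}
stocks = {"Cas12k": 50.0, "sgRNA": 100.0, "target_DNA": 100.0}

volumes = mixing_volumes(1000.0, ratios, stocks)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} uL")
```

For the binary complex, the same function can be called with only the Cas12k and sgRNA entries.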

sgRNA preparation

sgRNAs were produced by in vitro transcription using the HiScribe T7 High Yield RNA Synthesis Kit (NEB) with PCR-amplified gBlocks (IDT) as templates. sgRNAs were purified over a Resource-Q column (Cytiva) and eluted with a linear NaCl gradient (50 mM–1000 mM) in 25 mM Tris-HCl, pH 8.0. The eluted sgRNAs were concentrated and stored at −80°C.


Mutagenesis

Single amino acid mutations were introduced by the QuikChange site-directed mutagenesis method. Mutations affecting multiple amino acids were introduced by ligating the inverse PCR-amplified backbone with mutation-bearing DNA oligonucleotides using the In-Fusion Cloning Kit (Clontech). All mutants were confirmed by Sanger sequencing.

In vitro transposition assay

Donor plasmid (pDonor) and target plasmid (pTarget) were gifts from Feng Zhang (Addgene #127924 and #127926, respectively). In vitro transposition reactions were conducted as previously described unless otherwise stated. All proteins were diluted to 2 μM with 25 mM Tris-HCl, pH 8.0, 500 mM NaCl, 1 mM EDTA, 1 mM DTT, and 25% glycerol. Each protein (50 nM), 600 nM sgRNA, 20 ng pTarget, and 100 ng pDonor were added sequentially to reaction buffer containing 26 mM HEPES, pH 7.5, 4.2 mM Tris-HCl, pH 8.0, 2.1 mM DTT, 0.05 mM EDTA, 0.2 mM MgCl2, 28 mM NaCl, 21 mM KCl, 1.35% glycerol, 50 μg/mL BSA, and 2 mM ATP (final pH 7.5) to a total volume of 20 μL. Reactions were incubated at 30°C for 40 min, then supplemented with 25 mM Mg(OAc)2 and incubated at 37°C for another 2 hours. A 1 μL aliquot of the final product was taken for direct PCR readout. The remaining sample was digested with 1 μL of Proteinase K (Thermo Fisher Scientific) at 37°C for 15 min before transformation into Stellar competent cells. Colonies were grown on kanamycin and chloramphenicol plates. Single colonies were randomly picked for plasmid preparation. After extraction, plasmids were analyzed by PCR, restriction enzyme (BamHI) digestion, and Sanger sequencing.
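The dilution arithmetic behind the reaction setup can be sketched with the usual C1·V1 = C2·V2 relation. The example below uses the 2 μM protein stocks, 50 nM final protein concentration, 600 nM sgRNA, and 20 μL reaction volume given in the text. The list of four proteins is an assumption made for illustration, since the text says only "each" protein.

```python
def stock_volume_uL(final_conc, stock_conc, total_volume_uL):
    """Solve C1*V1 = C2*V2 for V1; both concentrations in the same unit."""
    return final_conc * total_volume_uL / stock_conc


REACTION_UL = 20.0        # total reaction volume from the text
STOCK_NM = 2000.0         # 2 uM protein stocks, expressed in nM
FINAL_PROTEIN_NM = 50.0   # final concentration of each protein
FINAL_SGRNA_NM = 600.0    # final sgRNA concentration

# Assumed set of proteins in the reaction (not listed explicitly in the text).
proteins = ["Cas12k", "TnsB", "TnsC", "TniQ"]

protein_uL = stock_volume_uL(FINAL_PROTEIN_NM, STOCK_NM, REACTION_UL)
total_protein_uL = protein_uL * len(proteins)

# nM * uL gives fmol; divide by 1000 to get pmol of sgRNA per reaction.
sgrna_pmol = FINAL_SGRNA_NM * REACTION_UL / 1000.0

print(f"Volume of each 2 uM protein stock: {protein_uL} uL")
print(f"Total protein volume: {total_protein_uL} uL")
print(f"sgRNA per reaction: {sgrna_pmol} pmol")
```

The sgRNA stock concentration is not stated, so the sketch reports the required amount in pmol rather than a volume.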

Polymerase Chain Reaction (PCR)

Forward primer pTargetF, reverse primer pDonor_R (Table S1), and in vitro transposition reaction product were mixed to a final volume of 25 μL for PCR reactions. Cycling conditions were as follows: 1 cycle, 94°C, 3 min; 35 cycles, 98°C, 10 s, 66.9°C, 15 s, 72°C, 8 s; 1 cycle, 72°C, 10 min. Plasmids extracted from single colonies were analyzed by PCR under cycling conditions as follows: 1 cycle, 98°C, 3 min; 35 cycles, 98°C, 10 s, 69.9°C, 15 s, 72°C, 12 s; 1 cycle, 72°C, 10 min.
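For planning purposes, the two cycling programs above can be written as data and their total hold times summed. The sketch below is only an illustration: it counts programmed hold times and ignores ramp times between temperatures, which depend on the thermocycler, and the function name is hypothetical.

```python
def program_seconds(initial_s, cycles, step_seconds, final_s):
    """Total programmed hold time of a three-phase PCR program, in seconds."""
    return initial_s + cycles * sum(step_seconds) + final_s


# Direct readout from the transposition product:
# 3 min initial hold; 35 x (10 s, 15 s, 8 s); 10 min final extension.
direct_pcr_s = program_seconds(3 * 60, 35, [10, 15, 8], 10 * 60)

# Colony plasmid analysis:
# 3 min initial hold; 35 x (10 s, 15 s, 12 s); 10 min final extension.
colony_pcr_s = program_seconds(3 * 60, 35, [10, 15, 12], 10 * 60)

print(f"Direct PCR hold time: {direct_pcr_s / 60:.1f} min")
print(f"Colony PCR hold time: {colony_pcr_s / 60:.1f} min")
```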

Electron Microscopy

Aliquots of 4 μL of Cas12k–sgRNA binary complex (1 mg/mL) and Cas12k–sgRNA–dsDNA ternary complex (1 mg/mL) were applied to glow-discharged UltrAuFoil holey gold grids (R1.2/1.3, 300 mesh). The grids were blotted for 2 seconds and plunged into liquid ethane using a Vitrobot Mark IV. Cryo-EM data were collected on a Titan Krios microscope operated at 300 kV using Leginon (Suloway et al., 2005) at a nominal magnification of 81,000× (resulting in a calibrated physical pixel size of 1.05 Å/pixel) with a defocus range of 0.8–2.0 μm. The images were recorded on a K3 direct electron detector in super-resolution mode at the end of a GIF-Quantum energy filter operated with a slit width of 20 eV. A dose rate of 20 electrons per pixel per second and an exposure time of 3.12 seconds were used, generating 40 movie frames with a total dose of ~54 electrons per Å². Statistics for cryo-EM data are listed in Table 1.
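The relation between dose rate, exposure time, pixel size, and total dose can be sketched as follows. The calculation assumes that the stated dose rate refers to physical (not super-resolution) pixels and that the dose is spread evenly over the 40 frames. Under these assumptions the estimate comes out close to, though slightly above, the ~54 electrons per Å² reported, and the sketch does not attempt to account for the difference.

```python
PIXEL_SIZE_A = 1.05   # calibrated physical pixel size, A/pixel (from the text)
DOSE_RATE = 20.0      # electrons per pixel per second (from the text)
EXPOSURE_S = 3.12     # exposure time in seconds (from the text)
N_FRAMES = 40         # movie frames per exposure (from the text)


def total_dose_per_A2(dose_rate, exposure_s, pixel_size_A):
    """Total dose in electrons per square angstrom.

    Assumes dose_rate is given per physical pixel of side pixel_size_A.
    """
    electrons_per_pixel = dose_rate * exposure_s
    return electrons_per_pixel / pixel_size_A ** 2


total_dose = total_dose_per_A2(DOSE_RATE, EXPOSURE_S, PIXEL_SIZE_A)
frame_time_s = EXPOSURE_S / N_FRAMES     # exposure time per frame
dose_per_frame = total_dose / N_FRAMES   # assumes an even split over frames

print(f"Estimated total dose: {total_dose:.1f} e/A^2")
print(f"Frame time: {frame_time_s:.3f} s")
print(f"Dose per frame: {dose_per_frame:.2f} e/A^2")
```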

Image Processing

Movie frames were aligned using MotionCor2 (Zheng et al., 2017) with a binning factor of 2. The motion-corrected micrographs were imported into cryoSPARC (Punjani et al., 2017). Contrast transfer function (CTF) parameters were estimated using CTFFIND4 (Rohou and Grigorieff, 2015). A few thousand particles were auto-picked without a template to generate 2D averages for subsequent template-based auto-picking. The auto-picked and extracted particles were subjected to 2D classification to exclude false positives and poor-quality particles that fell into 2D class averages with poor features. An initial reconstruction was generated in cryoSPARC using 100,000 particles (Punjani et al., 2017). Heterogeneous refinement was then performed to separate different conformational states. To further select homogeneous particles, 3D variability analysis (Punjani and Fleet, 2021) was performed, and the resulting maps with different conformations (frame_000.mrc and frame_019.mrc) were used as references for supervised heterogeneous refinement. The homogeneous dataset was used for final 3D refinement with C1 symmetry, resulting in a 3.65 Å resolution map from 183,870 particles.

The Cas12k–sgRNA binary complex dataset was processed in the same way as the ternary complex dataset. A total of 114,383 particles were selected for a final reconstruction at 3.80 Å resolution. Cryo-EM image processing is summarized in Table 1.

Model building, refinement, and validation

De novo model building of the Cas12k–sgRNA–target DNA structure was performed manually in COOT (Emsley et al., 2010), guided by secondary structure predictions of the Cas12k protein from PSIPRED (Jones, 1999) and structure prediction of the sgRNA by RNAComposer (Biesiada et al., 2016). Refinement of the structure models against the corresponding maps was performed using the phenix.real_space_refine tool in Phenix (version 1.19.2) (Afonine et al., 2018). For the Cas12k–sgRNA complex, the structure model of the Cas12k–sgRNA–target DNA complex was fitted into the cryo-EM map with the target DNA deleted. The model was adjusted by all-atom refinement in COOT with self-restraints. The resulting model was refined against the corresponding cryo-EM map using the phenix.real_space_refine tool in Phenix.

Structure-based sequence alignment

The PROMALS3D program (Pei et al., 2008) was used to align the sequences of Cas12k and Cas12f based on structure. The alignment diagram was plotted using ESPript (Robert and Gouet, 2014). Sequence identities and similarities were calculated using the Sequence Manipulation Suite (Stothard, 2000). The root-mean-square deviation (RMSD) of Cα atoms was calculated using the cealign command in PyMOL.
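To illustrate what a sequence identity figure measures, the following Python sketch computes percent identity from a pair of pre-aligned sequences. It is not the Sequence Manipulation Suite implementation: the choice to count only columns without gaps in the denominator is an assumption, and the two short sequences are toy examples, not Cas12k or Cas12f.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap.

    The denominator convention (gap-free columns only) is an assumption;
    other tools may divide by alignment length or by the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Toy aligned sequences for illustration only.
seq_a = "MKT-LLVAG"
seq_b = "MRTALL-AG"
identity = percent_identity(seq_a, seq_b)
print(f"Identity: {identity:.1f}%")
```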

Structural visualization

Figures were generated using PyMOL and UCSF Chimera (Pettersen et al., 2004).

The type V-K CRISPR-Cas system, characterized by a Cas12k effector with a naturally inactivated RuvC domain and associated with a Tn7-like transposon for RNA-guided DNA transposition, is a promising tool for precise DNA insertion. To reveal the mechanism underlying target DNA recognition, we determined a cryo-EM structure of Cas12k from the cyanobacterium Scytonema hofmanni in complex with a single guide RNA (sgRNA) and a double-stranded target DNA. Coupled with mutagenesis and in vitro DNA transposition assays, our results revealed the mechanism of recognition of the GGTT PAM sequence and the structural elements of Cas12k critical for RNA-guided DNA transposition. These structural and mechanistic insights should aid in the development of type V-K CRISPR-transposon systems as tools for genome editing.
