Endogenous protein tagging in medaka using a simplified CRISPR/Cas9 knock-in approach

Animal husbandry and ethics

Medaka (Oryzias latipes) (Iwamatsu, 2004, Naruse et al., 2004, Kasahara et al., 2007) were maintained as closed stocks in a fish facility built according to European Union animal welfare standards, and all animal experiments were performed in accordance with European Union animal welfare guidelines. Animal experimentation was approved by the EMBL Institutional Animal Care and Use Committee (IACUC), project code 20/001_HD_AA. Fish were maintained in a constant recirculating system at 27-28°C with a 14 h light / 10 h dark cycle.

Cloning-free CRISPR/Cas9 Knock-Ins

A detailed step-by-step protocol for the cloning-free approach is provided in Files S1/S2. A detailed list of all repair donors, PCR primers, fluorescent protein sequences and sgRNAs used is provided in Tables S1-S6. Briefly, Cas9-mSA mRNA was prepared from the pCS2+Cas9-mSA plasmid, which was a gift from Janet Rossant (Addgene #103882) (Gu et al., 2018). 6-8 µg of Cas9-mSA plasmid was linearized with the NotI-HF restriction enzyme (NEB #R3189S). The 8.8 kb linearized fragment was cut out from a 1.5% agarose gel and the DNA was extracted using the QIAquick Gel Extraction Kit (Qiagen #28115). In vitro transcription was performed using the mMESSAGE mMACHINE SP6 Transcription Kit (Invitrogen #AM1340) following the manufacturer’s guidelines. RNA cleanup was performed using the RNeasy Mini Kit (Qiagen #74104). sgRNAs were manually selected using previously published recommendations (Paix et al., 2017a, Paix et al., 2019, Doench et al., 2016, Gagnon et al., 2014) and validated in silico using CCTop and CHOPCHOP (Labun et al., 2019, Stemmer et al., 2015) (Table S5). The genomic coordinates of all genes targeted can be found in Table S1. Synthetic sgRNAs used in this study were ordered from Sigma-Aldrich (spyCas9 sgRNA, 3 nmol, HPLC purification, no modification). PCR repair donor fragments were prepared as described previously (Paix et al., 2014, Paix et al., 2017b, Paix et al., 2015, Paix et al., 2016) and a detailed protocol is provided in File S1. Briefly, the design includes homology arms of approx. 30-40 bp and a fluorescent protein sequence with no ATG or stop codon (Tables S2/S4). PCR amplifications were performed using Phusion or Q5 high-fidelity DNA polymerase (NEB Phusion Master Mix with HF buffer #M0531L or NEB Q5 Master Mix #M0492L). The MinElute PCR Purification Kit (Qiagen #28004) was used for PCR purification. Primers were ordered from Sigma-Aldrich (25 nmol scale, desalted) and carried a biotin moiety at the 5’ end for repair donor synthesis.
A list of all primers and fluorescent protein sequences used in this study can be found in Tables S3/S4. The injection mix for medaka contains sgRNA (15-20 ng/µl), Cas9-mSA mRNA (150 ng/µl) and repair donor template (8-10 ng/µl). For injections, male and female medaka are placed in the same tank and fertilized eggs are collected 20 minutes later. The mix is injected into 1-cell stage medaka embryos (Iwamatsu, 2004), and embryos are raised at 28°C in 1X ERM (Seleit et al., 2017a, Seleit et al., 2017b, Rembold et al., 2006). A list of KI lines generated and maintained in this study can be found in Table S6.
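The donor design described above (homology arms of roughly 30-40 bp flanking a fluorescent protein sequence that lacks its ATG and stop codon) can be illustrated with a minimal Python sketch. It is not the script used in this study: the function names, the 20 nt annealing length and the toy sequences are assumptions chosen for illustration, and linkers or other design details covered in File S1 are deliberately left out.

```python
# Illustrative sketch only: names, the 20 nt annealing length and the
# toy sequences are assumptions, not values taken from the study.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def strip_start_and_stop(cds: str) -> str:
    """Remove a leading ATG and a trailing stop codon, if present."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds.startswith("ATG"):
        cds = cds[3:]
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    return cds


def design_donor_primers(left_arm, right_arm, fp_cds, anneal_len=20):
    """Build donor PCR primers carrying the homology arms as 5' tails.

    Returns (forward_primer, reverse_primer, expected_donor), all 5'->3'.
    In the study the primers additionally carried a 5' biotin, which is
    a synthesis option and is not represented in the sequence here.
    """
    for name, arm in (("left", left_arm), ("right", right_arm)):
        if not 30 <= len(arm) <= 40:
            raise ValueError(f"{name} homology arm should be 30-40 bp")
    core = strip_start_and_stop(fp_cds)
    if len(core) < 2 * anneal_len:
        raise ValueError("fluorescent protein sequence is too short")
    left, right = left_arm.upper(), right_arm.upper()
    forward = left + core[:anneal_len]
    reverse = reverse_complement(core[-anneal_len:] + right)
    donor = left + core + right
    return forward, reverse, donor


if __name__ == "__main__":
    toy_fp = "ATG" + "GCT" * 20 + "TAA"  # toy stand-in, not a real FP
    fwd, rev, donor = design_donor_primers("A" * 30, "C" * 35, toy_fp)
    print(len(fwd), len(rev), len(donor))
```

Keeping the arms on the primer tails is what makes the approach cloning-free: a single PCR on a fluorescent protein template yields the complete donor, so no plasmid needs to be assembled for each target.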

Live-imaging sample preparation

Embryos were prepared for live-imaging as previously described (Seleit et al., 2017a, Seleit et al., 2017b). Dechorionated medaka embryos were anesthetized in 1X Tricaine (Sigma-Aldrich #A5040-25G; 20X stock solution at 20 mg/ml, diluted in 1X ERM). Anesthetized embryos were then mounted in low-melting agarose (0.6 to 1%) (Biozyme Plaque Agarose #840101). Imaging was done on glass-bottomed dishes (MatTek Corporation Ashland, MA 01721, USA). For g3bp1-eGFP live-imaging, the temperature was changed from 21°C to 34°C after one hour of imaging.

Microscopy and data analysis

For all embryo screening, a Nikon SMZ18 fluorescence stereoscope was used. All live-imaging, except for g3bp1-eGFP and cdh2-eGFP embryos, was done on a laser-scanning confocal Leica SP8 (CSU, White Laser) microscope; 20x and 40x objectives were used during image acquisition depending on the experimental sample. For the SP8 confocal equipped with a white laser, the laser emission was matched to the spectral properties of the fluorescent protein of interest. Live-imaging of the g3bp1-eGFP line was performed using a Zeiss LSM780 laser-scanning confocal with a temperature control box and an Argon laser at 488 nm, imaged through a 20x plan apo objective (numerical aperture 0.8). For cdh2-eGFP, 4D live-imaging was performed on a Luxendo TruLive SPIM system using a 30X objective. Open-source standard ImageJ/Fiji software (Schindelin et al., 2012) was used for analysis and editing of all images after acquisition. Stitching was performed using standard 2D and 3D stitching plug-ins in ImageJ/Fiji. For quantitative values on endogenous mScarlet-pcna dynamics, the ROI manager in ImageJ/Fiji was used to define fluorescence intensity within the nucleus of tracked cells (yellow circle in Figure 4 and Supplementary Movies S10/11). Fluorescence intensity measurements were then extracted from the time-series and the data were normalized by dividing by the initial intensity value of each time-lapse movie. Data were plotted using R software. Pixel intensity distributions within nuclei were analyzed using a custom Python-based script. Individual live-cell tracks were plotted using PlotTwist (Goedhart, 2020).
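The normalization applied to the mScarlet-pcna intensity traces (each value divided by the first value of its time-lapse movie) amounts to the following minimal Python sketch. The function name and the example numbers are hypothetical; the study's own analysis was done with ImageJ/Fiji, R and a custom script that is not reproduced here.

```python
def normalize_to_initial(intensities):
    """Divide every measurement in a track by the track's first value.

    `intensities` is the per-frame nuclear fluorescence of one tracked
    cell, in acquisition order (hypothetical input format).
    """
    values = [float(v) for v in intensities]
    if not values:
        raise ValueError("track contains no measurements")
    first = values[0]
    if first == 0:
        raise ValueError("initial intensity is zero; cannot normalize")
    return [v / first for v in values]


if __name__ == "__main__":
    # Toy numbers for illustration only, not measured data.
    print(normalize_to_initial([200, 100, 300]))
```

After this step every track starts at 1, so cells with different absolute brightness can be compared on a common relative scale.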

Fin-clips, genotyping and Sanger sequencing

Individual adult F1 fish were fin-clipped for genotyping PCRs. Briefly, fish were anesthetized in 1X Tricaine solution. A small part of the caudal fin was cut with sharp scissors and placed in a 2 ml Eppendorf tube containing 50 µl of fin-clip buffer. The fish were allowed to recover in small beakers and were then transferred back to their tanks. Eppendorf tubes were incubated overnight at 65°C. 100 µl of H2O was then added to each tube and the tubes were incubated for 10-15 min at 90°C. Tubes were then centrifuged for 30 minutes at 10,000 rpm in a standard micro-centrifuge. The supernatant was used for subsequent PCRs. Fin-clip buffer is composed of 100 ml 2 M Tris pH 8.0, 5 ml 0.5 M EDTA pH 8.0, 15 ml 5 M NaCl, 2.5 ml 20% SDS and H2O to 500 ml, sterile filtered. 50 µl of proteinase K (20 mg/ml) was added to 1 ml of fin-clip buffer before use. 2 µl of genomic DNA from fin-clips was used for genotyping PCRs. A list of all genotyping primers used in this study can be found in Table S3. After PCR, the edited and WT amplicons were sent for Sanger sequencing (Eurofins Genomics). Sequences were analyzed using Geneious software (Figure S2). In-frame integrations were confirmed by sequencing for eGFP-cbx1b, mScarlet-pcna, mNeonGreen-myosinhc and eGFP-rab11a. We detected an internal partial duplication of the 5’ homology arm in the mScarlet-pcna line that affects neither the protein coding sequence nor the 5’ extremity of the homology arm itself. Specifically, 22 base pairs upstream of the start codon of pcna (and within the 5’ homology arm), we detected a 21 bp partial duplication of the 5’ homology arm and a 7 bp insertion (GGTCGAC), indicating that the repair mechanism involved can lead to errors (Paix et al., 2017a). The 5’ homology junction itself is unaltered and precise.
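The final concentrations implied by the fin-clip buffer recipe above follow from simple dilution arithmetic (stock concentration × stock volume / final volume). The Python sketch below works through it; the helper name is hypothetical, and the proteinase K figure assumes that the 50 µl is added on top of 1 ml of buffer, giving a total volume of 1.05 ml.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration after dilution; volumes must share the same unit."""
    return stock_conc * stock_volume / total_volume


TOTAL_ML = 500  # recipe is made up to 500 ml with H2O

tris_molar = final_concentration(2.0, 100, TOTAL_ML)   # 2 M Tris pH 8.0
edta_molar = final_concentration(0.5, 5, TOTAL_ML)     # 0.5 M EDTA pH 8.0
nacl_molar = final_concentration(5.0, 15, TOTAL_ML)    # 5 M NaCl
sds_percent = final_concentration(20.0, 2.5, TOTAL_ML)  # 20% SDS

# Assumption: 50 µl (0.05 ml) of 20 mg/ml proteinase K is added to
# 1 ml of buffer, so the final volume is 1.05 ml.
prot_k_mg_per_ml = final_concentration(20.0, 0.05, 1.0 + 0.05)

if __name__ == "__main__":
    print(f"Tris: {tris_molar * 1000:.0f} mM")
    print(f"EDTA: {edta_molar * 1000:.0f} mM")
    print(f"NaCl: {nacl_molar * 1000:.0f} mM")
    print(f"SDS: {sds_percent:.1f} %")
    print(f"Proteinase K: {prot_k_mg_per_ml:.2f} mg/ml")
```

Under these assumptions the working buffer comes to 400 mM Tris, 5 mM EDTA, 150 mM NaCl and 0.1% SDS, with proteinase K at roughly 0.95 mg/ml.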

Whole Genome Sequencing (WGS)

5 to 10 positive F1 medaka embryos (originating from the same F0 founder) of the eGFP-cbx1b, mScarlet-pcna and mNeonGreen-myosinhc lines were snap frozen in liquid nitrogen and kept at -80°C in 1.5 ml Eppendorf tubes. Genomic DNA was extracted using the DNeasy Blood and Tissue Kit (Qiagen #69504) according to the manufacturer’s guidelines. The libraries were prepared on a liquid handling system (Beckman i7 series) using 200 ng of sheared gDNA and 10 PCR cycles with the NEBNext Ultra II DNA Library Prep Kit for Illumina (NEB #E7645S). The DNA libraries were indexed with unique dual barcodes (8 bp long), pooled and then sequenced on an Illumina NextSeq550 instrument in mid-mode, paired-end, with a read length of 150 bp. Sequenced reads were aligned to the Oryzias latipes reference genome (Ensembl assembly version ASM223467v1) using BWA mem version 0.7.17 with default settings (Li and Durbin, 2009). The reference genome was augmented with the known insert sequences for eGFP, mScarlet and mNeonGreen, so that integrations could be discovered directly with standard inter-chromosomal structural variant predictions. The insert sequences are provided in Tables S2/S4. After the genome alignment, reads were sorted and indexed using SAMtools (Li et al., 2009). Quality control and coverage analyses were performed using the Alfred qc subcommand (Rausch et al., 2019). For Structural Variant (SV) discovery, aligned reads were processed with DELLY v0.8.7 (Rausch et al., 2012) using paired-end mapping and split-read analysis. SVs were filtered for inter-chromosomal SVs with one breakpoint in one of the additional insert sequences (eGFP, mScarlet and mNeonGreen). Plots shown in Figure S2 are adapted from Integrative Genomics Viewer (IGV) (Thorvaldsdottir et al., 2013). The estimated genomic coordinates for integration are: eGFP-cbx1b (chr19:19,074,552), mScarlet-pcna (chr9:6,554,003) and mNeonGreen-myosinhc (chr8:8,975,799).
Coverage of eGFP-cbx1b gDNA1 is 20.4X and of eGFP-cbx1b gDNA2 is 23.6X. Coverage of mScarlet-pcna is 14.4X. Coverage of mNeonGreen-myosinhc is 14.5X. Raw sequencing data were deposited in the European Nucleotide Archive (ENA) under study number ERP127162. Accession numbers are: eGFP-cbx1b(1) ERS5796960 (SAMEA8109891), eGFP-cbx1b(2) ERS5796961 (SAMEA8109892), mScarlet-pcna ERS5796962 (SAMEA8109893) and mNeonGreen-myosinhc ERS5796963 (SAMEA8109894).
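The SV filtering step described above (keeping inter-chromosomal calls with one breakpoint in an added insert sequence) can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the record layout (a tuple of two contig names and two positions) is a hypothetical simplification of an SV call, the insert contig names are assumed to match the names given to the extra reference sequences, and no DELLY or VCF parsing is shown.

```python
# Assumption: the insert sequences were added to the reference under
# these contig names; adjust to whatever names were actually used.
INSERT_CONTIGS = frozenset({"eGFP", "mScarlet", "mNeonGreen"})


def is_candidate_integration(chrom_a, chrom_b, inserts=INSERT_CONTIGS):
    """True if the SV joins an insert contig to a genomic contig."""
    if chrom_a == chrom_b:
        return False  # not inter-chromosomal
    # Exactly one of the two breakpoints must lie in an insert sequence.
    return (chrom_a in inserts) != (chrom_b in inserts)


def filter_integrations(records, inserts=INSERT_CONTIGS):
    """Keep records shaped as (chrom_a, pos_a, chrom_b, pos_b)."""
    return [
        rec for rec in records
        if is_candidate_integration(rec[0], rec[2], inserts)
    ]


if __name__ == "__main__":
    # Toy records for illustration only, not calls from the study.
    toy_calls = [
        ("19", 1000, "eGFP", 1),       # genome <-> insert: kept
        ("19", 1000, "8", 2000),       # genome <-> genome: dropped
        ("eGFP", 1, "mScarlet", 5),    # insert <-> insert: dropped
        ("9", 500, "9", 900),          # same contig: dropped
    ]
    print(filter_integrations(toy_calls))
```

Treating each insert as its own contig means an integration appears to the SV caller as an ordinary translocation-like junction, which is why a standard inter-chromosomal prediction is enough to locate it.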

Article Title: Endogenous protein tagging in medaka using a simplified CRISPR/Cas9 knock-in approach

The CRISPR/Cas9 system has been used to generate fluorescently labelled fusion proteins by homology-directed repair in a variety of species. Despite its revolutionary success, there remains an urgent need for increased simplicity and efficiency of genome editing in research organisms. Here, we establish a simplified, highly efficient and precise strategy for CRISPR/Cas9-mediated endogenous protein tagging in medaka (Oryzias latipes). We use a cloning-free approach that relies on PCR-amplified donor fragments containing the fluorescent reporter sequences flanked by short homology arms (30-40 bp), a synthetic sgRNA and streptavidin-tagged Cas9. We generate six novel knock-in lines with high efficiency of F0 targeting and germline transmission. Whole Genome Sequencing (WGS) results reveal single-copy integration events only at the targeted loci. We provide an initial characterization of these fusion-protein lines, significantly expanding the repertoire of genetic tools available in medaka. In particular, we show that the mScarlet-pcna knock-in line has the potential to serve as an organismal-wide label for proliferative zones and an endogenous cell cycle reporter.
