NHGRI-1 wild-type zebrafish 49 were maintained through standard protocols 54 and their use was approved by the Institutional Animal Care and Use Committee from the Office of Animal Welfare Assurance, University of California, Davis. Animals were kept in a temperature- (28±0.5°C) and light- (10 h dark/14 h light cycle) controlled modular system with UV-sterilized filtered water (Aquaneering, San Diego, CA), at a density of 25 adult fish per tank. Feeding and general monitoring of all zebrafish were performed twice a day (9 am and 4 pm). Food included rotifers (Rotigrow Nanno, Reed Mariculture, Campbell, CA), brine shrimp (Artemia Brine Shrimp 90% hatch, Aquaneering, San Diego, CA), and flakes (Zebrafish Select Diet, Aquaneering, San Diego, CA). For all experimental procedures, eggs were collected via natural spawning of randomly selected adult NHGRI-1 zebrafish in 1-liter crossing tanks (Aquaneering, San Diego, CA), using a minimum of five breeding pairs (1 male, 1 female) unless otherwise specified. Embryos were grown in standard Petri dishes with E3 media (0.03% Instant Ocean salt in deionized water) and incubated at 28±0.5°C, using a dissecting microscope (Leica, Buffalo Grove, IL) for developmental staging and daily monitoring until their use for molecular procedures.
Design and in silico predictions for gRNAs
Fifty gRNAs targeting exons of 14 genes were designed using CRISPRScan 26 (scores ranging between 24 and 83 with a mean value of 57.6) with zebrafish genome version GRCz11/danRer11 as the reference (see description of gRNAs in Supplementary Tables 1 and 2). All targeted genes were protein coding. For each designed gRNA, we obtained the efficiency scores predicted by CRISPRScan 26, CHOPCHOP 35–37 using the scoring methods from 38 and 39, E-CRISP 40, CRISPR-GE 41, CCTop 42, CRISPRon 43, DeepSpCas9 44, and the IDT design tool (www.idtdna.com). From CRISPRScan, we also gathered the predicted off-target sites for each gRNA, as defined by the CFD score 26. Additionally, we used bedtools 55 to determine the GC percentage for each gRNA. To incorporate NHGRI-1 variants into the zebrafish reference, we used the FastaAlternateReferenceMaker function from GATK 56 with the reported high-confidence variants for the NHGRI-1 zebrafish strain 49.
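The GC percentage used as a gRNA covariate is a simple base count. The sketch below shows an equivalent calculation in plain Python; it is not the bedtools invocation used in the study (the text does not state which subcommand was run), and the example spacer is a hypothetical sequence rather than one of the 50 gRNAs.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = sum(base in "GC" for base in seq)
    return 100.0 * gc_count / len(seq)


# Hypothetical 20-nt spacer used only for illustration.
example_spacer = "GGATCCGATTACAGCTAGCA"
print(f"GC content: {gc_percent(example_spacer):.1f}%")
```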
Microinjections to generate CRISPR G0 mosaic mutants
All gRNAs were individually injected into NHGRI-1 embryos to estimate the frequency of indels. gRNAs were prepared following the manufacturer’s protocol (Integrated DNA Technologies). Briefly, 2.5 μl of 100 μM crRNA, 2.5 μl of 100 μM tracrRNA, and 5 μl of Nuclease-free Duplex Buffer were combined and annealed using a program consisting of 5 min at 95°C, a ramp from 95°C to 50°C with a −0.1°C/s change, 10 min at 50°C, and a ramp from 50°C to 4°C with a −1°C/s change. Ribonucleoprotein injection mix was prepared with 1.30 μl of Cas9 enzyme (20 μM, New England BioLabs), 1.60 μl of prepared gRNAs, 2.5 μl of 4x Injection Buffer (containing 0.2% phenol red, 800 mM KCl, 4 mM MgCl2, 4 mM TCEP, 120 mM HEPES, pH 7.0), and 4.6 μl of Nuclease-free water. Microinjections directly into the yolk of NHGRI-1 embryos at the one-cell stage were performed as described previously 57, using needles from a micropipette puller (Model P-97, Sutter Instruments) and an air injector (Pneumatic MPPI-2 Pressure Injector). Embryos were collected and ~1 nl of ribonucleoprotein mix was injected per embryo, after prior calibration with a microruler. Twenty injected embryos per Petri dish were grown up to 5 dpf at 28°C.
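As a sanity check on the recipe above, the final concentrations in the injection mix follow from the stated volumes and stock concentrations. The sketch below does that arithmetic; it assumes complete 1:1 annealing of crRNA and tracrRNA, so that the duplex stock is treated as 25 μM, which the text does not state explicitly.

```python
def dilute(stock_conc: float, volume_ul: float, total_volume_ul: float) -> float:
    """Concentration after diluting volume_ul of stock into total_volume_ul."""
    return stock_conc * volume_ul / total_volume_ul


# Annealing reaction: 2.5 ul crRNA + 2.5 ul tracrRNA + 5 ul Duplex Buffer.
annealing_total_ul = 2.5 + 2.5 + 5.0
# Assumption: crRNA and tracrRNA anneal completely at a 1:1 ratio.
duplex_stock_um = dilute(100.0, 2.5, annealing_total_ul)

# Injection mix volumes (ul) as listed in the text.
mix_ul = {"cas9": 1.30, "grna": 1.60, "buffer_4x": 2.5, "water": 4.6}
mix_total_ul = sum(mix_ul.values())

cas9_final_um = dilute(20.0, mix_ul["cas9"], mix_total_ul)
grna_final_um = dilute(duplex_stock_um, mix_ul["grna"], mix_total_ul)
buffer_final_x = dilute(4.0, mix_ul["buffer_4x"], mix_total_ul)

print(f"total volume: {mix_total_ul:.1f} ul")
print(f"Cas9: {cas9_final_um:.2f} uM, gRNA duplex: {grna_final_um:.2f} uM")
print(f"injection buffer: {buffer_final_x:.2f}x")
```

Under these assumptions the mix totals 10 μl, with the buffer at 1x and the gRNA duplex in molar excess over Cas9.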
Illumina and Sanger amplicon sequencing
DNA extractions were performed on 20 pooled embryos by adding 100 μl of 50 mM NaOH, incubating at 95°C for 20 min, and ramping from 95°C to 4°C at a 0.7°C/s decrease, followed by the addition of 10 μl of 1 M Tris-HCl and a 15 min spin at 4680 rpm. We amplified a ~200 bp region surrounding the targeted site of each gRNA (see Supplementary Table 1 for description of primers). PCR amplifications were performed using 12.5 μl of 2X DreamTaq Green PCR Master Mix (Thermo Fisher), 9.5 μl of Nuclease-Free water, 1 μl of 10 μM primers, and 1 μl of extracted DNA. The thermocycler program included 3 min at 95°C, followed by 35 cycles of 15 s at 95°C, 30 s at 60°C, and 20 s at 72°C, and a final 5 min incubation at 72°C. Reactions were purified using Ampure XP magnetic beads (Beckman Coulter) and Illumina sequenced (Genewiz, San Diego, CA). We obtained the percent mosaicism of mutants by mapping paired-end fastq reads to the zebrafish reference genome (GRCz11/danRer11) using bwa 57 and the R package CrispRVariants 35. Additionally, for six gRNAs, we amplified a ~500 bp region surrounding the targeted site from the same extracted DNA and performed Sanger sequencing (Genewiz, San Diego, CA). Raw trace files were used in the TIDE 33 and ICE 34 tools to predict the percentage of indels, which we used as our in vivo editing score for each gRNA. For both Sanger and Illumina sequencing, we used uninjected batch-sibling embryos as a control reference.
PAGE and intensity-ratio estimation
An empirical cleavage analysis from each gRNA was performed using PAGE. Briefly, we amplified a ~200 bp region in DNA around the targeted site from gRNA-injected and uninjected embryos, as described above. Reactions of the uninjected and injected samples from the same amplicon were run on a 7.5% polyacrylamide gel together for 75 min at 110 V and revealed using GelRed (VWR International). Gel images were processed in the software Fiji 58. For each sample, we defined areas A and B as follows:
For each gRNA, the mean-intensity value was obtained for the A and B areas in both the injected and uninjected samples. The A and B areas were exactly the same size between samples. The intensity ratio was calculated as the B-to-A ratio of the injected sample normalized to that of the uninjected sample: (injected B / injected A) / (uninjected B / uninjected A). Log-normalized intensity ratios followed a normal distribution (Shapiro-Wilk test: W= 0.96, p= 0.167) with an average value of 1.21±0.70.
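The intensity ratio is a ratio of ratios: the B-to-A signal in the injected lane divided by the same quantity in the uninjected lane. A minimal sketch of that calculation follows. The mean-intensity values are hypothetical, and the natural logarithm is an assumption, since the text does not give the base used for log normalization.

```python
import math


def intensity_ratio(injected_a: float, injected_b: float,
                    uninjected_a: float, uninjected_b: float) -> float:
    """(injected B / injected A) / (uninjected B / uninjected A)."""
    return (injected_b / injected_a) / (uninjected_b / uninjected_a)


# Hypothetical mean intensities measured in Fiji for equally sized areas.
ratio = intensity_ratio(injected_a=100.0, injected_b=60.0,
                        uninjected_a=100.0, uninjected_b=20.0)
log_ratio = math.log(ratio)  # assumption: natural log

print(f"intensity ratio: {ratio:.2f}, log-normalized: {log_ratio:.3f}")
```

A ratio above 1 means the B area carries relatively more signal in the injected lane than in the uninjected control run on the same gel.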
CIRCLE-seq libraries were prepared for each gRNA (IDT) using genomic DNA extracted from NHGRI-1 (DNA Blood & Tissue kit, Qiagen) following the described protocol 59. Libraries were sequenced using one HiSeq XTen lane (Novogene, Sacramento, CA), providing an average of 7.3 million reads (range: 4.0 - 13.3 million reads) and >Q30 for 92% of reads per gRNA library. Raw reads were processed using the bioinformatic pipeline described 59 (mapping rate >99% in all samples) to identify regions with cutting events relative to a control sample (treated with Cas9 enzyme and no gRNA). In an attempt to obtain an on-target efficiency estimation from in vitro digestions, we calculated the reads per million normalized (RPMN). For this purpose, we used samtools 60 to extract read coverage from aligned bam files. For each gRNA, coverage was obtained for the third and fourth base upstream of the PAM site as it is the region expected to be cut by Cas9 61. RPMN for each gRNA was calculated as the sum of coverage at these two sites divided by the total mapped reads per sample and multiplied by one million to scale the values. RPMN scores ranged from 4.42 to 881 (median 99.3) so we decided to use a log normalization to reduce this range.
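The RPMN definition above reduces to a short formula. The sketch below applies it to hypothetical coverage values; the function and variable names are illustrative, and log10 is an assumption because the text does not state which logarithm was used.

```python
import math


def rpmn(coverage_third: int, coverage_fourth: int, total_mapped_reads: int) -> float:
    """Reads per million normalized at the expected Cas9 cut site.

    coverage_third and coverage_fourth are read depths at the third and
    fourth bases upstream of the PAM.
    """
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return (coverage_third + coverage_fourth) / total_mapped_reads * 1_000_000


# Hypothetical depths for one gRNA library with 7.3 million mapped reads.
score = rpmn(coverage_third=350, coverage_fourth=380, total_mapped_reads=7_300_000)
log_score = math.log10(score)  # assumption: base-10 log normalization

print(f"RPMN: {score:.1f}, log10(RPMN): {log_score:.2f}")
```

Scaling by the total mapped reads makes libraries of different sequencing depth comparable, and the log transform compresses the roughly 200-fold spread reported for the raw scores.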
We performed RNA-seq of Cas9-injected NHGRI-1 larvae to identify potential gRNA-independent cleavage sites. One-cell stage NHGRI-1 embryos were injected with either Cas9 enzyme or Cas9 mRNA. Injection mix for Cas9 enzyme included Cas9 enzyme (20 μM, New England BioLabs), 2.5 μl of 4x Injection Buffer (0.2% phenol red, 800 mM KCl, 4 mM MgCl2, 4 mM TCEP, 120 mM HEPES, pH 7.0), and Nuclease-free water. Cas9 mRNA was obtained from plasmid pT3TS-nCas9n (Addgene, plasmid #46757) 5, using the MEGAshotscript T3 transcription kit (Thermo Fisher) following the manufacturer’s guidelines, with a 3.5 h incubation at 56°C with T3. mRNA was purified with the MEGAclear transcription clean-up kit (Thermo Fisher) and the mRNA concentration was measured using a NanoDrop (Thermo Fisher). The injection mix of Cas9 mRNA contained 100 ng/μl of mRNA, 4x Injection Buffer (0.2% phenol red, 800 mM KCl, 4 mM MgCl2, 4 mM TCEP, 120 mM HEPES, pH 7.0), and Nuclease-free water.
Additionally, uninjected batch-siblings and uninjected siblings from an additional batch were used as controls. All embryos were grown at 28°C at a density of <50 embryos per dish. At 5 dpf, three pools of five larvae were collected for each group (Cas9 enzyme, Cas9 mRNA, and uninjected) for RNA extraction using the RNeasy kit (Qiagen) with genomic DNA eliminator columns for DNA removal. Whole RNA samples were subjected to RNA-seq using the poly-A selection method (Genewiz, San Diego, CA).
Variant identification from RNA-seq data
We followed a previously described pipeline to identify somatic variants from RNA-seq data 48. Briefly, we mapped reads with STAR 62 using the 2-pass mode and a genomic reference created with GRCz11/danRer11 assembly and gtf files (release version 100). Variant calling was performed with MuTect2 as part of GATK 56 using the tumor versus normal mode. ‘Normal’ was defined by the two uninjected samples to identify all somatic mutations in our Cas9-injected embryos. Variants were annotated using the Variant Effect Predictor tool 63. High-confidence variants (minimum sequencing depth of 20) previously reported for the NHGRI-1 line 49 were removed. Only frameshift loss-of-function variants with a minimum read depth of 20 in canonical protein-coding genes were considered. We extracted the median distance between the identified variants and the nearest Cas9 PAM site (NGG sequence) using the coordinates in the CRISPRScan UCSC track. This observed median distance was compared to the distribution of median distances obtained from 10,000 permutations, in which positions were randomly sampled across the genome and their distance to the nearest PAM site was measured. One-tailed empirical p values from this comparison were calculated as (M+1)/(N+1), where M is the number of iterations with a median distance below the observed value and N is the total number of iterations. We orthogonally investigated the presence of variants in 23 genes via Illumina sequencing of a ~200 bp region surrounding the identified variant location and the R package CrispRVariants 50 (see Supplementary Table 1 for primer descriptions). For this purpose, we extracted DNA from 3 pools of 5 embryos injected with Cas9 enzyme, Cas9 mRNA, dCas9 (Alt-R S. p. dCas9 protein V3 from IDT), a scrambled gRNA (see Supplementary Table 1 for sequence description), or uninjected. In addition, we extracted DNA from a fin clip of the crossing parents of the embryos used for the injections (both female and male). In all of these groups, we quantified the percentage of mutations as all alleles different from the reference.
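The PAM-proximity permutation test can be sketched as follows. This is an illustration, not the authors' pipeline: it treats the genome as a single coordinate axis, uses hypothetical variant and PAM positions, and assumes the conventional empirical estimator (M+1)/(N+1) for the one-tailed p value.

```python
import random
import statistics
from bisect import bisect_left


def nearest_distance(position, sorted_sites):
    """Distance from a position to the closest entry in a sorted list of sites."""
    i = bisect_left(sorted_sites, position)
    neighbours = sorted_sites[max(i - 1, 0):i + 1]
    return min(abs(position - site) for site in neighbours)


def empirical_p(observed, null_values):
    """One-tailed p value (M + 1) / (N + 1); M counts null values below observed."""
    m = sum(1 for value in null_values if value < observed)
    return (m + 1) / (len(null_values) + 1)


def permutation_test(variant_positions, pam_sites, genome_length,
                     n_iter=10_000, seed=0):
    """Compare the observed median variant-to-PAM distance with random positions."""
    rng = random.Random(seed)
    sites = sorted(pam_sites)
    observed = statistics.median(
        [nearest_distance(p, sites) for p in variant_positions])
    null_medians = []
    for _ in range(n_iter):
        # Draw as many random positions as there are observed variants.
        sampled = [rng.randrange(genome_length) for _ in variant_positions]
        null_medians.append(
            statistics.median([nearest_distance(p, sites) for p in sampled]))
    return observed, empirical_p(observed, null_medians)


# Hypothetical toy data: a PAM every 1,000 bp on a 100 kb sequence.
toy_pams = range(0, 100_000, 1_000)
toy_variants = [1_001, 5_003, 20_002, 74_999]
obs, p_value = permutation_test(toy_variants, toy_pams, 100_000, n_iter=1_000)
print(f"observed median distance: {obs}, empirical p: {p_value:.4f}")
```

A small p value here would indicate that the variants sit closer to PAM sites than random positions do, which is the signature expected of Cas9-induced cuts.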
Differential gene expression analysis from RNA-seq data
Raw reads were processed using the elvers (https://github.com/dib-lab/elvers; version 0.1, release DOI: 10.5281/zenodo.3345045) bioinformatic pipeline that utilizes fastqc 64, trimmomatic 65, and salmon 66 to obtain the transcripts per kilobase million (TPM) for each gene. DESeq2 67 was used to extract differentially-expressed genes in the Cas9 enzyme or Cas9 mRNA injected samples relative to the uninjected larvae. The R package clusterProfiler 68 was used to perform enrichment tests of differentially-expressed genes in biological pathways. Network analyses of the common differentially-expressed genes were performed using the NetworkAnalyst online tool (www.networkanalyst.ca) 69, 70.
All analyses were performed in R version 4.0.2 71. Normality of variables was checked using the Shapiro-Wilk test and parametric or nonparametric comparisons made accordingly. Spearman correlation tests (denoted as ρ) and linear regression models were used to determine the relationship between variables. All analyses compared across different experimental batches included batch as a factor in the model to prevent biases caused by inter-batch differences. Averages include the standard deviation unless otherwise specified. Alpha to determine significance across the different tests was set at 0.05 unless otherwise specified. Additional R packages used for making figures included eulerr 72 and pheatmap 73.
Article Title: Evaluation of CRISPR gene-editing tools in zebrafish
Background: Zebrafish have practical features that make them a useful model for higher-throughput tests of gene function using CRISPR/Cas9 editing to create ‘knockout’ models. In particular, the use of G0 mosaic mutants has the potential to increase the throughput of functional studies significantly but may suffer from transient effects of introducing Cas9 via microinjection. Further, a large number of computational and empirical tools exist to design CRISPR assays, but their predictions often vary across methods, leaving uncertainty in choosing an optimal approach for zebrafish studies.
Methods: To systematically assess the accuracy of tool predictions of on- and off-target gene editing, we subjected zebrafish embryos to CRISPR/Cas9 with 50 different guide RNAs (gRNAs) targeting 14 genes. We also investigated potential confounders of G0-based CRISPR screens by screening control embryos for spurious mutations and altered gene expression.
Results: We compared our experimental in vivo editing efficiencies in mosaic G0 embryos with those predicted by eight commonly used gRNA design tools and found large discrepancies between methods. Assessment of off-target mutations (predicted in silico and in vitro) showed that the majority of tested loci had low in vivo frequencies (<1%). To characterize whether commonly used ‘mock’ CRISPR controls (larvae injected with Cas9 enzyme or mRNA with no gRNA) exhibited spurious molecular features that might confound studies of G0 mosaic CRISPR knockout fish, we generated an RNA-seq dataset of various control larvae at 5 days post fertilization. While we found no evidence of spontaneous somatic mutations in injected larvae, we did identify several hundred differentially-expressed genes with high variability between injection types. Network analyses of shared differentially-expressed genes in the ‘mock’ injected larvae implicated a number of key regulators of common metabolic pathways, and gene-ontology analysis revealed connections with response to wounding and cytoskeleton organization, highlighting a potentially lasting effect from the microinjection process that requires further investigation.
Conclusion: Overall, our results provide a valuable resource for the zebrafish community for the design and execution of CRISPR/Cas9 experiments.