Fitness effects of CRISPR endonucleases in Drosophila melanogaster populations

Plasmid construction

The starting plasmid pDsRed (Addgene plasmid #51019) was provided by Melissa Harrison, Kate O’Connor-Giles, and Jill Wildonger; pnos-Cas9-nos (61) (Addgene plasmid #62208) and VP12 (41) (Addgene plasmid #72247) were provided by Simon Bullock. Starting plasmids ATSacG, TTTgRNAtRNAi, TTTgRNAt, BHDgN1c, and BHDgN1cv3 were constructed in a previous study (45). Restriction enzymes for plasmid digestion, Q5 Hot Start DNA Polymerase for PCR, and Assembly Master Mix for Gibson assembly were acquired from New England Biolabs. Oligonucleotides and gBlocks were obtained from Integrated DNA Technologies. JM109 competent cells and the ZymoPure Midiprep kit (Zymo Research) were used to transform and purify plasmids, respectively. Cas9 gRNA target sequences were identified using CRISPR Optimal Target Finder (62). A list of DNA fragments, plasmids, primers, and restriction enzymes used for cloning of each construct can be found in the Supplemental Information, together with annotated sequences of the final drive insertion plasmids (ApE format,

Generation of transgenic lines

Injections were conducted by Rainbow Transgenic Flies. The donor plasmid (Cas9gRNAs, Cas9_no-gRNAs, no-Cas9_no-gRNAs, Cas9HF1_gRNAs, or BHDgNf1v2) (~500 ng/μL) was injected along with plasmid BHDgg1c (or TTTgU1 for BHDgNf1v2) (45) (~100 ng/μL), which provided additional gRNAs for transformation, and pBS-Hsp70-Cas9 (~500 ng/μL, from Melissa Harrison, Kate O’Connor-Giles, and Jill Wildonger, Addgene plasmid #45945), which provided Cas9. A 10 mM Tris-HCl, 100 μM EDTA solution at pH 8.5 was used for the injection. Most constructs were injected into w1118 flies, but BHDgNf1v2 was injected into flies with ATSacG (45). Transformants were identified by the presence of DsRed fluorescent protein in the eyes, which usually indicated successful construct insertion.

Maintenance of transgenic flies with active Cas9HF1 gene drive

To minimize risk of accidental release, all flies with an active homing gene drive system were kept at the Sarkaria Arthropod Research Laboratory at Cornell University under Arthropod Containment Level 2 protocols in accordance with USDA APHIS standards. In addition, the synthetic target site drive system (30) prevents drive conversion in wild-type flies, which lack the EGFP target site. All safety standards were approved by the Cornell University Institutional Biosafety Committee.

Experimental fly populations

The experimental fly populations were maintained on Bloomington Standard medium in 30×30×30 cm fly cages (Bugdorm). Flies were kept at constant temperature (25°C) under a cycle of 14 hours light and 10 hours dark, with non-overlapping generations. Flies 0-2 days old from one generation were allowed to lay eggs on fresh medium (8 food bottles per cage) for 24 hours. After that, the adults were frozen at −20°C for later phenotyping, and the new generation was allowed to develop for 11-12 days before fresh medium was provided and a new generation cycle started. The ancestral generation of each cage was generated by allowing homozygous EGFP flies and flies homozygous for the construct to deposit eggs for 24 hours separately from each other in four food bottles each. These eight egg-containing bottles were put in the fly cages to start one experimental fly population. Seven replicates of Cas9_gRNAs, and two replicates each of Cas9_no-gRNAs, no-Cas9_no-gRNAs, and Cas9HF1_gRNAs were maintained.

Phenotyping experimental fly populations

The dominant fluorescent markers, EGFP and DsRed, allow a direct readout of the genotype by screening the fluorescent phenotype of an individual fly. Flies that are only red fluorescent are construct homozygotes, flies that are only green fluorescent do not carry any construct, and flies that are fluorescent for both colors carry one construct copy.
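The readout rule above can be written down as a small lookup. The following is a minimal sketch, not the authors' code: the function names and the boolean encoding of the two fluorescence signals are assumptions made for illustration, and a fly with neither signal is treated as an error because the text does not describe that case.

```python
from collections import Counter


def genotype_from_fluorescence(red: bool, green: bool) -> str:
    """Map the eye fluorescence of one fly to its construct-locus genotype."""
    if red and green:
        return "construct/EGFP"       # both markers: one construct copy
    if red:
        return "construct/construct"  # red only: construct homozygote
    if green:
        return "EGFP/EGFP"            # green only: no construct
    # Assumption: a fly with no fluorescence is not expected at this locus.
    raise ValueError("fly shows neither DsRed nor EGFP fluorescence")


def construct_allele_frequency(phenotypes) -> float:
    """Construct allele frequency from an iterable of (red, green) pairs."""
    counts = Counter(genotype_from_fluorescence(r, g) for r, g in phenotypes)
    n_flies = sum(counts.values())
    n_construct = 2 * counts["construct/construct"] + counts["construct/EGFP"]
    return n_construct / (2 * n_flies)
```

For example, a batch with one red-only fly, two flies fluorescent for both colors, and one green-only fly has four construct alleles out of eight, so `construct_allele_frequency` returns 0.5.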

For each experimental population and generation, all individuals were screened for their genotypes using either a stereo dissecting microscope in combination with the NIGHTSEA system, or an automated image-based screening pipeline that we developed specifically for this purpose. Automated quantification of phenotypic traits (e.g., pupal size, the number of eggs laid) has been done successfully before in Drosophila (63, 64). In our image-based screening pipeline, three pictures were taken for each batch of flies: a white light picture to determine the number and the position of the flies, one fluorescent picture filtered to screen for DsRed, and one fluorescent picture filtered to screen for EGFP expression.

We used a Canon EOS Rebel T6 with an 18-55 mm lens for image acquisition. The camera was held in a fixed position by a bracket 25 cm above the frozen flies, which were spread on a black poster board. NIGHTSEA light heads (Green and Royal Blue) were used as light sources. The light sources for both white and fluorescent light were covered with a paper tissue for diffusion. For the fluorescent pictures, barrier filters (Tiffen 58 mm Dark Red #29; Tiffen 58 mm Green #58) were used, attached to the camera with a magnetic XUME Lens/Filter system. Except for the filter change, the camera was fully controlled through a PC interface (EOS Utility 2 software). Focus was set automatically under white light and was kept constant for the fluorescent pictures. First, a white light picture (F 5.6, ISO 100, exposure time 1 s) was taken to determine the number and positions of the flies. Second, a picture under NIGHTSEA Green light with the Tiffen Dark Red #29 filter (F 5.6, ISO 400, exposure time 30 s) was taken to determine whether flies express DsRed. Third, a picture under NIGHTSEA Royal Blue with the Tiffen Dark Green #58 barrier filter (F 5.6, ISO 400, exposure time 25 s) was taken to screen flies for EGFP expression.

We used the ImageJ distribution Fiji (v 2.0.0-rc-69/1.52p) (65, 66) to process and analyze the picture sets with an in-house ImageJ macro. The three multi-channel images were split into their respective red, green, and blue image components. Further analysis included the red and the green image component of the white light picture, the red image component of the red fluorescent picture, and the green image component of the green fluorescent picture. These four remaining images were merged into a stack, and we performed slice alignment (matching method: normalized correlation coefficient) based on a selected landmark using the plugin Template_Matching.jar (67). We used a rectangular piece of white tape on the black poster board as the landmark. To obtain the contours of the flies, we calculated the difference between the red and the green image component of the white light picture and applied a median and a Gaussian filter (radius = 3 pixels). After that, the picture was binarized using global thresholding (option: Max Entropy) (68). The binary image was post-processed (functions: Fill Holes, Open) before the position and the size of individual particles (= flies) were determined with the Analyze Particles method of ImageJ (minimum size = 750 pixels²). To account for positional shifts that were not corrected by the slice alignment (e.g., when the position of a fly changed slightly), the convex hull of each particle was calculated and enlarged by 20 pixels. A median filter (radius = 2 pixels) was applied to both fluorescent pictures before each particle (= fly) was scanned by a human investigator for the eye fluorescence pattern in both fluorescent pictures. We compared the image-based screening pipeline to the screening method using a stereo dissecting microscope and found that the estimated genotype frequencies deviated by no more than 1% from each other (n = 646 flies, 4 picture sets).
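The particle-detection step can be illustrated with a simplified, pure-Python sketch. It is not the in-house ImageJ macro and covers only part of the pipeline: it takes the difference of two channels, binarizes it, and keeps connected regions above a minimum size. Several choices are assumptions made for illustration: the function name `find_particles`, the use of the absolute channel difference, a fixed threshold passed by the caller (instead of Max Entropy thresholding), and 8-connectivity. The median and Gaussian filters, Fill Holes, Open, and the convex hull enlargement are omitted.

```python
from collections import deque


def find_particles(red, green, threshold, min_size):
    """Return connected regions of pixels where |red - green| > threshold.

    red, green: 2D lists of pixel intensities with identical shape.
    Each region is a list of (row, col) pixels; regions with fewer than
    min_size pixels are discarded.
    """
    height, width = len(red), len(red[0])
    # Binarize the channel difference with a fixed, caller-supplied threshold.
    mask = [[abs(red[y][x] - green[y][x]) > threshold for x in range(width)]
            for y in range(height)]
    seen = [[False] * width for _ in range(height)]
    particles = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over the 8-connected neighbourhood.
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(pixels) >= min_size:
                particles.append(pixels)
    return particles
```

With the settings described in the text, `min_size` would correspond to the minimum particle size of 750 pixels. The threshold has no counterpart in the text, because the original pipeline derives it from the image histogram.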

Phenotype data analysis, Cas9HF1 homing gene drive

When calculating drive parameters, we pooled offspring from the same type of cross together and calculated rates from the combined counts. A potential issue of this pooling approach is that batch effects could distort rate and error estimates (offspring were raised in separate vials with different parents). To account for such effects, we performed an alternate analysis as in previous studies (32, 45) by fitting a generalized linear mixed-effects model with a binomial distribution using the function glmer and a binomial link function (fit by maximum likelihood, Adaptive Gauss-Hermite Quadrature, nAGQ = 25). This allows for variance between batches, usually resulting in different rate estimates and increased error estimates. Offspring from a single vial were considered a distinct batch. This analysis was performed using the R statistical computing environment (v3.6.1) (69) with packages lme4 (1.1-21) and emmeans (1.4.2). The R script we used for this analysis is available on GitHub. The results were similar to the pooled analysis and are provided in Supplementary Data Sets S1-S2.
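The pooling step, and the reason batch effects matter, can be shown with a short sketch. It covers only the pooled calculation, not the mixed-effects model. The function names and the use of the simple binomial standard error are assumptions for illustration, since the text does not state how the pooled error was computed.

```python
import math


def pooled_rate(batches):
    """Pooled rate and binomial standard error from per-vial counts.

    batches: list of (successes, total) tuples, one per vial.
    """
    successes = sum(s for s, _ in batches)
    total = sum(t for _, t in batches)
    rate = successes / total
    # Assumption: simple binomial standard error of the pooled proportion.
    std_err = math.sqrt(rate * (1 - rate) / total)
    return rate, std_err


def per_batch_rates(batches):
    """Rate within each vial, useful for spotting heterogeneity."""
    return [s / t for s, t in batches]
```

For two hypothetical vials with counts `(8, 10)` and `(12, 30)`, the pooled rate is 20/40 = 0.5, while the per-vial rates are 0.8 and 0.4. The pooled standard error treats all 40 offspring as independent draws and therefore ignores this spread between vials, which is the kind of variation a model with a per-vial random effect is meant to capture.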


Flies were frozen, and DNA was extracted by grinding in 30 μL of 10 mM Tris-HCl pH 8, 1 mM EDTA, 25 mM NaCl, and 200 μg/mL recombinant proteinase K (ThermoScientific), followed by incubation at 37°C for 30 minutes and then 95°C for 5 minutes. The DNA was used as a template for PCR using Q5 Hot Start DNA Polymerase from New England Biolabs. The region of interest containing gRNA target sites was amplified using DNA oligo primers AutoDLeft_S2_F and AutoDRight_S2_R. PCR products were purified after gel electrophoresis using a gel extraction kit (Zymo Research). Purified products were Sanger sequenced and analyzed with ApE (

Fitness cost estimation framework

To estimate the fitness costs of the different transgenic constructs in our D. melanogaster cage experiments, we modified a previously developed maximum likelihood inference framework (42). Specifically, we extended the original model to a two-locus model, where the first locus represents the construct insertion site and the second locus represents an idealized cut site. In this model, cleavage at the cut site could represent in principle the effects of non-specific DNA modifications (“off-target” effects) as well as the effects of cleavage at the desired gRNA target site (i.e., target site activity). However, the latter is not expected to impose any fitness costs for our constructs due to the intergenic location of the target site. Thus, we refer to the idealized cut site as “off-target” site. At the construct locus, the two possible allele states are EGFP/construct (observed by fluorescence); at the off-target site, the two possible states are uncut/cut (not directly observed). The two loci are assumed to be autosomal and unlinked. Thus, there are nine possible genotype combinations an individual could have in our model. Unless stated otherwise, we assumed that the construct homozygotes used for the ancestral generation of a cage are cut/cut homozygotes at the idealized off-target site. Since the construct is not homing, the allelic state of a single individual cannot change at the construct locus. By contrast, the allelic state at the off-target locus can be altered by cutting events in the germline or in the early embryo phase. Germline cutting will only impact the genotype of offspring in the next generation, while embryo cutting will directly change the individual’s genotype and hence expose it to any potential fitness effects of this new genotype. Both the germline and embryo off-target cut rates were set to 1 in our model. 
This means that any uncut allele at the off-target locus will be cut in the germline if the individual carries at least one construct allele (germline cut rate = 1). Furthermore, individuals will become cut/cut homozygotes if their mother carried at least one construct allele (embryo cut rate = 1; we assume that maternally deposited Cas9/gRNA is present in all such embryos).
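The inheritance rules of this two-locus model can be sketched as follows. This is an illustrative reading of the description above, not the published inference code. The encoding of a genotype as a pair (number of construct alleles, number of cut alleles) and the function names are assumptions.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Genotype = (construct alleles, cut off-target alleles), each 0, 1, or 2.
GENOTYPES = [(c, k) for c in range(3) for k in range(3)]  # nine combinations


def gametes(parent):
    """Distribution over (construct allele?, cut allele?) in one gamete."""
    n_construct, n_cut = parent
    p_construct = Fraction(n_construct, 2)
    # Germline cut rate = 1: any construct carrier transmits only cut alleles.
    p_cut = Fraction(1) if n_construct > 0 else Fraction(n_cut, 2)
    dist = {}
    for c, k in product((0, 1), repeat=2):  # loci are unlinked
        p = (p_construct if c else 1 - p_construct) * (p_cut if k else 1 - p_cut)
        if p:
            dist[(c, k)] = p
    return dist


def offspring(mother, father):
    """Offspring genotype distribution for one cross."""
    dist = defaultdict(Fraction)
    for (mc, mk), pm in gametes(mother).items():
        for (fc, fk), pf in gametes(father).items():
            # Embryo cut rate = 1: a construct-carrying mother yields
            # cut/cut offspring regardless of the inherited alleles.
            n_cut = 2 if mother[0] > 0 else mk + fk
            dist[(mc + fc, n_cut)] += pm * pf
    return dict(dist)
```

The direction of the cross matters under these rules. A construct homozygous, cut/cut father crossed to an EGFP homozygous, uncut mother gives only `(1, 1)` offspring, whereas the reciprocal cross gives only `(1, 2)` offspring because of maternal deposition.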

A full inference model for the potential fitness costs of construct alleles and cut off-target alleles that includes all three previously implemented types of selection (mate choice, fecundity, viability) would feature a vast number of parameters that would be difficult to disentangle (42). For simplicity and to avoid overfitting, we therefore reduced model complexity with a series of assumptions: First, potential fitness costs were assumed to be equal for both sexes. Second, we either included only viability selection in the model, or included only mate choice (i.e., relative mating success for males with a particular genotype, reference value = 1) and fecundity selection (i.e., relative fecundity for females with a particular genotype, reference value = 1), both of equal magnitude. We further considered all fitness effects to be multiplicative across the two loci and for the two alleles at each locus (e.g., a construct homozygote would have a fitness equal to the square of a construct/EGFP heterozygote, given the same genotype at the off-target site). This results in two much more tractable inference models (viability and fecundity/mate choice) with only three parameters overall: the effective population size (Ne), the relative fitness of construct/EGFP heterozygotes versus EGFP homozygotes (the “direct fitness parameter”), and the relative fitness of cut/uncut heterozygotes versus uncut homozygotes (the “off-target fitness parameter”).
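The multiplicative fitness assumption can be made concrete with a short sketch. The function and parameter names are hypothetical. How selection enters the likelihood is not described here, so the second function shows only a generic reweighting of genotype frequencies by relative viability.

```python
def genotype_fitness(n_construct, n_cut, direct, off_target):
    """Relative fitness of a two-locus genotype under multiplicative effects.

    direct:     fitness of construct/EGFP heterozygotes vs EGFP homozygotes
    off_target: fitness of cut/uncut heterozygotes vs uncut homozygotes
    """
    return direct ** n_construct * off_target ** n_cut


def apply_viability_selection(freqs, direct, off_target):
    """Reweight genotype frequencies by relative viability and renormalize.

    freqs: dict mapping (n_construct, n_cut) to a frequency.
    """
    weighted = {g: f * genotype_fitness(g[0], g[1], direct, off_target)
                for g, f in freqs.items()}
    total = sum(weighted.values())
    return {g: w / total for g, w in weighted.items()}
```

With a direct fitness parameter of 0.9 and no off-target cost, a construct homozygote has relative fitness 0.9² = 0.81, matching the statement that the homozygote's fitness is the square of the heterozygote's.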


CRISPR/Cas9 systems provide a highly efficient and flexible genome editing technology with numerous potential applications in areas ranging from gene therapy to population control. Some proposed applications involve CRISPR/Cas9 endonucleases integrated into an organism’s genome, which raises questions about potentially harmful effects to the transgenic individuals. This is particularly relevant for CRISPR-based gene drives, which promise a mechanism for rapid genetic alteration of entire populations. The performance of such drives can strongly depend on fitness costs experienced by drive carriers, yet relatively little is known about the magnitude and causes of these costs. Here, we assess the fitness effects of genomic CRISPR/Cas9 expression in Drosophila melanogaster cage populations by tracking allele frequencies of four different transgenic constructs, designed to disentangle direct fitness costs due to the integration, expression, and target-site activity of Cas9 from costs due to potential off-target cleavage. Using a maximum likelihood framework, we find a moderate level of fitness costs due to off-target effects but do not detect significant direct costs. Costs of off-target effects are minimized for a construct with Cas9HF1, a high-fidelity version of Cas9. We further demonstrate that using Cas9HF1 instead of standard Cas9 in a homing drive achieves similar drive conversion efficiency. Our results suggest that gene drives should be designed with high-fidelity endonucleases and may have implications for other applications that involve genomic integration of CRISPR endonucleases.
