Materials and Methods

A Universal, Genomewide GuideFinder for CRISPR/Cas9 Targeting in Microbial Genomes

GuideFinder implementation. GuideFinder is written in the R programming language and is freely available. GuideFinder was written such that it can be used to find guides for both complete and draft genomes, with the recognition that many users may not have a complete genome for the organism of interest. The workflow of the program, including inputs and outputs, is outlined in Fig. 1.

Inputs and outputs. (i) Inputs. GuideFinder is capable of designing guides for both complete and draft genomes, although the inputs differ slightly.

Complete genome. For complete genomes, users simply supply the GenBank accession number and FASTA file.

Draft genome. Given the variable organization and notation of draft genomes, annotated draft genome files must be preprocessed before they are supplied to the program. Using the supplied preprocessing script, multisequence FASTA files (e.g., FASTA files containing sequence information for multiple contigs) are concatenated into a single sequence, with a series of N’s inserted between contigs. The coordinates of the coding sequences are then identified by aligning the coding sequences against the concatenated FASTA file using BLAST and adjusted to the format required by the main GuideFinder script (i.e., the smaller coordinate, designated the “start” coordinate). These coordinates are then supplied to the main script, along with the single-sequence FASTA file.

(ii) Outputs. There are two main outputs of the GuideFinder program: top hits and paired guides lists. Intermediate outputs, such as a list of all possible unfiltered guides, are also made available to the user for reference.

Top hits list. The top hits list is a list of guides preferentially selected based on their proximity to the transcription start site. The maximum number of guides supplied per gene is set by the user.

Paired guides list.
The paired guides list is a list of guide pairs designed to doubly target the same gene in the same cell to increase targeting efficiency. Suitable guide pairs are selected on the basis of the distance between the guides, a parameter set by the user.

Program workflow. (i) Coordinate identification. The identification of gene start and end coordinates is the first step in the GuideFinder workflow, and the methods differ slightly for complete versus draft genomes. For complete genomes, the script reads in the annotated genome file containing the gene coordinates and modifies the coordinates to include the putative promoter region. For draft genomes, the coordinates, identified during preprocessing, are supplied directly to the program and modified to include the putative promoter region.

(ii) Coding and promoter sequence retrieval. The gene start and end coordinates are used to retrieve the coding and putative promoter sequences from the FASTA file.

(iii) Guide creation. Searching within the promoter and gene body, the program identifies NGG PAM sites and utilizes the sequences around each site to create three guides (of lengths 20 bp, 21 bp, and 22 bp) per PAM site. The selection of various guide lengths increases the number of potential guides, many of which are lost to filtering, as described below.

(iv) Guide filtering. Guides are filtered according to default and user-defined parameters. By default, the program removes any guides that contain a homopolymer run of A’s or T’s and guides of inadequate length (<20 bp). A user-set threshold is used for filtering based on the maximum distance from the start site, as the targets closest to the transcriptional start site are the ones most likely to disrupt gene function. Guides can optionally be filtered to remove those containing user-set “bad seed” or restriction enzyme sequences and to minimize off-target effects.
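
The guide creation and default filtering steps above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the GuideFinder R code. Several choices are assumptions rather than details taken from the program: each guide is taken as the 20, 21, or 22 nt immediately 5′ of the PAM, only the forward strand is searched, and the homopolymer cutoff (`run=5`) is a hypothetical value, since the run length used by GuideFinder is not stated here. The function names are likewise hypothetical.

```python
import re


def find_guides(seq, lengths=(20, 21, 22)):
    """List candidate guides for every NGG PAM on the forward strand of seq.

    Assumption: a guide is the n nt immediately upstream (5') of the PAM.
    """
    seq = seq.upper()
    guides = []
    # A zero-width lookahead reports overlapping PAM sites (e.g. in "AGGG").
    for match in re.finditer(r"(?=[ACGT]GG)", seq):
        pam_pos = match.start()
        for n in lengths:
            start = pam_pos - n
            if start < 0:
                continue  # not enough upstream sequence for this length
            guides.append(
                {"pam_pos": pam_pos, "length": n, "guide": seq[start:pam_pos]}
            )
    return guides


def has_homopolymer(guide, bases="AT", run=5):
    """True if the guide contains `run` or more consecutive A's or T's.

    The cutoff of 5 is a hypothetical default, not a GuideFinder setting.
    """
    return any(base * run in guide for base in bases)


def default_filter(guides, min_len=20, run=5):
    """Drop guides shorter than min_len or containing an A/T homopolymer."""
    return [
        g
        for g in guides
        if g["length"] >= min_len and not has_homopolymer(g["guide"], run=run)
    ]


if __name__ == "__main__":
    example = "ACTG" * 5 + "AC" + "TGG"  # a single TGG PAM at index 22
    for g in default_filter(find_guides(example)):
        print(g["pam_pos"], g["length"], g["guide"])
```

Keeping all three lengths per PAM site until after filtering mirrors the rationale given above: a longer guide may be lost to a filter while a shorter one from the same site survives.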
For off-target filtering, the 12 nucleotides (nt) closest to and including the PAM site for each guide are aligned to the FASTA file, and guides that correspond to two or more locations in the genome are discarded. While the 12 nt proximal to the PAM (the seed sequence) are an established parameter of importance in off-target searching (25), off-target prediction should be experimentally validated for each bacterial species tested prior to large-scale guide design, as differences between species likely exist.

(v) Final guide selection. For each PAM site, the program selects the guide of the greatest length that meets the GC minimum set by the user. From these guides, two final guide lists are created, i.e., top hits and paired guides lists, which provide guides and guide pairs suitable for single-guide and dual-guide knockdown, respectively.

(vi) Iteration. The program identifies genes that did not produce any guides with the primary parameters. Users have the option to lower these thresholds and rerun these genes through the program to identify additional guides. Users can elect to reduce the GC minimum, increase the maximum guide distance from the transcription start site, retain guides that contain homopolymers, and relax off-target searching. Users can relax each of these guide design constraints individually or in combination.

Knockdown strain creation. For both species, knockdown strains were created as follows. For single-guide experiments, a single guide was designed by the GuideFinder program for targeting each gene. For triple-guide targeting (i.e., using three guides expressed in the same cell), the three guides closest to the transcription start site (TSS) were chosen for targeting each gene. The single-guide or triple-guide construct was ligated into our custom CRISPR/dCas9 shuttle vector.
Our CRISPR/dCas9 shuttle vector includes all of the necessary components for CRISPRi, including dCas9 (derived from pDB114dCas9 [26]) under the control of an anhydrotetracycline (ATc)-inducible promoter (derived from pRAB11 [27]), a dCas9 handle (CRISPR RNA [crRNA] and trans-activating small RNA [tracrRNA] fusion, custom designed), and a chloramphenicol (Cm) resistance marker (for selection). The triple-guide targeting vector is a modified version of our CRISPR/dCas9 single vector that enables insertions of multiple guides.

The shuttle vectors containing the proper targeting guides were transformed into E. coli, and the resultant colonies were screened for the guide sequence. A single positive-testing colony was grown in Trypticase soy broth (TSB) with chloramphenicol (TSB/Cm) overnight, and, using a QIAprep Spin Miniprep kit, plasmids were isolated from E. coli and transformed into the staphylococcal species of interest. For S. aureus, plasmids were transformed into competent S. aureus RN4220 cells via electroporation. For S. epidermidis, phagemid transfer was utilized to incorporate the plasmid into S. epidermidis strain Tu3298, according to a protocol described previously (28).

For multigene targeting, using host strain E. coli HME63 (a gift from Donald Court of the NIH), which possesses an ampicillin (Amp) resistance gene (CRISPRi target 1), we performed transformations with a plasmid bearing a kanamycin resistance gene (Kan) and a constitutive GFP gene (CRISPRi target 2). We then cloned either single guides targeting each gene individually or double guides targeting both genes simultaneously into our modified dCas9 vector. Six independent transformations were performed, the results were plated onto the appropriate selective agar plates, and CFU counts were performed for the Amp guide group. Single colonies from the GFP guide group were grown, and the fluorescence of the cultures was measured with a Cytation 3 imaging reader (BioTek).

Growth assays.
Growth assays were performed to assess knockdown of essential genes. The growth assays were performed in both S. aureus and S. epidermidis as follows. A single colony of each knockdown strain was grown overnight in TSB containing chloramphenicol. The overnight culture was diluted to an optical density (OD) of 0.05 in TSB/Cm, grown to an OD of 0.5, and diluted again at the start of the assay to an OD of 0.05 with TSB/Cm (control group) or TSB/Cm containing 0.1 μM anhydrotetracycline (inducer, experimental group). The cultures were grown for 16 h, with OD measurements taken each half-hour to construct a growth curve for each knockdown strain. For each strain, the induced/experimental group growth curve was compared to the uninduced/control group curve. Knockdown of most of the essential genes resulted in a severe growth defect, as expected. The knockdown of two genes, groEL and rpoC, did not result in the expected growth defect, and we investigated the ability of each guide to reduce transcript levels.

Measuring transcript levels. In S. aureus, we measured transcript levels of groEL and rpoC in cultures growing in liquid medium to determine if the selected guide was capable of reducing transcript levels. A single colony of each groEL and rpoC knockdown S. aureus strain was grown overnight in TSB/Cm at 37°C with shaking. The overnight culture was back-diluted to an OD of 0.05 and was grown at 37°C until an optical density at 600 nm (OD600) of 0.5 was reached. The culture was back-diluted again to an OD of 0.05 with TSB containing chloramphenicol and 0.1 μM anhydrotetracycline and was grown for 1.5 h; aliquots were taken at time points throughout the assay (h 0, 0.5, 1, and 1.5). An aliquot taken at each time point was mixed with 2 volumes of RNA Protect and incubated for 5 min at room temperature. The aliquot was spun down, and the supernatant was decanted and stored at −20°C until RNA extraction.
RNA from the four time points was extracted according to the protocol for an RNeasy Plus kit, with an added enzymatic digestion step performed using lysozyme and lysostaphin for lysis of the Gram-positive species S. aureus. RNA was reverse transcribed to create cDNA by the use of a High-Capacity cDNA reverse transcription kit (Applied Biosystems), according to the provided instructions. Quantitative PCR (qPCR) was performed using PowerUp SYBR green master mix (Applied Biosystems) in conjunction with gene-specific primers. Primers amplifying the ftsZ gene were used as an internal control, and nontemplate controls were included. Duplicate qPCR reactions were performed for each assay as technical replicates.

The genomes used for draft genome analysis were obtained from PATRIC. The strains used and their genome identifier (ID) numbers are as follows: Micrococcus luteus ATCC 12698 (Genome ID 1270.61), Micrococcus luteus O’Kane (Genome ID 1270.50), Staphylococcus aureus WBG10049 (Genome ID 585160.3), Staphylococcus aureus SA14-296 (Genome ID 46170.233), Staphylococcus epidermidis NLAE-zl-G239 (Genome ID 1282.2004), and Staphylococcus epidermidis FDAARGOS_83 (Genome ID 1282.1163).

Availability of and requirements for GuideFinder. Details of the availability of and requirements for GuideFinder are as follows: project name, GuideFinder; project home page, https://github.com/ohlab/Guide-Finder; operating system(s), Mac, Windows; programming language, R. Other requirements are as follows: license, none; restrictions for use by nonacademics, none.

Data availability. The genomes used for complete genome analysis were obtained from NCBI. The accession numbers for each strain are as follows: for Lactobacillus brevis, GenBank accession no. CP000416.1; for Lactobacillus jensenii, GenBank accession no. CP018809.1; for Staphylococcus epidermidis, GenBank accession no. AE015929.1; for Staphylococcus aureus, GenBank accession no. CP000253.1; for Rhizobium leguminosarum, GenBank accession no. CP007045.1; for Pseudomonas aeruginosa, GenBank accession no. AE004091.2; for Mycobacterium tuberculosis, GenBank accession no. AL123456.3; for Micrococcus luteus, GenBank accession no. CP001628.1; for Streptomyces scabiei, GenBank accession no. FN554889.1.


