Unless otherwise stated, all reagents were provided by Sigma. All primers/oligonucleotides and synthetic DNA used in this study are listed in Tables S5 and S6, respectively. The IA-alkyne probe, Azo-L and Azo-H tags were synthesized as previously described1,2.
Cell culture and parasite isolation
RH strain T. gondii tachyzoites were cultured by serial passage on confluent monolayers of human foreskin fibroblasts (HFF-1 ATCC® SCRC-1041™). HFFs were grown at 37°C and 5% CO2 in Dulbecco’s Modified Eagle’s Medium (DMEM) supplemented with 10% (v/v) heat-inactivated foetal bovine serum (FBS), 100 μg/ml penicillin/streptomycin and 2 mM L-glutamine. Unless otherwise stated, parasites were harvested for assays or transfection via mechanical syringe lysis of heavily infected HFFs through a 25-gauge needle.
Highly synchronized 3D7 strain P. falciparum asexual parasites were cultured in RPMI-1640 medium supplemented with 0.5% (w/v) AlbuMAX™ II (Life Technologies), 50 μg/ml hypoxanthine, 25 μg/l gentamycin and 0.3 mg/ml L-glutamine. Parasites were routinely cultured at 37°C and 5% CO2/3% O2 with 2% hematocrit blood (NHS UK Blood Transfusion Service). Media was exchanged daily until the culture reached 10-20% parasitemia with predominantly late trophozoites and early schizonts. Infected red blood cells (RBCs) were isolated by centrifugation (800 × g, 5 min) and lysed in RBC lysis buffer (45 mM HEPES pH 7.45, 100 mM potassium acetate, 1.5 mM magnesium acetate, 2 mM DTT and 0.075% (w/v) saponin) for 10 min at room temperature. The lysed RBCs were then centrifuged (2,800 × g and 4°C, 10 min), and the resulting parasite pellet was suspended in cell lysis buffer (45 mM HEPES pH 7.45, 100 mM potassium acetate, 1.5 mM magnesium acetate, 2 mM DTT). This step was repeated until all RBC debris was removed.
HEK 293F cells were cultured in FreeStyle™ 293 Expression Medium (Life Technologies) at 37°C and 5% CO2. Cells were harvested at a density of ~2×10^6/ml by centrifugation (1,000 × g for 10 min at 4°C) and washed once in cell lysis buffer supplemented with 20 U of human placental RNase inhibitor and cOmplete™ EDTA-free Protease Inhibitor Cocktail (Roche) prior to processing lysates. All parasite and host cell strains were confirmed negative for Mycoplasma contamination by PCR.
Plasmid design and construction
To construct pG140::TgHypo-3×HA, a recodonized TgHypo cDNA sequence fused to a C-terminal 3×HA tag was synthesized by GeneArt (Life Technologies). This fragment was cloned into the BamHI and HindIII sites of a modified version of the parental plasmid p5RT70loxPKillerRedloxPYFP-HX3, in which the TUB8 promoter had been deleted using the Q5 Site-Directed Mutagenesis Kit (NEB) protocol with primers P1/P2. Next, fragments encompassing the TgHypo 5’ or 3’ UTR were PCR amplified from genomic DNA of RHdiCreΔku80Δhxgprt parasites using primers P3/P4 and P5/P6, respectively. The 5’ UTR fragment was cloned into the NarI site of the intermediate plasmid, followed by the 3’ UTR fragment at the SacI site to generate pG140::TgHypo-3×HA.
To construct pSAG1::Cas9-U6::sgTgHypo(×2), Cas9 sgRNA sequences targeting the TgHypo 5’ or 3’ UTR were first selected using the Eukaryotic Pathogen gRNA Design Tool (EuPaGDT)4. Two single gRNA vectors containing either the 5’ or 3’ UTR-targeting gRNA were then generated using the pSAG1::Cas9-U6::sgUPRT plasmid as a backbone (Addgene #54467)5. Here, the parental UPRT-targeting gRNA was replaced with either TgHypo gRNA using the Q5 Site-Directed Mutagenesis Kit protocol with primers P7/P9 (5’ gRNA) and P8/P9 (3’ gRNA). Next, a fragment encompassing the 5’ gRNA was PCR amplified using primers P10/P11 and Gibson cloned6 into the KpnI- and XhoI-digested 3’ gRNA plasmid, generating pSAG1::Cas9-U6::sgTgHypo(×2).
All CORe plasmids were assembled by Biopart Assembly Standard for Idempotent Cloning (BASIC)7. To construct the pCORe recipient vector, three DNA parts (a Cas9 nuclease, hxgprt selectable marker and an mScarlett counterselection cassette) were generated with flanking BASIC Prefix and Suffix sequences. The Cas9 part was generated via PCR amplification of pCas9/Decoy (Addgene #80324)8 using primers P12/P13. The mScarlett part was synthesized by Twist (www.twistbioscience.com). The hxgprt part was amplified from pTUB1:YFP-mAID-3HA, DHFR-TS:HXGPRT (Addgene #87259)9 using primers P14/P15. Prior to amplification, two internal BsaI sites in the DHFR UTRs of the hxgprt cassette were removed using the Q5 Site-Directed Mutagenesis Kit with primers P16/P17 and P18/P19. The resulting DNA parts were cloned into an ampR-p15A backbone in a four-part BASIC reaction, forming pCORe. All BASIC linkers used in the assemblies were synthesized by Biolegio and are listed in Table S6.
All transfections were performed by electroporation using an Amaxa 4D-Nucleofector (Lonza) with program ‘F1-115’. Transfections were carried out using freshly harvested extracellular tachyzoites in P3 buffer (5 mM KCl, 15 mM MgCl2, 120 mM Na2HPO4/NaH2PO4 pH 7.2, 50 mM D-mannitol).
Stable parasite line generation
To generate the inducible knockout strain for TgHypo (here referred to as RH TgHypoiKO), 10 μg of ScaI-linearised pG140::TgHypo-3×HA was co-transfected with 10 μg of pSAG1::Cas9-U6::sgTgHypo(×2) into 5×10^6 RHdiCreΔku80Δhxgprt parasites10. Transgenic parasites were selected with 25 μg/ml mycophenolic acid (MPA) and 50 μg/ml xanthine (XAN) 24 hours post-transfection, and individual resistant clones were obtained by limiting dilution. Successful 5’ and 3’ integration of the DNA construct at the endogenous TgHypo locus was confirmed by PCR using primers P20/P21 and P22/P23, respectively. Disruption of the endogenous TgHypo locus was confirmed using primers P24/P25. Rapamycin-induced excision of the integrated TgHypo iKO construct was verified using primers P26/P27.
Inducible knockout of TgHypo
Confluent HFF monolayers in T25 flasks were infected with ~2-5×10^6 parasites for 4 hours prior to treatment with 50 nM rapamycin or an equivalent volume of vehicle (DMSO) for 4 hours. After washout, parasites were grown for at least 24 hours prior to PCR or western blot analysis.
SDS-PAGE and western blot analysis
Extracellular parasites were lysed in RIPA buffer (150 mM NaCl, 50 mM Tris-HCl (pH 8.0), 1% Triton X-100, 0.5% sodium deoxycholate, 0.1% SDS, 1 mM EDTA) supplemented with cOmplete™ Protease Inhibitor Cocktail (Roche) for 1 hour on ice. Lysates were then centrifuged (21,000 × g, 30 min at 4°C), and protein concentration in the supernatant was quantified using the Pierce™ BCA Protein Assay Kit (Thermo Scientific). Laemmli buffer was added to the lysate to 1× concentration (2% SDS, 10% glycerol, 5% 2-mercaptoethanol, 0.002% bromophenol blue and 125 mM Tris HCl, pH 6.8), and samples were boiled (95°C, 5 min) before separation by SDS-PAGE on 12% polyacrylamide gels. Thirty micrograms of protein were typically loaded per lane. Proteins were transferred (20 V, 1 min; 23 V, 4 min; 25 V, 2 min) to nitrocellulose membranes using an iBlot 2 Dry Blotting System (Invitrogen). Membranes were briefly washed in PBS-T (0.1% Tween-20/PBS), blocked (5% skimmed milk/PBS-T, 1 hour) and incubated with primary antibodies (1% BSA/PBS-T, overnight at 4°C) at the following dilutions: mouse anti-SAG1 (1:1000, Thermo Scientific) and rat anti-HA (1:1000, Roche). Following washing (PBS-T, 3×), membranes were incubated with HRP-conjugated secondary antibodies (1:5000, Thermo Scientific) in 1% BSA/PBS-T for 1 hour at room temperature. Protein bands were developed using the ECL™ Western Blotting Detection Reagent (GE Healthcare) and chemiluminescence was visualized using a ChemiDoc MP Imaging System (Bio-Rad).
Confluent HFF monolayers grown on glass coverslips were seeded with ~100,000 parasites. Approximately 24 hours post-infection, cells were fixed (4% paraformaldehyde for 15 min at room temperature), permeabilized (0.1% Triton X-100/PBS for 5-10 min) and blocked (3% BSA/PBS for 1 hour at room temperature). Staining was performed for 1 hour with primary antibodies at the following dilutions: mouse anti-SAG1 (1:1000, Thermo Scientific), rabbit anti-HA (1:1000) and anti-Ty1 (1:1000, Baum Lab). Labelled proteins were stained for 1 hour at room temperature using Alexa Fluor 488/594-conjugated goat antibodies (1:2000, Life Technologies). Nuclei were stained using the intercalating DNA dye DAPI at 5 μg/ml. Stained coverslips were mounted onto glass slides using VECTASHIELD® Antifade Mounting Media (Vector Labs) and imaged on a Nikon Ti-E inverted microscope. Images were acquired using an ORCA-Flash 4.0 camera and processed using ImageJ software.
Confluent HFF monolayers grown in 6-well plates were seeded with 200-400 parasites. Parasites were allowed to invade overnight prior to treatment with 50 nM rapamycin or DMSO for 4 hours. Following replacement with standard culture medium, plaques were left to form undisturbed for 6-7 days. Monolayers were then fixed with ice-cold methanol for 10 min and stained with crystal violet stain (2.3% crystal violet, 0.1% ammonium oxalate, 20% ethanol) for 2 hours. Plaques were enumerated manually, and differences in plaque counts between rapamycin- and DMSO-treated samples were tested for statistical significance using two-tailed unpaired Student’s t-tests with unequal variance. The data are presented as mean (±SD) counts.
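An unpaired t-test with unequal variance is commonly known as Welch's t-test. As a minimal sketch of the underlying arithmetic (the function name and the example plaque counts are illustrative and not taken from this study), the test statistic and the Welch-Satterthwaite degrees of freedom can be computed as follows. The p-value lookup from the t-distribution is deliberately omitted; in this study the tests were run in GraphPad Prism.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom.

    a, b: sequences of replicate plaque counts (at least two values each).
    """
    # Squared standard error of each group mean (sample variance / n)
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical plaque counts, for illustration only
    rapamycin = [10, 12, 14]
    dmso = [20, 22, 24]
    t_stat, dof = welch_t(rapamycin, dmso)
    print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```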
Design and optimisation of the CORe platform
The design of the CORe workflow begins with the identification and selection of paired CRISPR guide RNA (gRNA) sequences that target the Cas9 nuclease to sites 5’ and 3’ of a target cysteine codon. As demonstrated in Caenorhabditis elegans11, we reasoned that a dual gRNA strategy would provide positive selection towards HDR-mediated integration of mutational templates for our essential gene subset, as the lack of repair of two double-strand breaks (DSBs) in an essential gene should be refractory to growth. To test this hypothesis, the frequency of mutants following mutagenesis of an N-terminal proline codon in the surface antigen 1 gene (SAG1) was compared using single or dual gRNAs in combination with single- or double-stranded donor repair templates (Fig. S7a). These experiments revealed that dual gRNAs in combination with double-stranded templates provided the highest integration efficiency in the absence of any selectable marker. As anticipated in the absence of drug selection, the frequency of mutants was low (Fig. S7b). The potential negative impact of this upon quantitation of integration events was circumvented through the inclusion of recodonized sequence within the donor template. This allowed for integration-selective priming and therefore generation of PCR amplicons of modified genomic loci for downstream NGS analyses (Fig. 2a, S7c). The protein-centric CRISPR guide design tool, CRISPR-TAPE12, was used to simplify and accelerate the gRNA identification and selection process for target cysteines. Accommodating the need for high-throughput multiplexed vector construction, BASIC7 was adapted to our sequences and used for facile, modular and scalable production of all transfection vectors, with dual gRNA cassettes and Cas9 encoded on the same vector as previously reported (Fig. S3a)8,13. The RHΔku80 NHEJ-deficient parasite strain was used to further promote HDR14.
Donor repair templates were designed to 1) destroy the protospacer adjacent motif (PAM) and/or gRNA seed sequence required for Cas9 targeting and so prevent further modification of the site following integration; 2) provide a recodonized stretch of sequence proximal to the target cysteine for the generation of integration-specific amplicons at mutated sites. Transfection with the dual gRNA vector introduces DSBs 5’ and 3’ of the target cysteine. The excised locus is subsequently repaired using one of the donor templates, producing a mixed mutant pool, which is sampled shortly after transfection for subsequent genomic DNA extraction (‘Pre’ sample) (Fig. 2a). For each reactive cysteine candidate, T. gondii tachyzoites are co-transfected with a single cysteine-targeting dual gRNA plasmid and all five donor templates for HDR (Fig. 2a). The repair templates encoded for either a WT synonymous replacement of the target cysteine, a stop codon, or one of the three amino acid substitution options.
Following transfection, the mixed population of mutants grows competitively and is sampled for genomic extraction (‘Post’ sample) (Fig. 2a). Where the DSB is repaired using the synonymous WT template, parasites are expected to grow normally. In instances where the stop codon template is integrated, the gene coding sequence (CDS) is disrupted, with parasite growth anticipated to be attenuated equivalent to a knockout15. After quantitative deep sequencing of integration-specific amplicons encompassing a target cysteine, the frequency of reads for a given mutant in the Post sample (fPost) is normalized to Pre (fPre) to derive fitness scores (Fs) that reflect the viability of parasites during competitive lytic growth. The Fs values for the amino acid mutants are benchmarked against the synonymous WT and stop codon mutants. This provides a quantitative assessment of the contribution of an individual cysteine to protein function in live cells, using mutant cell fitness as a measurable phenotype and NGS reads as the readout. Multiplexing of CRISPR vector construction with BASIC, 96-well plate-based transfections, and an automated NGS sample preparation workflow enable hundreds of targets to be functionally interrogated in parallel.
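The normalization of Post to Pre read frequencies can be illustrated with a short sketch. It assumes that the fitness score is the simple ratio of the Post frequency to the Pre frequency for each variant; the text states only that Post is normalized to Pre, so the exact form (for example, a log-transformed ratio) is an assumption here. The function name, variant labels and read counts are hypothetical.

```python
def fitness_scores(pre_counts, post_counts):
    """Compute a per-variant fitness score as f_Post / f_Pre.

    pre_counts, post_counts: dicts mapping variant name -> read count.
    Assumes a plain frequency ratio; this is an illustrative choice.
    """
    pre_total = sum(pre_counts.values())
    post_total = sum(post_counts.values())
    scores = {}
    for variant, pre in pre_counts.items():
        f_pre = pre / pre_total
        f_post = post_counts.get(variant, 0) / post_total
        # A variant absent from the Pre sample cannot be normalized
        scores[variant] = f_post / f_pre if f_pre > 0 else float("nan")
    return scores


if __name__ == "__main__":
    # Hypothetical read counts for the five templates at one cysteine
    pre = {"WT": 200, "Ala": 200, "Ser": 200, "Tyr": 200, "Stop": 200}
    post = {"WT": 400, "Ala": 300, "Ser": 200, "Tyr": 90, "Stop": 10}
    for name, fs in fitness_scores(pre, post).items():
        print(f"{name}: Fs = {fs:.2f}")
```

In this toy example the synonymous WT and stop codon variants bracket the range against which the alanine, serine and tyrosine scores would be compared.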
CORe plasmid and template library design and construction
Guide RNAs were searched against the T. gondii GT1 genome (release 46; www.toxodb.org) using the ‘position-specific’ function of CRISPR-TAPE (version 1.0.0)12. Briefly, gRNAs binding in near proximity of a target cysteine codon were identified by applying a search distance threshold of ±200 nt. For each codon, two gRNAs binding at sites 5’ and 3’ of the residue were then selected. Selection criteria were based on the number of potential off-target sequences, %GC content and the ability to introduce synonymous PAM or guide blocking mutations at the target genomic sequence. gRNAs were synthesized by Twist as a fragment containing a U6 promoter and flanking BASIC Prefix and Suffix sequences, and independently cloned into BsaI sites of a kanR-pMB1 storage plasmid, pTwist Kan (High Copy). For each target cysteine, the corresponding 5’ and/or 3’-binding gRNA fragments were subcloned into pCORe in a three-part BASIC reaction, replacing the mScarlett counterselection cassette and generating the pCORe-CRISPR plasmid. The sequences of all gRNA fragments are listed in Table S6.
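The paired-guide selection logic described above can be sketched as follows. This is not the CRISPR-TAPE implementation or its API: the `Guide` record, the ranking rule (fewest off-targets first, then GC content closest to 50%) and the example guides are all assumptions made for illustration. Coordinates are treated as positions along the sense strand, and the check for whether blocking mutations can be introduced is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Guide:
    sequence: str      # 20-nt protospacer (hypothetical example data)
    cut_site: int      # predicted cut position along the gene
    off_targets: int   # number of predicted off-target sites


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def pick_guide_pair(guides, codon_pos, window=200):
    """Pick one guide 5' and one guide 3' of a target codon.

    Only guides cutting within +/- `window` nt of the codon are considered.
    Returns (upstream_guide, downstream_guide), or None if either side
    has no candidate.
    """
    nearby = [g for g in guides if abs(g.cut_site - codon_pos) <= window]

    def rank(g):
        # Assumed ranking: fewest off-targets, then GC closest to 50%
        return (g.off_targets, abs(gc_fraction(g.sequence) - 0.5))

    upstream = [g for g in nearby if g.cut_site < codon_pos]
    downstream = [g for g in nearby if g.cut_site > codon_pos]
    if not upstream or not downstream:
        return None
    return min(upstream, key=rank), min(downstream, key=rank)


if __name__ == "__main__":
    candidates = [
        Guide("GACGTACGTTAGCCATGCAA", 900, 2),
        Guide("GGGCGCGCCGGGCCGCGGCC", 950, 0),
        Guide("ATGCATGCATGCATGCATGC", 1100, 1),
        Guide("ATATATATATATATATATAT", 1500, 0),  # outside the window
    ]
    print(pick_guide_pair(candidates, codon_pos=1000))
```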
Donor templates for mutation of target cysteines were synthesized as 300 bp double-stranded fragments by Twist. For the SAG1 experiments, 70 bp single-stranded oligonucleotides (P28-P27) were used and hybridized to generate double-stranded templates. For each cysteine codon, five templates were designed, each incorporating a single unique mutation: a recodonized cysteine codon, an alanine, serine or tyrosine codon, or a stop codon. Mutation sites were flanked by regions of synonymous recodonized sequence to (1) enable specific detection of cysteine mutants by PCR, and (2) introduce blocking mutations at the PAM and/or gRNA seed sequence to prevent re-excision of modified genomic loci. Recodonisation was avoided or minimised at intron-exon junctions to avoid interference with mRNA splicing. Homology regions were incorporated on either end of templates to promote genomic integration of mutational templates by HDR. The sequences of all mutational templates are listed in Table S6.
CORe mutagenesis screens
Transfections were carried out in 16-well Nucleocuvette™ strips using the Amaxa 4D-Nucleofector X-Unit (Lonza). For the optimized CORe screen, 7 μg of pCORe-CRISPR and 0.2 μg of each of the five corresponding mutational templates (equivalent to a ~1:5 plasmid-to-template molar ratio) were co-transfected into 1×10^6 RHΔku80Δhxgprt parasites14. For the SAG1 experiments, 6 μg of pCORe-CRISPR and 2 μg of a single template were transfected (~1:100 plasmid-to-template molar ratio). Transfected parasites were expanded in HFF monolayers grown in 24-well plates and allowed to egress naturally three days after infection. Approximately 2×10^6 of the egressed parasites were used to infect confluent HFF monolayers in 6-well plates, and the remaining parasites (~2×10^6) were pelleted and frozen for genomic DNA extraction as the initial ‘Pre’ mutant population. Parasites were allowed to egress naturally five days after infection and similarly harvested as the ‘Post’ mutant population. Parasite genomic DNA from frozen cell pellets was extracted using the DNeasy Blood & Tissue Kit (Qiagen) for downstream NGS library preparation.
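The plasmid-to-template molar ratios quoted above follow from converting DNA mass to moles. The sketch below uses the common approximation of ~650 g/mol per base pair of double-stranded DNA. The plasmid length of 10,500 bp is a placeholder chosen for illustration; the size of pCORe-CRISPR is not stated in this section, so the printed ratio should be read as an example of the arithmetic rather than a verified value.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (approximation)


def pmol_dsdna(mass_ug, length_bp):
    """Convert a mass of dsDNA (micrograms) to picomoles."""
    return mass_ug * 1e6 / (length_bp * AVG_BP_MASS)


def template_to_plasmid_ratio(plasmid_ug, plasmid_bp,
                              template_ug_each, template_bp, n_templates):
    """Molar ratio of total donor template to plasmid."""
    plasmid = pmol_dsdna(plasmid_ug, plasmid_bp)
    templates = n_templates * pmol_dsdna(template_ug_each, template_bp)
    return templates / plasmid


if __name__ == "__main__":
    # 7 ug plasmid plus five 300 bp templates at 0.2 ug each;
    # the 10,500 bp plasmid length is a hypothetical placeholder.
    ratio = template_to_plasmid_ratio(7, 10_500, 0.2, 300, 5)
    print(f"template:plasmid molar ratio = {ratio:.1f}:1")
```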
Illumina library preparation, sequencing and data analysis
Genomic DNA libraries were prepared similarly to the 16S Metagenomic Sequencing Library Preparation guide (Illumina). Briefly, for each target cysteine, a ~600-800 bp fragment targeting the modified genomic locus was PCR amplified from parasite DNA. For the SAG1 experiments, the amplicons were designed to encompass the template integration site of both modified and unmodified loci. All primers were designed to include overhanging Illumina adapter sequences and are listed in Table S5 (P32-P181). The resulting amplicon was purified using AMPure XP magnetic beads (Beckman Coulter). Dual indices and sequencing adapters were then ligated to the purified products using the Nextera XT Index Kit (Illumina). Indexed amplicons were then purified using AMPure XP beads, and quantified using the Qubit™ dsDNA HS/BR Assay Kits (Invitrogen), or the QuantiFluor ONE dsDNA System (Promega). Indexed amplicons were pooled at equimolar concentration, and the size and purity of the resulting library was assessed on a TapeStation 2200 with the D1000 ScreenTape System (Agilent). The transfer of reagents used for the purification and indexing of amplicons was performed using acoustic liquid handling (Echo 525, Labcyte). Pooled libraries were sequenced using an Illumina NextSeq 500 75PE Mid Output run with a PhiX spike-in of 10%. Following acquisition, sequencing data were demultiplexed using CASAVA 2.17 and analyzed using the Galaxy web server (www.usegalaxy.org). For each uniquely indexed sample, the sequences were concatenated and separated by each template variant to determine the read counts of the different mutation types. The change in frequency of each mutant variant was calculated by normalizing the percent proportion of reads in the Post population sample to the Pre. The differences in normalized read frequency of the nonsynonymous mutations were statistically tested against the recodonized cysteine mutation by one-way analysis of variance (ANOVA).
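The step of separating reads by template variant and expressing each as a percentage of the sample can be sketched in a few lines. This is not the Galaxy workflow used in the study: it assumes that each variant can be recognized by an exact match to a short variant-specific signature sequence spanning the mutated codon, and the signatures and reads shown are invented for illustration. Reads matching no signature, or more than one, are ignored.

```python
from collections import Counter


def count_variants(reads, signatures):
    """Count reads per template variant by exact signature matching.

    reads: iterable of read sequences.
    signatures: dict mapping variant name -> variant-specific sequence.
    """
    counts = Counter({name: 0 for name in signatures})
    for read in reads:
        hits = [name for name, sig in signatures.items() if sig in read]
        if len(hits) == 1:  # skip unassignable or ambiguous reads
            counts[hits[0]] += 1
    return counts


def percent_proportions(counts):
    """Express variant counts as a percentage of all assigned reads."""
    total = sum(counts.values())
    return {k: (100.0 * v / total if total else 0.0) for k, v in counts.items()}


if __name__ == "__main__":
    # Hypothetical signatures differing only at the target codon
    signatures = {"Cys": "AAGTGTCTT", "Ala": "AAGGCTCTT", "Stop": "AAGTAACTT"}
    reads = ["GGAAGTGTCTTCC", "GGAAGGCTCTTCC", "GGAAGTGTCTTAA", "GGGGGGGG"]
    counts = count_variants(reads, signatures)
    print(dict(counts))
    print(percent_proportions(counts))
```

Running the same counting on the Pre and Post samples gives the two sets of percentages that are then compared for each variant.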
Cysteine labelling and click chemistry
Cell pellets of T. gondii RHΔku80Δhxgprt parasites were lysed by sonication in PBS (pH 7.4) and soluble fractions separated by centrifugation at 3,500 × g for 5 min. Protein concentrations were determined using the DC Protein Assay Kit (Bio-Rad) and a SpectraMax M2e Microplate Reader (Molecular Devices). Proteome samples diluted to 2 mg/ml were treated with 10 or 100 μM IA-alkyne (from 1 mM and 10 mM stocks in DMSO, respectively) and incubated for 1 hour at room temperature with rotation. The labelled proteins were then subject to click chemistry by addition of 100 μM Azo-L or Azo-H, 1 mM TCEP, 100 μM TBTA, and 1 mM CuSO4 (final concentrations). Click reactions were incubated for 1 hour at room temperature with shaking. The Azo-L/H-labelled protein samples were then precipitated by adding trichloroacetic acid (TCA) to 10% (v/v) concentration. After overnight storage at −80°C, precipitated proteins were pelleted by centrifugation (15,000 rpm, 10 min), washed 3× with chilled MeOH and resolubilized in 1.2% SDS in PBS by gentle sonication and heating (80°C, 10 min).
Enrichment and on-bead digestion
Labelled proteome samples were diluted to 0.2% SDS with PBS. The resulting samples were then added to 100 μl of Pierce™ Streptavidin beaded agarose resin (Thermo Scientific) and incubated overnight at 4°C followed by a further 2 hours at room temperature. Protein-bound beads were washed 1× with 0.2% SDS in PBS, 3× with PBS and 3× with H2O before resuspending in 6 M urea in PBS + 10 mM DTT and incubating at 65°C for 15 min. Reduced samples were then alkylated by adding iodoacetamide to a final concentration of 20 mM and incubating for 30 min at 37°C with rotation. Samples were diluted 3-fold with PBS and centrifuged (1400 × g, 2 min) to pellet the beads. The beads were resuspended in a mixture of 200 μl of 2 M urea in PBS, 1 mM CaCl2 and 2 μg trypsin and incubated overnight at 37°C. The beads were separated from the digest by centrifugation and washed 3× with PBS and 3× with H2O. Azo-labelled peptides were then cleaved by adding 50 mM sodium hydrosulfite (Na2S2O4) and rotating at room temperature for 1 hour. Eluted peptides were then collected from the supernatant, and Na2S2O4 cleavage was repeated twice more to fractionate the sample. Between each cleavage, the beads were washed 2× with H2O and the washes were combined with the previous elution. Formic acid was added to the sample to 20% (v/v) concentration before storing at −20°C until mass spectrometry analysis.
LC/LC-MS/MS analysis, peptide identification and quantification
LC-MS/MS analysis was performed on an LTQ-Orbitrap Discovery mass spectrometer (Thermo Scientific) coupled to an Agilent 1200 Series HPLC. Azo digests were pressure loaded onto 250 μm fused silica desalting columns packed with 4 cm Aqua C18 reverse phase resin (Phenomenex). Peptides were then eluted onto a biphasic column consisting of 100 μm fused silica packed with 10 cm C18 and 4 cm PartiSphere SCX resin (Whatman) following a five-step multidimensional LC/LC-MS/MS protocol (MudPIT)1. Each step used a salt push (0%, 50%, 80%, 100%, 100%) followed by an elution gradient of 5-100% Buffer B in Buffer A (Buffer A: 95% H2O, 5% MeCN, 0.1% formic acid; Buffer B: 20% H2O, 80% MeCN, 0.1% formic acid) at a flow rate of 250 nl/min. Eluted peptides were injected into the mass spectrometer by electrospray ionization (spray voltage set at 2.75 kV). For every MS1 survey scan (400-1800 m/z), 8 data-dependent scans were run for the nth most intense ions with dynamic exclusion enabled.
The generated tandem MS data were searched using the SEQUEST algorithm16 against the T. gondii database (GT1 proteome), ToxoDB (http://toxodb.org/). A static modification of +57.02146 on cysteine was specified to account for alkylation with iodoacetamide. Variable modifications of +456.2849 and +462.2987 were further assigned on cysteine to account for the probe modification with the isotopically light (Azo-L) and heavy (Azo-H) variant of the IA-alkyne-Azo adduct, respectively. Output files from SEQUEST were filtered using DTASelect 2.0. Quantification of isotopic light:heavy ratios was performed using the CIMAGE quantification package as previously described17. Overlapping tryptic peptides containing the same labelled cysteine (but different charge states or tryptic termini) were grouped and the median reported as the final light:heavy ratio (R). R values were averaged across biological replicates, and peptides with relative standard deviations ≥ 50% of the R value were removed.
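The grouping and filtering of light:heavy ratios can be illustrated with a minimal sketch. It is not the CIMAGE package: the function names and example ratios are hypothetical, the sample standard deviation is assumed for the relative standard deviation, and sites measured in a single replicate are kept unfiltered as an arbitrary choice for this example.

```python
from statistics import mean, median, stdev


def site_ratio(peptide_ratios):
    """Median light:heavy ratio across overlapping peptides for one cysteine."""
    return median(peptide_ratios)


def filter_sites(replicate_ratios, max_rsd=0.5):
    """Average R across replicates and drop sites with RSD >= max_rsd.

    replicate_ratios: dict mapping site ID -> list of per-replicate R values.
    Returns a dict of site ID -> mean R for the retained sites.
    """
    kept = {}
    for site, values in replicate_ratios.items():
        avg = mean(values)
        # RSD = sample standard deviation divided by the mean
        rsd = stdev(values) / avg if len(values) > 1 and avg > 0 else 0.0
        if rsd < max_rsd:
            kept[site] = avg
    return kept


if __name__ == "__main__":
    # Hypothetical ratios, for illustration only
    print(site_ratio([1.0, 1.4, 9.0]))
    print(filter_sites({"C101": [1.0, 1.2, 1.1], "C202": [1.0, 5.0, 9.0]}))
```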
Bioinformatics analysis of reactive cysteine dataset
Functional annotation of reactive cysteine proteins was carried out using BLASTP, Gene Ontology (GO) and InterPro searches within Blast2GO 5 PRO software18. Consensus protein sequences were BLASTP searched against the non-redundant (nr) NCBI protein database using an E-value cut-off of 10^-6. GO terms (molecular function, biological process and subcellular localization) were then mapped from the top 20 hits and merged with annotations derived from the InterPro database (www.ebi.ac.uk/interpro). Assignments were further optimized using Annex augmentation19. Enrichment of annotations was assessed using a Fisher’s exact test against the T. gondii proteome (strain GT1; UniProt Taxonomy ID 507601) at < 0.05 FDR.
For conservation analyses of reactive cysteines, orthologues of the associated protein were identified from orthologue groups classified on OrthoMCL20. Conservation of a given residue was assessed following BLASTP alignment of the orthologous protein sequence against the T. gondii template sequence. Scores were assigned to each alignment based on the presence or absence of a matched cysteine: a score of 3 was assigned to conserved cysteines, 1 for no conservation, and 0 if no protein was identified in the orthologue group for a given species. Conservation scores were determined for each cysteine by summing the scores across the analyzed species.
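The scoring scheme above reduces to a lookup and a sum. In the sketch below the point values (3, 1, 0) come from the text, while the status labels, function name and the species in the example are hypothetical; the set of species analyzed is not listed in this section.

```python
# Points per species, as described in the text
POINTS = {"conserved": 3, "not_conserved": 1, "no_orthologue": 0}


def conservation_score(status_by_species):
    """Sum the per-species scores for one cysteine.

    status_by_species: dict mapping species name -> one of the POINTS keys.
    """
    return sum(POINTS[status] for status in status_by_species.values())


if __name__ == "__main__":
    # Hypothetical alignment outcomes for a single cysteine
    example = {
        "Plasmodium falciparum": "conserved",
        "Cryptosporidium parvum": "not_conserved",
        "Neospora caninum": "conserved",
        "Eimeria tenella": "no_orthologue",
    }
    print(conservation_score(example))  # 3 + 1 + 3 + 0 = 7
```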
In vitro translation (IVT) assay
Pellets of P. falciparum 3D7 or HEK 293F cells were suspended in 1× pellet volume of lysis buffer supplemented with 20 U of human placental RNase inhibitor and cOmplete™ EDTA-free Protease Inhibitor Cocktail (Roche). Resuspended parasites were then transferred to a prechilled nitrogen cavitation chamber (Parr Instrument Company) and incubated on ice at 1500 PSI for 60 min. Following release from the chamber, the crude lysate was clarified by differential centrifugation (15 min at 10,000 × g and 4°C, followed by 15 min at 30,000 × g and 4°C). Protein concentration was determined using a NanoDrop (Thermo Scientific) at 280 nm and adjusted to 12 mg/ml prior to storage at −80°C. Prior to performing in vitro translation assays, low-bind 384-well plates (Corning) were printed (D300e Digital Dispenser, Tecan) with compounds dissolved in DMSO to be assayed at 0.5% of the total assay volume. Five microlitres of P. falciparum clarified lysate was then added to each well, followed by 4.5 μl L-amino acids (each at 200 μM in 45 mM HEPES pH 7.45, 100 mM potassium acetate, 1.5 mM magnesium acetate, 2 mM DTT, 20 U human placental RNase inhibitor, 15 μM leupeptin, 1.5 mM ATP, 0.15 mM GTP, 40 U/ml creatine phosphokinase and 4 mM creatine phosphate (Thermo Scientific), 2% (w/w) PEG3000, 1 mM spermidine and 0.5 mM folinic acid) and 0.45 μl of purified red click-beetle luciferase (CBG99) mRNA (1 μg/μl). CBG99 mRNA was transcribed from expression plasmids pH-CBG99-H (for use in P. falciparum assays) or pT7CFECBG99 (HEK 293F assays) as previously described21. Prepared plates were incubated at 32°C for 1 hour 40 min before adding 10 μl of 45 mM HEPES pH 7.45, 1 mM magnesium chloride, 1 mM ATP, 5 mM DTT, 1% (v/v) Triton-X, 10 mg/ml BSA, 1× Reaction Enhancer (Thermo Scientific), 1 mg/ml D-luciferin (Thermo Scientific) and 0.5 mM cycloheximide. Luminescence was measured across each well using a Tecan M200 Infinite Pro microplate reader heated to 37°C.
Protein structures and homology modelling
Solved protein structures were downloaded from the RCSB PDB (www.rcsb.org). Homology models were predicted from primary protein sequences using the Phyre2 web portal22; only models constructed with 100% confidence and ≥ 40% sequence identity across ≥ 70% of the sequence were used. Structural images were generated using PyMOL software (version 2.1.1; Schrödinger LLC).
Statistical tests were performed using GraphPad Prism 8.0 as described in the individual experimental sections above. P-value significance thresholds were set at: **** = p < 0.0001, *** = p < 0.001, ** = p < 0.01 and * = p < 0.05. All significant results are annotated with a line and asterisk(s) in the graphs.
Schematics were created using Adobe Illustrator (version 22.1) and Inkscape (version 0.92.3). Chemical structures were drawn in ChemDraw Professional (version 18.0). PyMOL (version 2.1.1) was used to generate 3D protein structures.
Nucleophilic amino acids are important in covalent drug development yet underutilized as antimicrobial targets. Over recent years, several chemoproteomic technologies have been developed to mine chemically-accessible residues via their intrinsic reactivity toward electrophilic probes. However, these approaches cannot discern which reactive sites contribute to protein function and should therefore be prioritized for drug discovery. To address this, we have developed a CRISPR-based Oligo Recombineering (CORe) platform to systematically prioritize reactive amino acids according to their contribution to protein function. Our approach directly couples protein sequence and function with biological fitness. Here, we profile the reactivity of >1,000 cysteines on ~700 proteins in the eukaryotic pathogen Toxoplasma gondii and prioritize functional sites using CORe. We competitively compared the fitness effect of 370 codon switches at 74 cysteines and identify functional sites in a diverse range of proteins. In our proof of concept, CORe performed >800 times faster than a standard genetic workflow. Reactive cysteines decorating the ribosome were found to be critical for parasite growth, with subsequent target-based screening validating the apicomplexan translation machinery as a target for covalent ligand development. CORe is system-agnostic, and supports expedient identification, functional prioritization, and rational targeting of reactive sites in a wide range of organisms and diseases.