Use of genomic information for bioinformatic analysis and identification of CRISPR-Cas systems in eight Fusobacterium strains
Fusobacterium genomes (F. nucleatum subsp. nucleatum ATCC 23726 (GCA_003019785.1), F. nucleatum subsp. nucleatum ATCC 25586 (GCA_003019295.1), F. necrophorum subsp. funduliforme 1_1_36S (GCA_003019715.1), F. varium 27725 (GCA_003019655.1), F. ulcerans 49185 (GCA_003019675.1), F. mortiferum 9817 (GCA_003019315.1), F. gonidiaformans 25563 (GCA_003019695.1), and F. periodonticum 2_1_31 (GCA_003019755.1)) were the source of all sequences analyzed with the CRISPROne web server 13 and CRISPRCasFinder 14. Both tools were used to predict all CRISPR-Cas-associated elements as well as the repeat arrays.
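Because the analysis depends on retrieving exactly these eight assemblies, a quick sanity check of the accession list can catch formatting slips before any sequences are submitted to CRISPROne or CRISPRCasFinder. The following is a minimal sketch, not part of the original workflow: the accessions are written in the standard underscored GenBank assembly form, and the dictionary and helper names (`ASSEMBLIES`, `invalid_accessions`, `duplicated_accessions`) are hypothetical.

```python
import re
from collections import Counter

# Strain -> GenBank assembly accession, as listed in the text.
# Assumption: accessions follow the standard "GCA_" + 9 digits + ".version" form.
ASSEMBLIES = {
    "F. nucleatum subsp. nucleatum ATCC 23726": "GCA_003019785.1",
    "F. nucleatum subsp. nucleatum ATCC 25586": "GCA_003019295.1",
    "F. necrophorum subsp. funduliforme 1_1_36S": "GCA_003019715.1",
    "F. varium 27725": "GCA_003019655.1",
    "F. ulcerans 49185": "GCA_003019675.1",
    "F. mortiferum 9817": "GCA_003019315.1",
    "F. gonidiaformans 25563": "GCA_003019695.1",
    "F. periodonticum 2_1_31": "GCA_003019755.1",
}

# GCA_ = GenBank assembly, GCF_ = RefSeq assembly.
ACCESSION_RE = re.compile(r"^GC[AF]_\d{9}\.\d+$")


def invalid_accessions(assemblies):
    """Return accessions that do not match the expected assembly format."""
    return [acc for acc in assemblies.values() if not ACCESSION_RE.match(acc)]


def duplicated_accessions(assemblies):
    """Return accessions that are assigned to more than one strain."""
    counts = Counter(assemblies.values())
    return sorted(acc for acc, n in counts.items() if n > 1)


if __name__ == "__main__":
    print("genomes:", len(ASSEMBLIES))
    print("malformed:", invalid_accessions(ASSEMBLIES))
    print("duplicated:", duplicated_accessions(ASSEMBLIES))
```

An accession with a dropped underscore, such as `GCA003019785.1`, would be reported as malformed by this check.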
Bioinformatic analysis of CRISPR-Cas Class 2 systems and identification of effector domain
The genome sequence from F. necrophorum subsp. funduliforme 1_1_36S (GCA_003019715.1) was used to predict the open reading frames of the CRISPR-Cas systems. Open reading frames of 1337 amino acids for Cas9 and 1121 amino acids for Cas13c were identified using the web server pHMMER 15 and the stand-alone HMMER software package, respectively.
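To make the notion of an open reading frame concrete, the sketch below scans all six reading frames of a DNA string for the longest ATG-to-stop stretch and converts a protein length into the corresponding coding length. It is an illustration only: the ORFs reported here were identified with HMMER-based searches, not with this naive scan, and the function names (`longest_orf`, `orf_nt_length`) and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf(seq):
    """Return (strand, start, end, aa_length) for the longest ATG..stop ORF.

    Coordinates are 0-based, end-exclusive, and relative to the scanned
    strand. aa_length excludes the stop codon. Returns None if no complete
    ORF is found.
    """
    best = None
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    aa_len = (i - start) // 3
                    if best is None or aa_len > best[3]:
                        best = (strand, start, i + 3, aa_len)
                    start = None  # look for the next ORF in this frame
    return best


def orf_nt_length(aa_length):
    """Nucleotides spanned by an ORF of aa_length residues plus its stop codon."""
    return 3 * (aa_length + 1)


if __name__ == "__main__":
    toy = "CCATGAAATTTGGGTAACC"  # hypothetical toy sequence, not from the genome
    print(longest_orf(toy))
    print(orf_nt_length(1337), orf_nt_length(1121))
```

Under this arithmetic, a 1337-residue protein corresponds to a 4014-nucleotide ORF including the stop codon, and a 1121-residue protein to 3366 nucleotides.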
Phylogenetic tree construction
The full CRISPR-Cas phylogenetic analysis was performed with the micropan 16 package in RStudio, using only CRISPR-associated proteins to build the tree. The gene maps in Figure 1B were created with the gggenes 17 package in RStudio. Additional phylogenetic trees for Cas9 and Cas13c were created in Geneious 9.18 from the SMARTBLAST multiple sequence alignment and adapted in Affinity Designer.
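One common way to derive a tree from gene content restricted to a protein family is to score each genome by which proteins it carries, compute pairwise Jaccard distances, and cluster by average linkage. The sketch below illustrates that general idea in Python. It is an assumption about the style of analysis, not a reproduction of the micropan settings used here, and the genome names and presence/absence profiles are hypothetical toy data.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |intersection| / |union| for two sets of protein names."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage_tree(profiles):
    """Cluster genomes by average-linkage (UPGMA-style) on Jaccard distance.

    profiles maps genome name -> set of Cas protein names.
    Returns the tree topology as nested tuples of genome names.
    """
    # Each cluster key is a subtree (a name or nested tuple); the value
    # lists the genomes it contains.
    clusters = {name: [name] for name in profiles}

    def cluster_distance(c1, c2):
        pairs = [(x, y) for x in clusters[c1] for y in clusters[c2]]
        total = sum(jaccard_distance(profiles[x], profiles[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Merge the closest pair of clusters into a new internal node.
        a, b = min(combinations(clusters, 2), key=lambda p: cluster_distance(*p))
        members = clusters.pop(a) + clusters.pop(b)
        clusters[(a, b)] = members
    return next(iter(clusters))


# Hypothetical toy profiles; not the Cas content of any genome in this study.
TOY_PROFILES = {
    "genome_A": {"cas1", "cas2", "cas9"},
    "genome_B": {"cas1", "cas2", "cas9", "csn2"},
    "genome_C": {"cas13c"},
}

if __name__ == "__main__":
    print(average_linkage_tree(TOY_PROFILES))
```

In the toy data, genome_A and genome_B share three of four proteins and are joined first, while genome_C, which shares none, attaches last.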
The structure of Cas9 from Fusobacterium necrophorum subsp. funduliforme 1_1_36S was predicted with the web-based suite Phyre2 18. CRISPR RNA structure was predicted using NUPACK 19.
Fusobacterium nucleatum has recently received significant attention for its strong connection with the acceleration and severity of multiple cancers (e.g., colorectal, pancreatic, esophageal). However, our understanding of the molecular mechanisms that drive infection by this ‘oncomicrobe’ has been hindered by a lack of universal genetic tools. Herein we report a global bioinformatic identification and characterization of multiple Fusobacterium CRISPR-Cas adaptive immune systems, including Cas13c, and a detailed description of the proteins, spacer/repeat loci, trans-activating CRISPR RNA (tracrRNA), and CRISPR RNA (crRNA) from a Type II-A CRISPR-Cas9 system. Since most Fusobacterium species are currently genetically intractable, this CRISPR-Cas bioinformatic roadmap could be used to build new genome editing and transcriptional tuning tools to characterize an increasingly important genus of human opportunistic pathogens connected to the onset, progression, and severity of cancer.
Competing Interest Statement
The authors have declared no competing interest.