From the growth temperature database TEMPURA and the Genome Taxonomy Database,22,23 we retrieved 682 bacteria and 156 archaea with both growth temperatures and phylogenetic information. The genome sequences were downloaded from ftp://ftp.ncbi.nlm.nih.gov/genomes/. We annotated their CRISPR-Cas systems using the CRISPRCasFinder v1.3 program.24 The raw data for these 682 bacteria and 156 archaea are provided in Table S1.
Following Couvin et al.,24 the annotated CRISPR arrays were classified into four evidence levels, 1 to 4. Arrays with evidence levels 3 and 4 are highly likely candidates, whereas those with evidence levels 1 and 2 are potentially invalid. Therefore, only arrays with evidence levels 3 and 4 were counted when calculating CRISPR array abundance. In the main analysis, putative CRISPR arrays with evidence levels 1 and 2 were counted as zero; these results are presented in the main text and Table S2. We also repeated the analyses by treating the putative arrays with evidence levels 1 and 2 as controversial and discarding the species that had only such arrays. That is, these species were excluded from both the numerator and the denominator in the calculation of CRISPR-Cas abundance. The results were similar (Table S3) and supported the same conclusion.
The CRISPR array abundance of a bacterial or archaeal group was defined as the total number of CRISPR arrays annotated in the genomes of the group divided by the number of genomes in the group. The abundances of CRISPR spacers, cas genes, and cas gene clusters were defined similarly.
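The abundance definition and the two treatments of low-evidence arrays can be illustrated with a minimal Python sketch. It is not the code used in the study. The `Genome` record, the function names, and the toy genomes are hypothetical, and the sketch assumes that a species carrying both high- and low-evidence arrays is retained in the replicate analysis with only its level 3 and 4 arrays counted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Genome:
    """Hypothetical record: one evidence level (1-4) per annotated CRISPR array."""
    name: str
    array_evidence_levels: List[int]


def count_confident(genome: Genome, min_level: int = 3) -> int:
    """Number of arrays with evidence level 3 or 4."""
    return sum(1 for level in genome.array_evidence_levels if level >= min_level)


def abundance_zeroed(genomes: List[Genome]) -> float:
    """Main scheme: level 1-2 arrays count as zero; every genome stays in the denominator."""
    if not genomes:
        raise ValueError("group contains no genomes")
    return sum(count_confident(g) for g in genomes) / len(genomes)


def abundance_excluding_ambiguous(genomes: List[Genome]) -> float:
    """Replicate scheme: genomes whose only arrays are level 1-2 leave both
    the numerator and the denominator."""
    kept = [
        g for g in genomes
        if not g.array_evidence_levels or count_confident(g) > 0
    ]
    if not kept:
        raise ValueError("no genomes left after exclusion")
    return sum(count_confident(g) for g in kept) / len(kept)


# Toy group (made-up values, for illustration only).
genomes = [
    Genome("A", [4, 3, 1]),  # two confident arrays, one low-evidence array
    Genome("B", [1, 2]),     # only low-evidence arrays
    Genome("C", []),         # no arrays at all
    Genome("D", [4]),        # one confident array
]

print(abundance_zeroed(genomes))               # 0.75 (3 arrays / 4 genomes)
print(abundance_excluding_ambiguous(genomes))  # 1.0  (3 arrays / 3 genomes)
```

In this toy group, genome B lowers the abundance under the main scheme because it stays in the denominator, whereas the replicate scheme removes it entirely.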
The phylogenetic signals (λ) of CRISPR array abundance, CRISPR spacer abundance, cas gene abundance, cas gene cluster abundance, and growth temperature were estimated using the phylosig function of the R (version 4.0.3) package phytools (version 0.7-70).25 Phylogenetic generalized least squares (PGLS) regression was performed using the R (version 4.0.3) package phylolm (version 2.6.2).26 Pagel’s lambda model was applied in the analyses.
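To make the role of Pagel's λ in PGLS concrete, the following NumPy sketch fits a generalized least squares regression under a λ-transformed phylogenetic covariance matrix. It is an illustration under stated assumptions, not the phylolm implementation. The function names, the four-tip covariance matrix, and the trait values are hypothetical, and λ is fixed by hand here, whereas in practice it is typically estimated from the data.

```python
import numpy as np


def lambda_transform(C, lam):
    """Pagel's lambda: multiply off-diagonal covariances by lam, keep the diagonal."""
    C = np.asarray(C, dtype=float)
    V = lam * C
    np.fill_diagonal(V, np.diag(C))
    return V


def pgls_fit(y, x, C, lam):
    """GLS estimate of (intercept, slope) for y ~ x under covariance V(lam).

    beta = (X' V^-1 X)^-1 X' V^-1 y, computed with linear solves
    instead of an explicit matrix inverse.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(y), np.asarray(x, dtype=float)])
    V = lambda_transform(C, lam)
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    return np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)


# Toy phylogenetic covariance for four tips in two clades (made-up values).
C = np.array([
    [1.0, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
])
temperature = np.array([30.0, 37.0, 50.0, 70.0])  # hypothetical growth temperatures
abundance = 2.0 + 3.0 * temperature               # exactly linear toy response

intercept, slope = pgls_fit(abundance, temperature, C, lam=0.5)
print(intercept, slope)
```

With λ = 0 the transformed matrix is diagonal, so tips are treated as independent; with λ = 1 the covariance matrix is used unchanged.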
Article Title: Bacterial CRISPR-Cas Abundance Increases Precipitously at Around 45°C: Linking Antivirus Immunity to Grazing Risk
Although CRISPR-Cas systems provide adaptive immunity, they are present in only 40% of bacterial genomes. Here, we observed an abrupt transition in bacterial CRISPR-Cas abundance at around 45°C. Phylogenetic comparative analyses confirmed that the abundance correlates with growth temperature only in the temperature range around 45°C. Meanwhile, we noticed that the diversity of cellular predators declines precipitously in this temperature range. The grazing risk faced by bacteria decreases substantially at around 45°C and almost disappears above 60°C. Viral lysis would therefore become the dominant cause of bacterial mortality, and antivirus immunity would take higher priority. In temperature ranges where the abundance of cellular predators does not change with temperature, temperature would have negligible effects on CRISPR-Cas abundance. The hypothesis predicts that bacteria should also be rich in CRISPR-Cas systems if they live in other extreme conditions inaccessible to grazing predators.