MATERIALS AND METHODS

Model system: overview of the experimental design, sampling scheme, and sequence analysis pipeline.
Streptococcus thermophilus strain DGCC7710 and a co-culture of DGCC7710 with phage 2972 were grown, incubated, transferred, and stored as described in reference 13. We used the procedures from that study to determine CFU and PFU and to detect phage 2766. The Streptococcus thermophilus host was cultured over time (MOI-0 series) or challenged with lytic phage 2972 at two different multiplicities of infection (host:phage ratios of 1:2 for the MOI-2 series, MOI-2A and MOI-2B, and 1:10 for the MOI-10 series) (Fig. 1). Samples were then transferred daily as a 1% (vol/vol) inoculum. Host counts (in CFU per milliliter) and phage counts (in PFU per milliliter) were monitored over time, and samples obtained at various time points were subjected to DNA extraction (50-ng minimum) and subsequently used as the templates for library preparation and deep sequencing using 100-bp paired-end sequencing with the Illumina system. (Time points were selected based on dynamically interesting fluctuations in the growth curves of the culture/co-culture experiments; they were distributed across time and covered a wide range of host:virus ratios.)

Metagenomic sequencing.
DNA samples were used as the templates for library preparation and deep sequencing using 100-bp paired-end sequencing with the Illumina system. A total of 165 Gb of sequence data was generated and subjected to bioinformatic analyses, including host CRISPR locus spacer detection, host and phage genome assembly, and comparison with the wild-type sequences of the host chromosome, phage 2972, and phage 2766. Finally, postassembly and comparative analyses were performed to identify SNPs, indels, and recombination events.

Computational analyses.
Total genomic DNA (bacterium and phage) extracted at various time points from all experimental series was sequenced using Illumina high-throughput technology at the WM Keck Center for Comparative and Functional Genomics (University of Illinois at Urbana-Champaign) (for details, see Table S1 in the supplemental material). Reads were quality filtered by trimming both ends using sickle (a program available on GitHub, a Web-based Git repository hosting service), and the sequence information was deposited with NCBI (see “Nucleotide sequence accession numbers,” below). Only paired reads were used in the assemblies. Assemblies were evaluated using idba_ud (27) with default parameters.

Phage genomes were reconstructed from each sample separately as follows. First, all reads that belonged to the genome of S. thermophilus DGCC7710 were removed based on alignment to the reference genome using Bowtie (28). The remaining reads were assembled using Velvet (29), with parameters adjusted to the expected coverage, considering the number of reads and the length of phage 2972. Only a subset of the reads was used when the expected coverage exceeded ~200×. For phage genome analysis, we used miniassembly procedures described previously (30).

SNPs were identified by using a program that takes read mapping information as input and computes base calling for every position on the target genome. C++ source code for the program is available at https://github.com/CK7/SNPs.

Spacer sequences were extracted using a custom Ruby script (included in SOM) that searched the full read set for each of the exact repeat sequences from each of the CRISPR loci, as well as their reverse complement sequences. We grouped spacers that shared 85% of their length and 85% sequence identity to avoid overrepresentation of spacer types due to sequencing errors.

Nucleotide sequence accession numbers.
The sequences identified in this work have been deposited with NCBI under Bioproject number PRJNA275232 and SRA main accession number SRP055779.
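The spacer extraction and grouping steps described under Computational analyses can be illustrated with a minimal Python sketch. This is not the Ruby script used in the study; it rests on several assumptions that the text does not specify: a spacer is taken to be the sequence between two consecutive exact copies of a repeat within a single read, identity is computed as 1 minus the edit distance divided by the longer spacer length, the length criterion is read as a shorter-to-longer length ratio of at least 0.85, and grouping is greedy against the first matching representative. All function names are hypothetical, and reads are assumed to be uppercase.

```python
from typing import Dict, List

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_spacers(read: str, repeat: str) -> List[str]:
    """Collect sequences flanked by two exact copies of `repeat`.

    Searching the reverse-complemented read for `repeat` is equivalent to
    searching the read for the reverse complement of `repeat`; it also
    returns every spacer in the same orientation as the repeat.
    """
    spacers = []
    for strand in (read, reverse_complement(read)):
        positions = []
        start = strand.find(repeat)
        while start != -1:
            positions.append(start)
            start = strand.find(repeat, start + 1)
        # Each pair of consecutive repeat hits delimits one candidate spacer.
        for left, right in zip(positions, positions[1:]):
            spacer = strand[left + len(repeat):right]
            if spacer:  # empty when two hits overlap or are adjacent
                spacers.append(spacer)
    return spacers


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def similar(a: str, b: str, min_len_frac: float = 0.85,
            min_identity: float = 0.85) -> bool:
    """Assumed reading of the 85% length / 85% identity criterion."""
    shorter, longer = sorted((len(a), len(b)))
    if longer == 0 or shorter / longer < min_len_frac:
        return False
    return 1 - edit_distance(a, b) / longer >= min_identity


def group_spacers(spacers: List[str]) -> Dict[str, int]:
    """Greedily group spacers; returns representative -> member count."""
    groups: Dict[str, int] = {}
    for spacer in spacers:
        for representative in groups:
            if similar(spacer, representative):
                groups[representative] += 1
                break
        else:
            groups[spacer] = 1
    return groups
```

In use, `extract_spacers` would be called once per read and per CRISPR repeat, and the pooled spacers passed to `group_spacers`. Greedy grouping depends on input order, so a different clustering strategy could assign borderline spacers differently; a fully palindromic repeat would also be reported on both strands and would need deduplication.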
Many bacteria rely on CRISPR-Cas systems to provide adaptive immunity against phages, predation by which can shape the ecology and functioning of microbial communities. To characterize the impact of CRISPR immunization on phage genome evolution, we performed long-term bacterium-phage (Streptococcus thermophilus-phage 2972) coevolution experiments. We found that in this species, CRISPR immunity drives fixation of single nucleotide polymorphisms that accumulate exclusively in phage genome regions targeted by CRISPR. Mutation rates in phage genomes greatly exceed those of the host. The presence of multiple phages increased phage persistence by enabling recombination-based formation of chimeric phage genomes in which sequences heavily targeted by CRISPR were replaced. Collectively, our results establish CRISPR-Cas adaptive immunity as a key driver of phage genome evolution under the conditions studied and highlight the importance of multiple coexisting phages for persistence in natural systems.