For each subject evaluated, a database of spacer groups
was generated, and databases were compared to determine shared spacer groups and to create heatmaps using Java Treeview [43]. Spacer heat matrices were created using Microsoft Excel 2007 (Microsoft Corp., Redmond, WA). Beta diversity was determined using binary Sorensen distances, which were used as input for principal coordinates analysis using QIIME [44]. Spacers from each subject were subjected to BLASTN [34] analysis against the NCBI non-redundant database. Hits were considered significant at bit scores ≥45, which roughly corresponds to 2 nucleotide differences over the length of a 30 nucleotide spacer. The number of BLAST homologues was then normalized for each subject, and heatmaps were created using Java Treeview [43]. Spacers were also queried against the loci present in the CRISPR Database [38] or other specified metagenomic datasets, and only spacers that were identical or had a single mismatch over the entire length of the spacer were considered matches. CRISPR spacers for each subject were used to search a database of the virome reads for matches from all viromes combined, and the number of spacer matches per virome read was used to create
heatmaps. The heatmaps were normalized by the total number of spacer matches per virome read, and were generated using Java Treeview [43]. Rarefaction analysis was performed based on spacer group richness estimates of 10,000 iterations using EcoSim [45]. CRISPR loci were reassembled from reads that had a minimum of 2 full spacer sequences flanked by
full-length repeat motifs. Each locus was reassembled based on matching adjacent spacers; reads were assembled into loci only if their adjacent spacers were present in the same combination in at least 75% of the reads assessed.

Isolation and analysis of viromes

Saliva from human subjects was filtered sequentially through 0.45 μm and 0.2 μm filters to remove cellular debris, and the remaining fraction was purified on a cesium chloride gradient as previously described [8]. Only the fraction at the density of most known viruses [46] was retained; it was then further purified on Amicon YM-100 protein purification columns (Millipore, Inc., Billerica, MA) and treated with DNase I, followed by lysis and DNA purification using the Qiagen UltraSens virus kit (Qiagen, Valencia, CA). The resulting DNA was amplified using GenomiPhi V2 MDA amplification (GE Healthcare, Pittsburgh, PA), fragmented to roughly 100 to 200 bp using a Bioruptor (Diagenode, Denville, NJ), constructed into libraries using the Ion Plus Fragment Library Kit according to the manufacturer's instructions, and sequenced using 316 chips on an Ion Torrent PGM (Life Technologies, Grand Island, NY) [36], producing an average read length of approximately 100 bp for each sample. Each read was trimmed according to modified Phred scores of 0.5 using CLC Genomics Workbench 4.