[…] er ml of the analyzed sample, which is typical for the surface ocean (), and accounting for the 10× dilution of the sample prior to cell sorting, we estimate the frequency of free viral particles being in cells' shade to be less than 1 in 2500. This indicates a high probability that the viruses detected in our study sample of 58 cells were either inside or attached to the analyzed cells during cell sorting.

Because multiple displacement amplification (MDA) amplifies only dsDNA and ssDNA, our study did not target RNA viruses. To identify SAG contigs originating from DNA viruses, we used a combination of five criteria, listed below.

Gene prediction of sequenced and assembled SAGs was performed using Prodigal (). The translated protein sequences were then used as queries in BLASTp () searches (e-value <0.001, maximum of 10 hits) of the GenBank nr database (updated 12 July 2013). We identified homologous sequences whose descriptions contained words indicative of viral genes (*phage, *virus, virion, prophage, terminase, capsid, head, tail, fiber, baseplate, portal, lysis, structural, T4, lambda, mu, lambdoid, podo*, myovir*, siphovir*, integrase, transposase). Query sequences homologous to hypothetical proteins were also identified, because viral genomes are generally enriched in them. We also searched for tRNA genes, which are common sites for prophage integration into host genomes, using tRNAscan ().

In bacterial genomes, a GC skew is associated with the origin of replication (). GC skew can also be associated with the insertion of foreign DNA, including prophages (). Local anomalies in GC content and codon usage may also aid in the detection of prophages and other laterally acquired genetic elements within bacterial genomes (; ). For each contig, we calculated GC content and GC skew with custom scripts using a sliding window of 1600 bp. Tetramer frequencies have been used to detect contaminating sequences in SAG assemblies (; ; ).
Here, we extracted tetramer frequencies using a sliding window of 1600 bp and 200 bp step size to have a minimum of three windows for each conti […]
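
The sliding-window calculations described above can be illustrated with a minimal sketch. This is not the authors' custom script. The function names are hypothetical, and the sketch rests on several assumptions: the 200 bp step stated for tetramers is reused for GC content and GC skew, only the given strand is counted (no reverse-complement merging), and tetramers containing ambiguous bases are ignored.

```python
from collections import Counter
from itertools import product

WINDOW = 1600  # bp, window size stated in the text
STEP = 200     # bp, step stated for tetramers; reused for GC metrics as an assumption

# All 256 possible tetramers over the unambiguous DNA alphabet.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def sliding_windows(seq, window=WINDOW, step=STEP):
    """Yield (start, subsequence) for every full-length window of the contig."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases A, C, G and T."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    total = g + c + seq.count("A") + seq.count("T")
    return (g + c) / total if total else 0.0


def gc_skew(seq):
    """GC skew, (G - C) / (G + C); 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def tetramer_frequencies(seq):
    """Relative frequency of each overlapping tetramer, in TETRAMERS order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in TETRAMERS)  # skips tetramers with N etc.
    return [counts[t] / total if total else 0.0 for t in TETRAMERS]


def profile_contig(seq):
    """Per-window GC content, GC skew and tetramer frequency vector."""
    return [
        {
            "start": start,
            "gc": gc_content(win),
            "skew": gc_skew(win),
            "tetra": tetramer_frequencies(win),
        }
        for start, win in sliding_windows(seq)
    ]
```

With these settings a contig needs at least 2000 bp to yield three windows (starts at 0, 200 and 400), which matches the stated aim of a minimum of three windows per contig.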

Pipeline specifications

Software tools: Prodigal, BLASTp, tRNAscan-SE
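
The keyword screen of BLASTp hit descriptions can be sketched as follows. This is an illustration under assumptions, not the pipeline's actual code. The term list is taken from the text, but the matching rules (case-insensitive, whole-word matching, `*` read as "any run of word characters") and the names `term_to_regex` and `classify_hit` are hypothetical.

```python
import re

# Terms listed in the text; "*" marks a wildcard prefix or suffix.
VIRAL_TERMS = [
    "*phage", "*virus", "virion", "prophage", "terminase", "capsid", "head",
    "tail", "fiber", "baseplate", "portal", "lysis", "structural", "T4",
    "lambda", "mu", "lambdoid", "podo*", "myovir*", "siphovir*", "integrase",
    "transposase",
]


def term_to_regex(term):
    """Turn one search term into a whole-word regex (assumed matching rule)."""
    body = re.escape(term).replace(r"\*", r"\w*")
    return rf"\b{body}\b"


VIRAL_RE = re.compile("|".join(term_to_regex(t) for t in VIRAL_TERMS), re.IGNORECASE)
HYPOTHETICAL_RE = re.compile(r"\bhypothetical protein\b", re.IGNORECASE)


def classify_hit(description):
    """Label a hit description as 'viral', 'hypothetical' or 'other'."""
    if VIRAL_RE.search(description):
        return "viral"
    if HYPOTHETICAL_RE.search(description):
        return "hypothetical"
    return "other"


# Made-up example descriptions, for illustration only.
for desc in ("putative bacteriophage tail fiber protein",
             "hypothetical protein",
             "methylmalonyl-CoA mutase"):
    print(desc, "->", classify_hit(desc))
```

Whole-word matching keeps short terms such as "mu" from matching inside unrelated words such as "mutase". Broad terms such as "structural" or "head" will still produce false positives, so a screen like this only nominates candidates for the remaining criteria.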