Computational protocol: Gene Network Visualization and Quantitative Synteny Analysis of more than 300 Marine T4-Like Phage Scaffolds from the GOS Metagenome

[…] Content was manually determined for each scaffold using the Basic Local Alignment Search Tool (BLAST) at the National Center for Biotechnology Information (NCBI) Web site, with an E-value cutoff of <10⁻⁴ against the nr database. The top hit was taken as the protein identity, and searches were occasionally restricted to viral genes for a more informative determination of origin and function. We ran BlastX (bacterial genetic code) directly on small scaffolds (generally <8 kb). Otherwise, open reading frames (ORFs) were predicted with the GeneMark heuristic approach, and the resulting ORF translations were searched with BlastP. The few sequences with obvious frameshifts (primarily single-base-pair sequencing errors) were corrected, whereas more ambiguous gene splits were left unchanged. Gp20 portal proteins from the Cyano-T4 genomes (NCBI) were queried against all GOS proteins through the CAMERA database, with an E-value cutoff of <10⁻⁴. The circular visualizations of Cyano-T4 genome coverage were generated using CGView. [...] The conserved marker proteins (gp20, gp23, gp43) of the Cyano-T4s (with T4 as the outgroup) were aligned using ClustalW within BioEdit v7.0.9.0. These alignments were used to construct Neighbor-Joining phylogenies with QuickTree, which applies the ClustalW distance calculation; default parameters were used (neither column rejection nor multiple-substitution correction), with 1,000 bootstrap replicates, as implemented on the Institut Pasteur Mobyle web portal. […]
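The top-hit rule described for the BLAST searches (keep the best hit per query, provided its E value is below 10⁻⁴) can be illustrated with a minimal sketch. The protocol states that the determination was done manually through the NCBI Web site, so this is not the authors' procedure. It assumes hits exported in the standard 12-column tabular BLAST format (E value in column 11, bit score in column 12), and the function name `top_hits` and all identifiers in the sample data are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-4  # cutoff stated in the protocol (<10^-4)


def top_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return {query_id: (subject_id, evalue, bitscore)} with the best hit per query.

    Assumes standard 12-column tabular BLAST output (an assumption, not
    something the protocol specifies). "Best" means highest bit score
    among hits whose E value is strictly below the cutoff.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= cutoff:
            continue  # hit does not pass the E-value cutoff
        current = best.get(query)
        if current is None or bitscore > current[2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical example rows: query, subject, then the remaining standard columns.
sample = "\n".join(
    "\t".join(fields)
    for fields in [
        ["orf1", "hitA", "45.0", "200", "110", "0", "1", "200", "1", "200", "1e-30", "150"],
        ["orf1", "hitB", "60.0", "200", "80", "0", "1", "200", "1", "200", "1e-50", "230"],
        ["orf2", "hitC", "25.0", "90", "67", "2", "5", "94", "10", "99", "0.5", "30"],
    ]
)

hits = top_hits(sample)
for query, (subject, evalue, bitscore) in hits.items():
    print(query, subject, evalue, bitscore)
```

In this sketch a query with no hit below the cutoff (here `orf2`) is simply absent from the result, which corresponds to leaving that ORF unassigned.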

