Computational protocol: Diversity and evolution of phycobilisomes in marine Synechococcus spp.: a comparative genomics study

Protocol publication

[…] Eleven genomes of marine Synechococcus spp. were used for this study. The WH8102 (NC_005070), CC9902 (NC_007513) and CC9605 (NC_007516) genomes were sequenced by the Joint Genome Institute; the CC9311 (NC_008319) genome by The Institute for Genomic Research (TIGR); the WH7803 (NC_009481) and RCC307 (NC_009482) genomes by Genoscope (Evry, France) at the request of a consortium of European scientists coordinated by F Partensky; and the RS9916 (NZ_AAUA00000000), RS9917 (NZ_AANP00000000), BL107 (NZ_AATZ00000000), WH7805 (NZ_AAOK00000000) and WH5701 (NZ_AANO00000000) genomes by the J Craig Venter Institute, in the framework of the Gordon and Betty Moore Foundation Marine Microbial Genome Sequencing Project, at the request of an international consortium coordinated by DJ Scanlan.

Gene families from the 11 marine Synechococcus genomes were delineated using BLAST with an e-value cutoff of 10^-12 and the TribeMCL algorithm. Families of orthologous genes located in the phycobilisome (PBS) region and/or involved in PBS biosynthesis or regulation were extracted and manually annotated. Non-modeled genes, missed by ORF-finding software, were added to the dataset. The corresponding protein sequences were aligned using ClustalW with default parameters, and their amino termini were corrected (that is, extended or shortened) if needed. [...]

Phylogenetic analyses were performed using a variable number of concatenated protein sequences depending on each phycobiliprotein type (see results), allowing the use of longer sequences to reduce the variance in the distance estimates. These sequences were automatically aligned using ClustalW. Alignments were then manually refined, and all gaps and highly variable regions (if any) were removed. Phylogenetic trees were generated using three different reconstruction methods: neighbor joining (NJ, with PHYLO_WIN), maximum likelihood (ML, with PHYML v2.4.4) and maximum parsimony (MP, with PHYLO_WIN).
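The concatenation and gap-removal steps described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the dictionary-based alignment format and the toy sequences are assumptions made for illustration, and the manual refinement and removal of highly variable regions mentioned in the protocol are not reproduced.

```python
def concatenate_alignments(alignments):
    """Join several per-protein alignments end to end, keyed by strain name.

    `alignments` is a list of dicts mapping strain name -> aligned sequence.
    Only strains present in every alignment are kept (an assumption of this
    sketch, so that all concatenated rows have the same length).
    """
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) > 1:
            raise ValueError("rows of an alignment must have equal length")
    shared = sorted(set.intersection(*(set(aln) for aln in alignments)))
    return {s: "".join(aln[s] for aln in alignments) for s in shared}


def drop_gapped_columns(alignment, gap_chars="-."):
    """Remove every column in which at least one sequence has a gap."""
    names = list(alignment)
    columns = zip(*(alignment[n] for n in names))
    kept = [col for col in columns if not any(c in gap_chars for c in col)]
    return {n: "".join(col[i] for col in kept) for i, n in enumerate(names)}


if __name__ == "__main__":
    # Toy aligned fragments, invented for illustration only.
    protein_a = {"WH8102": "MKT-PL", "CC9311": "MKTAPL", "WH7803": "MRTAPL"}
    protein_b = {"WH8102": "GAS", "CC9311": "G-S", "WH7803": "GAS"}

    concatenated = concatenate_alignments([protein_a, protein_b])
    cleaned = drop_gapped_columns(concatenated)
    for strain, seq in cleaned.items():
        print(strain, seq)
```

Dropping whole columns, rather than individual gap characters, keeps the remaining positions homologous across all strains, which distance, likelihood and parsimony methods all require.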
ML analyses were performed using the Jones-Taylor-Thornton (JTT) model, and the variability of substitution rates across sites and the proportion of invariable sites were estimated. Bootstrap values (1,000 replicates) were calculated for all three methods to estimate the relative confidence in monophyletic groups, and all were reported on the ML tree, which was used as the reference. Phylogenetic trees were edited using the MEGA4 software. […]
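As a rough illustration of what a bootstrap value measures, the sketch below resamples alignment columns with replacement and computes the percentage of replicate trees that recover a given group. It is a generic sketch, not the implementation used by PHYLO_WIN or PHYML: the function names, the representation of a tree as a set of clades (frozensets of strain names) and the seed are assumptions, and the tree-building step itself is left out.

```python
import random


def resample_columns(alignment, rng):
    """Build one pseudo-replicate by drawing columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    picks = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][i] for i in picks) for n in names}


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Yield `n_replicates` resampled alignments (1,000 in the protocol)."""
    rng = random.Random(seed)  # fixed seed, chosen here for reproducibility
    for _ in range(n_replicates):
        yield resample_columns(alignment, rng)


def clade_support(clade, replicate_clade_sets):
    """Percentage of replicate trees that contain `clade`.

    Each element of `replicate_clade_sets` is the set of clades
    (frozensets of strain names) found in one replicate tree.
    """
    hits = sum(1 for clades in replicate_clade_sets if clade in clades)
    return 100.0 * hits / len(replicate_clade_sets)


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    toy = {"WH8102": "MKTPLGS", "CC9311": "MKTPLGS", "WH7803": "MRTPLGS"}
    replicates = list(bootstrap_replicates(toy, n_replicates=5))
    print(len(replicates), "replicates of length", len(replicates[0]["WH8102"]))
```

In a full analysis, each replicate alignment would be passed to the tree-building program, and the resulting support values would be mapped onto the reference tree, as the protocol does with the ML tree.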

Pipeline specifications

Software tools: Clustal W, PhyML, MEGA
Application: Phylogenetics