Computational protocol: A Lack of Parasitic Reduction in the Obligate Parasitic Green Alga Helicosporidium

[…] KEGG metabolic pathway maps for the green algae Chlamydomonas reinhardtii, Volvox carteri, Ostreococcus tauri and Ostreococcus lucimarinus were retrieved from the KEGG pathway databases. The proteins were sorted by pathway and then used as queries for homology searches against the Helicosporidium, Chlorella and Coccomyxa proteomic, genomic and transcriptomic datasets (the Chlorella and Coccomyxa data were retrieved from the JGI website). BLASTP and TBLASTN searches were performed using E-value thresholds of 1E-10 and 1E-05, respectively. Genes not found in any of an organism's three datasets (proteomic, genomic and transcriptomic) were considered absent from that organism.
Network analyses were performed as previously described. Specifically, an edge was drawn between every pair of genes whose reciprocal BLASTP comparisons met all of the following conditions: E-value <1E-10, minimal hit identity >20, identical residues in the match covering at least 20% of the shorter gene's length, and hit length >20 amino acids. The network was then filtered to retain genes underrepresented in Helicosporidium relative to Coccomyxa and Chlorella. Functional annotations for the genes in each connected component (GenBank, KOG, KEGG, InterPro and Pfam) were used to characterize that component by its inferred biological function.
Plastid-targeted GreenCut2 proteins were extracted from the corresponding Chlamydomonas reinhardtii (version 3.1) and Arabidopsis thaliana protein catalogs and converted to custom BLAST databases with MAKEBLASTDB from the NCBI BLAST package. Helicosporidium, Chlorella and Coccomyxa were searched independently against both GreenCut2 databases with BLASTP (proteins) and TBLASTN (genome and transcriptome) using E-value thresholds of 1E-10 and 1E-05, respectively. [...]
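The edge criteria used for the network analysis can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that "reciprocal" means both directions of a pairwise BLASTP comparison must pass every threshold, and that the identity cutoff of 20 is a percentage. The dictionary keys (`evalue`, `pident`, `nident`, `length`) mirror common BLAST tabular field names, and the function names are hypothetical.

```python
from collections import defaultdict

# Thresholds taken from the protocol text; the "%" reading of the
# identity cutoff is an assumption.
E_VALUE_MAX = 1e-10
MIN_PCT_IDENTITY = 20.0
MIN_IDENTICAL_FRACTION = 0.20
MIN_HIT_LENGTH = 20


def passes_edge_criteria(hit, len_a, len_b):
    """Return True if one BLASTP hit satisfies all four conditions."""
    shortest = min(len_a, len_b)
    return (
        hit["evalue"] < E_VALUE_MAX
        and hit["pident"] > MIN_PCT_IDENTITY
        and hit["nident"] >= MIN_IDENTICAL_FRACTION * shortest
        and hit["length"] > MIN_HIT_LENGTH
    )


def build_edges(hits, lengths):
    """hits maps (query, subject) -> hit dict; lengths maps gene -> length in aa.

    An edge is kept only when both directions exist and both pass
    (assumed interpretation of "reciprocal").
    """
    edges = set()
    for (query, subject), forward in hits.items():
        if query == subject:
            continue  # ignore self-hits
        reverse = hits.get((subject, query))
        if reverse is None:
            continue
        len_q, len_s = lengths[query], lengths[subject]
        if passes_edge_criteria(forward, len_q, len_s) and passes_edge_criteria(
            reverse, len_q, len_s
        ):
            edges.add(frozenset((query, subject)))
    return edges


def connected_components(nodes, edges):
    """Group genes into connected components with a depth-first traversal."""
    adjacency = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = []
    for node in nodes:
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            component.add(current)
            stack.extend(adjacency[current] - seen)
        components.append(component)
    return components
```

Each resulting component could then be characterized by the functional annotations of its member genes, as the protocol describes. Genes without any qualifying edge come out as single-member components.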
Putative glycosyl hydrolases identified in the Helicosporidium genome were annotated for catalytic and chitin-binding domains using SMART 7, and endo-proteolytic sites often located within developmental insect chitinases were identified with ePESTfind. The glycosyl hydrolase catalytic domains were annotated manually for the presence and orientation of key amino acid motifs. Secretory signal motifs were searched for with TargetP 1.1 and PredAlgo. [...] Amino acid sequences retrieved from GenBank were aligned with the L-INS-i algorithm from MAFFT 7.029b. Phylogenetic models were selected with ProtTest 3.2. Maximum Likelihood phylogenetic reconstructions were performed with PhyML 3.0 under the LG+Γ4+I model of amino acid substitution. […]
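As a rough illustration of the alignment and tree-building steps, the sketch below assembles command lines for MAFFT and PhyML in Python. It is not taken from the authors' pipeline. The file names are hypothetical, and the protocol does not say whether the gamma shape parameter and the proportion of invariable sites were estimated or fixed, so the `e` (estimate) values are an assumption.

```python
import shlex


def mafft_linsi_command(fasta_in):
    """MAFFT's L-INS-i strategy: --localpair with --maxiterate 1000.

    MAFFT writes the alignment to standard output.
    """
    return ["mafft", "--localpair", "--maxiterate", "1000", fasta_in]


def phyml_lg_gamma4_inv_command(phylip_in):
    """PhyML call for LG+Γ4+I on amino acid data.

    -c 4  : four gamma rate categories
    -a e  : estimate the gamma shape parameter (assumption)
    -v e  : estimate the proportion of invariable sites (assumption)
    """
    return [
        "phyml",
        "-i", phylip_in,
        "-d", "aa",
        "-m", "LG",
        "-c", "4",
        "-a", "e",
        "-v", "e",
    ]


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    print(shlex.join(mafft_linsi_command("chitinases.fasta")))
    print(shlex.join(phyml_lg_gamma4_inv_command("chitinases.phy")))
```

PhyML expects PHYLIP-formatted input, so the FASTA alignment produced by MAFFT would need converting first. The lists can be passed to `subprocess.run` once the tools are installed.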

Pipeline specifications

Software tools BLASTP, TBLASTN, InterPro, BLASTN, EMBOSS, TargetP, PredAlgo, MAFFT, ProtTest, PhyML
Diseases Parasitic Diseases, Substance-Related Disorders