Computational protocol: Bioinformatic analysis of beta carbonic anhydrase sequences from protozoans and metazoans

Protocol publication

[…] Novel β-CAs were identified by the presence of the highly conserved active-site sequence patterns Cys-Xaa-Asp-Xaa-Arg and His-Xaa-Xaa-Cys, which are also marked in Additional file : Figure S1. The alignment was visualized in Jalview. In total, 75 invertebrate β-CA sequences were retrieved from UniProt (http://www.uniprot.org/) for alignment analysis, and one bacterial sequence (Pelosinus fermentans) was included as an outgroup. All protein sequences were aligned using Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/). The sequences were manually curated to remove residues associated with an incorrect starting methionine. In total, 90 residues were removed from the N-terminal ends of the sequences with UniProt IDs D4NWE5_ADIVA, G0QPN9_ICHMG, D6WK56_TRICA, I7LWM1_TETTS and I7M0M0_TETTS. The modified protein sequences were then re-aligned. This protein alignment served as the template for codon alignment of the corresponding nucleotide sequences using the PAL2NAL program (http://www.bork.embl.de/pal2nal/). [...]

The phylogenetic analysis was computed using MrBayes v3.2. After 8 million generations using the GTR codon substitution model, with all other parameters at their defaults, the standard deviation of split frequencies was 1.39 × 10⁻³. The final output tree was produced using the 50% majority-rule consensus. FigTree v1.4.0 (http://tree.bio.ed.ac.uk/software/figtree/) was used to visualize the phylogenetic tree, with the Pelosinus fermentans sequence set as the outgroup. Additional trees were constructed for comparison using maximum likelihood (PhyML), UPGMA, and neighbor-joining methods within Geneious version 7.0.5 from Biomatters (Auckland, New Zealand) (http://www.geneious.com/). [...]

The subcellular localization of each identified invertebrate β-CA was predicted using the TargetP webserver (http://www.cbs.dtu.dk/services/TargetP/).
TargetP is built from two layers of neural networks: the first layer contains one dedicated network for each type of pre-sequence [cTP (chloroplast transit peptide), mTP (mitochondrial targeting peptide), or SP (secretory signal peptide)], and the second is an integrating network that outputs the actual prediction (cTP, mTP, SP, other). It discriminates between cTPs, mTPs, and SPs with sensitivities and specificities higher than those obtained with other available subcellular localization predictors. […]
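
Two of the sequence-handling steps in the excerpt above, the active-site motif screen and the threading of nucleotide sequences onto the protein alignment, can be illustrated with a minimal Python sketch. It is not the authors' code and does not reproduce PAL2NAL: the function names and the toy sequences are hypothetical, the motif check only tests that both patterns occur somewhere in the sequence, and the codon threading assumes the coding sequence starts in frame with the first aligned residue and does not verify the translation.

```python
import re

# Active-site signatures quoted in the protocol; "Xaa" stands for any residue.
MOTIF_1 = re.compile(r"C.D.R")  # Cys-Xaa-Asp-Xaa-Arg
MOTIF_2 = re.compile(r"H..C")   # His-Xaa-Xaa-Cys


def has_beta_ca_motifs(protein: str) -> bool:
    """Return True if both active-site patterns occur in the sequence."""
    seq = protein.upper()
    return bool(MOTIF_1.search(seq)) and bool(MOTIF_2.search(seq))


def thread_codons(aligned_protein: str, cds: str) -> str:
    """Map an ungapped coding sequence onto one gapped protein alignment row.

    Each residue consumes the next three nucleotides; each gap character
    "-" becomes "---". Simplifying assumption: the coding sequence is in
    frame and matches the protein (this is not checked here).
    """
    n_residues = sum(1 for c in aligned_protein if c != "-")
    if len(cds) < 3 * n_residues:
        raise ValueError("coding sequence is shorter than the protein requires")
    out = []
    pos = 0
    for c in aligned_protein:
        if c == "-":
            out.append("---")
        else:
            out.append(cds[pos:pos + 3])
            pos += 3
    return "".join(out)


if __name__ == "__main__":
    # Toy, made-up sequences used only to exercise the functions.
    print(has_beta_ca_motifs("MKCADSRLLHAACGG"))
    print(thread_codons("M-K", "ATGAAA"))
```

If N-terminal residues are trimmed from a protein before re-alignment, as described for the five curated sequences, the matching coding sequence would need three nucleotides removed per trimmed residue before it is threaded in this way.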

Pipeline specifications

Software tools: Jalview, Clustal Omega, PAL2NAL, FigTree, PhyML, Geneious, TargetP
Applications: Phylogenetics, Protein sequence analysis, Nucleotide sequence alignment
Organisms: Saccharomyces cerevisiae, Caenorhabditis elegans
Diseases: Infection
Chemicals: Amino Acids