Computational protocol: Positive selection drives the evolution of endocrine regulatory bone morphogenetic protein system in mammals

Protocol publication

[…] The nucleotide and amino acid sequences of BMP genes were retrieved from GenBank. Complete protein sequences were aligned with MUSCLE [], as implemented in MEGA6.0, and the amino acid alignments were back-translated to nucleotides for selection analysis []. […] To identify codons under positive selection, only BMPs and GDF9 that were represented in at least 20 species were assessed, as suggested by Poon and collaborators []. Hence, BMP6 and BMP7 were excluded from the analyses because their multiple sequence alignments were not reliable and were likely to interfere with the detection of selection, leading to false-positive results []. Phylogenetic analysis was based on the accepted mammalian phylogeny [], from which an unrooted tree of the aligned species was generated. Branch lengths were estimated on this tree topology with the codon model in the PAML package. The ω ratios (dN/dS) were compared to identify selection pressure on particular codons using maximum-likelihood methods implemented in MEGA6 [] and PAML version 4 []. Nested site models were compared using likelihood ratio tests. M7 (the null model) assumes a β distribution of ω restricted to the interval (0, 1). M8 (the alternative model) adds to the β distribution an extra site class whose ω is estimated from the data and may be greater than one. Additionally, amino acid sites under selection were inferred with Bayes' theorem by estimating the posterior probability for each site [, ]. Three-dimensional structure prediction of BMPs and GDF9 was carried out using an ab initio modeling approach []. The primary sequences of BMP2 (ACV32596.1), BMP4 (AAH20546.1), BMP15 (AAI17265.1) and GDF9 (AAH96229.1) were submitted to I-TASSER [] to predict suitable structures. All predicted models were validated with the MolProbity server []. To test for steric hindrance between amino acid residues, Ramachandran values were calculated with the Ramachandran Plot2.0 tool [].
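The M7 versus M8 comparison described above can be illustrated with a minimal sketch of the likelihood ratio test. It assumes the usual setup in which M8 has two more free parameters than M7, so the test statistic is compared with a χ² distribution with 2 degrees of freedom. The log-likelihood values below are hypothetical placeholders, not results from the study, and the function name is illustrative only.

```python
import math


def lrt_m7_vs_m8(lnl_m7, lnl_m8):
    """Likelihood ratio test of M8 (alternative) against M7 (null).

    Assumes df = 2 (M8 adds two free parameters to M7).
    Returns the test statistic and its chi-square p-value.
    """
    # Twice the log-likelihood gain; clamp small negative values that can
    # arise from optimizer noise when the two fits are nearly identical.
    stat = max(0.0, 2.0 * (lnl_m8 - lnl_m7))
    # For df = 2 the chi-square survival function is exactly exp(-x / 2),
    # so no statistics library is needed.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical log-likelihoods, as would be read from two codeml runs.
stat, p_value = lrt_m7_vs_m8(lnl_m7=-5230.4, lnl_m8=-5221.1)
print(f"2*delta_lnL = {stat:.2f}, p = {p_value:.2e}")
```

A small p-value here would favour M8, that is, a class of sites with ω above one. Which sites those are is a separate question addressed by the per-site posterior probabilities.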
UCSF Chimera [] was used for visualization and geometry optimization of the predicted proteins. The ConSurf server was used to estimate the evolutionary conservation of amino acid sites in each protein based on the phylogenetic relationships among sequences []. For a more traditional approach, and as used previously [], only positively selected sites detected by more than one maximum-likelihood approach were considered. The statistical approaches used in this study can detect positive selection but cannot provide information about its mechanism. Therefore, pinpointing the location of positively selected amino acid residues may be helpful for follow-up laboratory examination. For the coding sites subject to positive selection, we used the COSMIC (Catalogue of Somatic Mutations in Cancer) v82 database (released 03-AUG-17) to explore the impact of somatic substitution mutations at these sites in human cancer []. The COSMIC database includes hundreds of thousands of human cancer-associated somatic mutations, classified by tumor type and disease. The pathogenicity of amino acid point substitutions was predicted with Functional Analysis through Hidden Markov Models (FATHMM) []. […]
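The rule of keeping only sites detected by more than one maximum-likelihood approach can be sketched as a simple vote count. This is an illustration under stated assumptions: the method labels and codon positions are hypothetical, and the excerpt does not specify how the per-method site lists were actually merged.

```python
from collections import Counter


def consensus_sites(calls_by_method, min_methods=2):
    """Return sorted codon positions reported by at least `min_methods` methods."""
    counts = Counter()
    for sites in calls_by_method.values():
        # Use a set so a site listed twice by one method counts only once.
        counts.update(set(sites))
    return sorted(site for site, n in counts.items() if n >= min_methods)


# Hypothetical per-method calls (codon positions), for illustration only.
calls = {
    "method_A": [12, 45, 103],
    "method_B": [45, 103, 250],
    "method_C": [45, 77],
}
consensus = consensus_sites(calls)
print(consensus)
```

Raising `min_methods` makes the filter stricter, at the cost of discarding sites that only one or two methods support.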

Pipeline specifications

Software tools: MUSCLE, MEGA, PAML, MolProbity, UCSF Chimera, ConSurf, FATHMM
Databases: COSMIC
Applications: Phylogenetics, Protein structure analysis, Nucleotide sequence alignment
Organisms: Homo sapiens
Diseases: Neoplasms, Prostatic Neoplasms
Chemicals: Amino Acids, Bone Morphogenetic Protein 15