Computational protocol: Independent Evolution of Strychnine Recognition by Bitter Taste Receptor Subtypes

Protocol publication

[…] The human T2R10 and T2R46 sequences (UniProtKB accessions Q9NYW0 and P59540, respectively) were used as queries to retrieve similar T2R sequences from different species via NCBI PSI-BLAST. Two iterations were run for each query against the non-redundant protein sequences database (retrieved during January–June 2017). Because protein nomenclature is not consistent across orthologs and the total number of bitter taste receptors varies between species, all sequences with at least 55% amino acid sequence identity to the query sequences (E-values after the first iteration: 6e-88 for hT2R10 and 5e-84 for hT2R46) were initially retrieved; fragmentary sequences of 265 residues or fewer were removed to avoid excessive gap opening in ancestral sequence reconstruction. This resulted in a total of 87 reference sequences (RS) (17 sequences for hT2R10 and 70 sequences for hT2R46, available in FASTA format in Supplementary ).

The T2R10 clade was then supplemented with the following nine T2R10s obtained via PCR in a recently published study: wolf (Clu), maned wolf (Cbr), African hunting dog (Lpi), cheetah (Aju), leopard, Tibetan fox (Vfe), Corsac fox (Vco), red fox (Vvu), and Fennec fox (Vze) (Shang et al., ; Supplementary ). All nine of these T2Rs have over 74% protein sequence identity with hT2R10. From the same study, five T2R43 sequences showed 60% identity with hT2R46 over the full protein sequence. However, because they are distantly related to hT2R46 in the phylogenetic tree (Supplementary ) and most of the key residues for strychnine recognition are altered, these sequences were not added to the final hT2R46 clade.

For the T2R46 clade, the initial phylogeny (details in the next section) constructed with the sequences obtained through PSI-BLAST showed only four sequences within the subgroup of hT2R46 among the 70 sequences sampled from more than 10 primates.
Although diversification could result in lower conservation, this could also be attributed to the deactivation of T2R46 in certain primates under lineage-specific evolutionary constraints (Go et al., ; Risso et al., ). We therefore searched via BLASTn for pseudogene nucleotide sequences most similar (over 80% identity) to the hT2R46 gene. We also looked in the HGNC database (retrieved in April 2017: http://www.genenames.org/cgi-bin/genefamilies/set/1162) for human T2R pseudogenes (hT2R12p, 15p, 18p, 63p, 64p, and 67p) located on the same chromosome as hT2R10 and hT2R46 (12p13.2), and added them to the T2R46 clade. Eleven pseudogenes were added after functional restoration and translation into protein sequences (Supplementary , “Restored Pseudogenes” & Supplementary “Pseudogene restoration”) (Martin et al., ). The resulting tree (Supplementary ; see details concerning tree construction below) showed that these pseudogenes were not close homologs of hT2R46 per se but of other hT2Rs that have high sequence identity with hT2R46, such as hT2R48 and hT2R49. The results presented in the main text are highly similar to those obtained after the addition of pseudogenes, except for a few variations in the predicted common ancestors with varied confidence levels (e.g., A2687.42 as discussed before) (the phylogenetic tree with predicted ancestral sequences using the 11 pseudogenes is given in Supplementary ).

Lastly, a set of pre-aligned human T2Rs was added to the data set described above. The alignment was manually adjusted based on existing mutagenesis data (Brockhoff et al., ; Meyerhof et al., ; Born et al., ), and the Ballesteros–Weinstein (BW) numbers were assigned accordingly (detailed in “Key positions analysis”).
We began by using all 25 hT2Rs, then reduced the set to 9 hT2Rs including the reference sequences (hT2R43-50 and hT2R10), keeping only those that are at least 70% identical to either hT2R10 or hT2R46, as the sequence identities of the remaining hT2Rs are lower than 45% (Supplementary , “Aligned hT2Rs”).

In total, 105 hT2R10- or hT2R46-related reference sequences were retrieved, of which 27 were hT2R10-related and 78 hT2R46-related. The T2R10 and T2R46 nucleotide sequences from the Neanderthal genome database (http://neandertal.ensemblgenomes.org/index.html) were also retrieved but not used for phylogenetic tree construction because they are over 99.99% identical to hT2R10 and hT2R46, respectively.

Different outgroups (OGs) were sampled to assess their effects on the tree topology and predicted ancestral sequences. As a significant component of phylogenetic analysis, OG selection can influence branch order and length, clade monophyly, and divergence rates (Puslednik and Serb, ). Increased OG sampling has been shown to improve the stability of tree topology (Nixon and Carpenter, ). The V1R proteins (members of the type 1 vomeronasal pheromone receptor gene family) have commonly served as OGs in previous T2R-related studies (Meyerhof and Korsching, ; Li and Zhang, ). We selected the V1Rs from apes (Ptr, Mmu), mouse (Mus musculus, Mmus), rat (Rattus norvegicus, Rno), frog (Xtr), alligator (Alligator mississippiensis, Ami), and fish (Danio rerio, Dre), as well as the T2Rs from frog, alligator, and fish, which are no more similar to the query sequences than the V1Rs (see Supplementary , “Outgroup identity with query sequences”). The resulting ingroup (IG) monophyly is fairly consistent across different outgroup samples. However, the inferred ancestral protein sequences vary considerably with different IG and OG samplings (Supplementary , tab “Summary”). [...]
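
The ingroup retrieval criteria stated above (at least 55% amino acid identity to the query, and removal of fragments of 265 residues or fewer) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the function names `parse_fasta` and `filter_hits` are hypothetical, and the per-sequence identity values are assumed to have been exported separately from the PSI-BLAST results.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {sequence_id: sequence}."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            current_id = line[1:].split()[0]
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}


def filter_hits(sequences, identity_to_query,
                min_identity=55.0, max_fragment_len=265):
    """Keep hits that meet the identity threshold and are not fragments.

    sequences         -- {sequence_id: protein sequence}
    identity_to_query -- {sequence_id: percent identity to the query};
                         assumed to come from the PSI-BLAST report
    min_identity      -- 55% threshold taken from the protocol text
    max_fragment_len  -- sequences of this length or shorter are dropped
    """
    kept = {}
    for seq_id, seq in sequences.items():
        identity = identity_to_query.get(seq_id)
        if identity is None or identity < min_identity:
            continue  # no identity recorded, or below the threshold
        if len(seq) <= max_fragment_len:
            continue  # fragment of 265 residues or fewer
        kept[seq_id] = seq
    return kept
```

Calling `filter_hits(parse_fasta(fasta_text), identities)` once per query (hT2R10 and hT2R46) would give the two ingroup candidate sets; both thresholds are exposed as parameters so that other cutoffs can be tried.
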
The 116 protein sequences obtained from the previous steps were first aligned with the MAFFT server (Katoh et al., ), followed by a quality assessment of the resulting multiple sequence alignment (MSA) using the GUIDANCE2 server (Sela et al., ). The final alignment was calibrated to preserve the consistency of pre-assigned BW numbers from previous studies (Brockhoff et al., ; Meyerhof et al., ; Born et al., ). The evolutionary tree was constructed using RAxML (Randomized Axelerated Maximum Likelihood) version 8, HPC BlackBox, on CIPRES (Stamatakis, ) (see Supplementary for MSA and phylogeny). We then inferred the ancestral sequences of these receptors using the FastML server (Ashkenazy et al., ). All parameters in these steps were left at their defaults. [...]

To add a temporal dimension to the reconstructed phylogenetic tree and thus infer which T2Rs preceded others, the relative divergence times of the branching points were estimated using the RelTime molecular dating method (Tamura et al., ) with the JTT matrix-based model (Jones et al., ) as implemented in MEGA 7 (Kumar et al., ). All parameters were kept at their defaults. [...]

Functional DNA sequences were first retrieved via BLASTn from the NCBI nucleotide collection database under default settings (Supplementary ). The selected 56 sequences (24 hT2R10-like and 28 hT2R46-like) had at least 80% identity with hT2R10 or hT2R46 and a length of at least 750 nucleotides. The selective forces acting on the key positions responsible for strychnine recognition were identified from the site-specific ratios of dN to dS using the Selecton server (Stern et al., ). Default settings were applied, except that the precision level was changed from “medium” to “high.” […]
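
To make the dN/dS idea concrete, the sketch below labels each codon difference between two aligned coding sequences as synonymous (same amino acid) or nonsynonymous (different amino acid) using the standard genetic code. It is a deliberately simplified stand-in and not the Selecton method, which estimates site-specific ratios with a model-based approach over the whole alignment and tree. The function name `classify_codon_changes` and the label strings are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases T, C, A, G at each position; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(seq_a, seq_b):
    """Label each aligned codon pair in two in-frame coding sequences.

    Returns one label per codon position: 'identical', 'synonymous',
    'nonsynonymous', or 'skipped' (gap or ambiguous base in either codon).
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal length")
    labels = []
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            labels.append("skipped")
        elif codon_a == codon_b:
            labels.append("identical")
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            labels.append("synonymous")
        else:
            labels.append("nonsynonymous")
    return labels
```

In a full dN/dS analysis these counts are normalized by the number of synonymous and nonsynonymous sites; a ratio above 1 at a site is conventionally read as evidence of positive selection, and a ratio below 1 as purifying selection.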

Pipeline specifications

Software tools: BLASTN, Selecton
Application: Population genetic analysis
Organisms: Homo sapiens
Chemicals: Strychnine