Computational protocol: The Drosophila melanogaster methuselah Gene: A Novel Gene with Ancient Functions

[…] tBlastn searches were performed at FlyBase against the 12 publicly available, annotated Drosophila genomes, using the D. melanogaster Methuselah protein (AAF47379.2) as the query. All retrieved sequences are annotated in FlyBase as mth-like sequences. Accession numbers are given in the figures showing the phylogenetic results (see results section). The same procedure was used to identify one mth-like sequence in Daphnia pulex (see results). [...]

Phylogenetic analyses were performed using the ADOPS pipeline. In ADOPS, nucleotide sequences are first translated and the amino-acid sequences aligned; the amino-acid alignment is then used as a guide to align the nucleotide sequences. In order to determine how the choice of a given alignment algorithm influences the phylogenetic reconstruction, three separate analyses were performed using the ClustalW2, MUSCLE and T-Coffee alignment algorithms as implemented in T-Coffee. Note that ADOPS uses only codons with a support value above two for phylogenetic reconstruction. Alignments of the mth-like copies of D. melanogaster resulting from MUSCLE and ClustalW2 were tested for nucleotide substitution saturation by plotting the observed number of transitions and transversions against the genetic distance (F84) as implemented in DAMBE v. 5.3.15.

Bayesian trees were obtained using MrBayes 3.1.2 as implemented in the ADOPS pipeline. The model of sequence evolution was GTR, allowing for among-site rate variation and a proportion of invariable sites. Third codon positions were allowed to have a gamma distribution shape parameter different from that of first and second codon positions. Two independent runs of 2,000,000 generations with four chains each (one cold and three heated chains) were set up. The average standard deviation of split frequencies was always about 0.01 and the potential scale reduction factor for every parameter about 1.00, indicating that convergence had been achieved.
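The step in which the amino-acid alignment guides the nucleotide alignment can be illustrated with a minimal sketch. This is not ADOPS code: the function name `thread_codons` and the toy sequences are hypothetical, and the sketch assumes each coding sequence is in frame and has exactly one codon per residue in the corresponding aligned protein row.

```python
def thread_codons(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Map an ungapped coding sequence onto a gapped protein alignment row.

    Every residue in the protein row consumes one codon from the CDS;
    every gap in the protein row becomes a three-character gap.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) % 3 != 0 or len(codons) != n_residues:
        raise ValueError("CDS length does not match the aligned protein row")

    next_codon = iter(codons)
    pieces = []
    for aa in aligned_protein:
        pieces.append(gap * 3 if aa == gap else next(next_codon))
    return "".join(pieces)


# Hypothetical toy input: two protein rows from some aligner, plus their CDSs.
protein_alignment = {"seqA": "MK-L", "seqB": "MKTL"}
cds = {"seqA": "ATGAAACTG", "seqB": "ATGAAGACCCTG"}

codon_alignment = {
    name: thread_codons(protein_alignment[name], cds[name]) for name in cds
}
for name, row in codon_alignment.items():
    print(name, row)
```

Because gaps are only ever inserted in multiples of three, the reading frame is preserved, which is what allows first, second and third codon positions to be treated separately downstream.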
Trees were sampled every 100th generation and the first 5000 samples were discarded (burn-in). The remaining trees were used to compute the Bayesian posterior probabilities of each clade of the consensus tree. A BI (Bayesian inference) analysis was performed with each alignment (obtained with ClustalW2 and MUSCLE) as input. The sequence of the related gene cirl was used as outgroup, as in Patel et al. (see also the results section). The two alternative BI phylogenetic trees obtained with each alignment were compared using the Approximately Unbiased (AU) test as implemented in the program CONSEL v0.1j. [...]

Mth ectodomain theoretical models were obtained using the I-TASSER server. For each pair of models, the highest TM-score obtained from a structural alignment with the TM-align algorithm was subtracted from one, and the resulting values were used to build a distance matrix. This matrix was used to build a UPGMA tree using MEGA5. [...]

The D. americana F2 association study using strains H5 and W11, described in detail elsewhere, was used to test for associations between variation at the GJ12490 orthologous gene and developmental time, abdominal size and longevity. The first trait to be measured was developmental time. For this purpose, each of the 83 second generation crosses (F1) was transferred to a new flask every day in order to obtain the precise period of time between oviposition and adult emergence. The resulting F2 males were then individually collected. When F2 males were 10 days old (young adult flies), individual chill-coma recovery times were measured at +25°C after four hours of cold exposure at 0°C. A fly was considered completely recovered once it was able to stand up on its legs. Individual photographs were taken when individuals were 20 days old, using a stereomicroscope Nikon ZMS 1500 H. The resulting JPG files were saved with a resolution of 1600×200 pixels.
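The conversion of TM-scores into a distance matrix described for the ectodomain models can be sketched as follows. TM-align reports two TM-scores per structural alignment (one normalised by each structure's length), and the sketch takes the larger of the two as "the highest TM-score". The function name `tm_distance_matrix`, the model names and all score values are hypothetical, for illustration only.

```python
def tm_distance_matrix(names, tm_scores):
    """Build a symmetric distance matrix with d = 1 - max(TM-score).

    tm_scores maps a pair (a, b) to the two TM-scores reported for that
    structural alignment (normalised by the length of a and of b).
    """
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    dist = [[0.0] * n for _ in range(n)]
    for (a, b), scores in tm_scores.items():
        d = 1.0 - max(scores)
        i, j = index[a], index[b]
        dist[i][j] = d
        dist[j][i] = d
    return dist


# Hypothetical model names and TM-scores (not values from the study).
models = ["model_1", "model_2", "model_3"]
scores = {
    ("model_1", "model_2"): (0.82, 0.78),
    ("model_1", "model_3"): (0.55, 0.60),
    ("model_2", "model_3"): (0.58, 0.57),
}

matrix = tm_distance_matrix(models, scores)
for name, row in zip(models, matrix):
    print(name, [round(d, 2) for d in row])
```

A matrix of this kind could then be passed to any UPGMA (average-linkage) implementation; structurally similar models (TM-score near 1) end up at distances near 0.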
Relative abdominal size was estimated by counting the number of pixels in the picture that correspond to this structure, using Adobe Photoshop. The flies were then transferred to new vials and kept until they died, in order to measure lifespan. Only males were used in order to avoid potential confounding effects caused by differences between sexes for the traits being studied. Out of 975 F2 D. americana males descended from three F0 H5 x W11 crosses (named crosses A, B and C), 453 showing extreme phenotypes were selected. Individuals with at least two phenotypic values in the second third of the distribution were excluded; the phenotypes considered were developmental time, chill-coma recovery time, abdominal size and lifespan.

In this experiment isofemale rather than isogenic strains were used. Therefore, we sequenced a short fragment of the Mth ectodomain identified in the GJ12490 gene in order to check for segregating polymorphisms within strains. The F0 individuals used were screened by direct sequencing of the amplification products obtained with primers Mth29_nsyn_F (TGCTAACACTGCTATTTCTA) and Mth29_nsyn_R (GCGTGATGACCGTTTTGT) using standard PCR conditions with an annealing temperature of 52°C. Amplification products were purified using the Gel Extraction kit from QIAGEN (Izasa Portugal, Lda.). Sequencing was performed using the ABI PRISM Big Dye cycle-sequencing kit version 1.1 (Perkin Elmer, CA, USA) and primers Mth29_nsyn_F and Mth29nsyn_seqR (CGTGCGTTCATTGCTGTC). Sequencing runs were performed by STABVIDA (Lisbon, Portugal). Given the evidence for heterozygosity in cross A, only crosses B and C were used. In the sequenced region, the F0 individuals used in crosses B and C differ only at one putative, highly conserved N-glycosylation amino acid site of the Mth ectodomain of the protein encoded by the GJ12490 orthologous gene.
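One possible reading of the extreme-phenotype selection rule (exclude any individual with at least two trait values in the second third of the distribution) is sketched below. The tercile boundaries are computed per trait over all individuals, which is an assumption; the text does not say how ties or boundaries were handled. The function names, trait keys and the six toy records are hypothetical.

```python
TRAITS = ["dev_time", "chill_coma", "abdomen", "lifespan"]  # hypothetical keys


def middle_third_bounds(values):
    """Return the smallest and largest value falling in the second third."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3 - 1]


def select_extremes(records, traits):
    """Keep individuals with fewer than two traits in the middle third."""
    bounds = {t: middle_third_bounds([r[t] for r in records]) for t in traits}
    kept = []
    for r in records:
        n_middle = sum(bounds[t][0] <= r[t] <= bounds[t][1] for t in traits)
        if n_middle < 2:
            kept.append(r)
    return kept


# Hypothetical toy data: each trait takes the ranks 1..6 across six flies,
# so the middle third of every trait is the values 3 and 4.
flies = [
    {"id": "f1", "dev_time": 1, "chill_coma": 1, "abdomen": 1, "lifespan": 1},
    {"id": "f2", "dev_time": 3, "chill_coma": 3, "abdomen": 2, "lifespan": 2},
    {"id": "f3", "dev_time": 4, "chill_coma": 4, "abdomen": 3, "lifespan": 3},
    {"id": "f4", "dev_time": 2, "chill_coma": 2, "abdomen": 4, "lifespan": 6},
    {"id": "f5", "dev_time": 6, "chill_coma": 5, "abdomen": 5, "lifespan": 4},
    {"id": "f6", "dev_time": 5, "chill_coma": 6, "abdomen": 6, "lifespan": 5},
]

selected = select_extremes(flies, TRAITS)
print([fly["id"] for fly in selected])
```

In this toy set, f2 and f3 are dropped because two or more of their trait values fall in the middle third, while a fly with a single intermediate value (such as f4) is retained.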
Since there was no restriction enzyme available to type the difference at the N-glycosylation amino acid site, a molecular marker in the close vicinity was used to follow this difference in the F2 individuals. A 697 bp genomic region was amplified using standard PCR conditions and primers Mth_29_F (GTTCTTTCCGAGCAGCAA) and Mth_29_R (CAGAGCACACAGCAGAGC) with an annealing temperature of 53°C. The PCR products were then digested with the restriction enzyme Sau3AI and typed as 0 (undigested), 1 (completely digested) and 0/1 (heterozygous).

All statistical tests and summary statistics were computed using the software SPSS Statistics 17.0 (SPSS Inc., Chicago, Illinois). Linear regression analyses (including a constant) were performed in order to estimate the percentage of variation in developmental time, lifespan and size that can be explained by the common amino acid polymorphism at the D. americana GJ12490 orthologous gene. This may be an overestimate since, as noted above, we used only males showing extreme phenotypes for developmental time, chill-coma recovery time, abdominal size and lifespan. […]
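The regression step can be illustrated with a small sketch of ordinary least squares with a constant, where the coefficient of determination (R²) gives the fraction of trait variation explained by genotype. The analysis itself was run in SPSS; the numeric coding of the Sau3AI marker classes used here (0, 1, 2 copies of the digested allele) is an assumption, as are the function name `fit_with_constant` and the toy lifespan values.

```python
# Assumed additive coding of the marker classes (not stated in the text).
GENOTYPE_CODE = {"0": 0.0, "0/1": 1.0, "1": 2.0}


def fit_with_constant(x, y):
    """Least-squares fit of y = intercept + slope * x; returns R^2 as well."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1.0 - ss_res / syy


# Hypothetical toy data: marker class and lifespan (days) for six F2 males.
genotypes = ["0", "0", "0/1", "0/1", "1", "1"]
lifespan = [50.0, 54.0, 56.0, 60.0, 62.0, 66.0]

x = [GENOTYPE_CODE[g] for g in genotypes]
slope, intercept, r2 = fit_with_constant(x, lifespan)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.3f}")
```

With a single predictor, R² does not depend on how the three genotype classes are scaled, as long as the coding stays linear; it would change under a dominant or recessive coding.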

Pipeline specifications

Software tools: TBLASTN, Clustal W, T-Coffee, DAMBE, MrBayes, CONSEL, I-TASSER, MEGA, SPSS
Databases: FlyBase
Applications: Miscellaneous, Phylogenetics, Amino acid sequence alignment
Organisms: Drosophila melanogaster, Drosophila virilis