Computational protocol: Genomic insights into Wnt signaling in an early diverging metazoan, the ctenophore Mnemiopsis leidyi

Protocol publication

[…] Mnemiopsis genomic DNA was collected from the self-fertilized spawning of two separate adult animals. One pool of genomic DNA was used to construct a library for 454 sequencing, and the other was used for Illumina paired-end sequencing. The 454 sequencing produced 8.1 million reads (2.7 Gb), which were assembled into contigs using the Phusion assembler. The Illumina run produced 2.8 million paired-end reads, which were combined with the 454 data to generate 5,100 scaffolds (scaffold N50 of 187 kb), for a total coverage of ~50×.

The Mnemiopsis genome was scanned in silico for genes of interest using a reciprocal BLAST approach. Human, frog, Drosophila and Nematostella orthologs were used as queries for TBLASTN searches. Candidate matches were then used in BLASTP searches of the human genome to find the closest hit. If the closest match was not the original ortholog, or if the E-value was greater than 0.001, the gene was coded as absent from the genome. A gene model was created by scanning the genomic region using Genscan. The predicted protein sequence was then searched for conserved Pfam domains using SMART. For certain genes of interest, gene-specific primers were designed for RACE PCR (MacVector, Cary, NC, USA). RACE PCR fragments were then conceptually spliced and aligned back to genomic contigs to compare exon-intron boundaries, using Sequencher (Gene Codes, Ann Arbor, MI, USA).

[…] The Mnemiopsis predicted amino acid sequences were aligned with the sequences of other organisms. The predicted domains or regions of interest were trimmed and aligned using Muscle, then corrected by hand for alignment errors (see Additional files). Bayesian phylogenetic analyses were performed using MrBayes 3.1.2 with the 'mixed' amino acid model, in four independent runs of 5 million generations each, sampled every 100 generations with four chains.
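The presence/absence rule of the reciprocal BLAST step (TBLASTN of known orthologs against the genome, then BLASTP of each candidate back against human) can be sketched as follows. This is a minimal illustration, not the authors' code. The `BlastHit` record, the `is_present` function and all identifiers in the usage lines are hypothetical. The protocol does not say which search the 0.001 E-value cutoff applies to; the sketch assumes it applies to the reverse (BLASTP) hit.

```python
from dataclasses import dataclass
from typing import Optional

E_VALUE_CUTOFF = 0.001  # threshold stated in the protocol


@dataclass
class BlastHit:
    """Best hit from one BLAST search (hypothetical record layout)."""
    query_id: str
    subject_id: str
    evalue: float


def is_present(
    expected_human_id: str,
    forward_hit: Optional[BlastHit],
    reverse_hit: Optional[BlastHit],
    cutoff: float = E_VALUE_CUTOFF,
) -> bool:
    """Return True if the gene is coded as present in the genome.

    forward_hit: best TBLASTN hit of the query ortholog in the genome.
    reverse_hit: best BLASTP hit of that candidate against human proteins.
    Assumption: the cutoff is applied to the reverse hit only.
    """
    if forward_hit is None or reverse_hit is None:
        return False  # no candidate, or no hit back in human
    if reverse_hit.evalue > cutoff:
        return False  # closest human match is too weak
    # The closest human match must be the ortholog we started from.
    return reverse_hit.subject_id == expected_human_id


# Usage with made-up identifiers and E-values.
fwd = BlastHit("human_gene_A", "scaffold_x", 1e-30)
rev_ok = BlastHit("candidate_1", "human_gene_A", 1e-25)
rev_other = BlastHit("candidate_1", "human_gene_B", 1e-25)
rev_weak = BlastHit("candidate_1", "human_gene_A", 0.01)

print(is_present("human_gene_A", fwd, rev_ok))     # True
print(is_present("human_gene_A", fwd, rev_other))  # False
print(is_present("human_gene_A", fwd, rev_weak))   # False
```

Keeping the rule as a pure function over already-parsed hits leaves the BLAST invocation and output parsing outside the sketch, so no command-line flags or file formats are assumed.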
A summary consensus tree was produced in MrBayes from the last 49,000 trees of each run (196,000 trees in total), representing 4,900,000 stationary generations per run. Posterior probabilities were calculated from this consensus. Maximum likelihood analyses were performed using PhyML with the WAG model and 1,000 bootstrap replicates. Alignments and nexus files are available upon request. […]
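The tree counts quoted for the Bayesian analysis follow from the run settings, and the arithmetic can be checked with a short sketch. The function name and dictionary keys are illustrative only, and the calculation ignores any sample taken at generation 0.

```python
def mcmc_sample_summary(generations, sample_every, n_runs, trees_kept_per_run):
    """Derive burn-in and retained-sample counts from MCMC run settings."""
    trees_per_run = generations // sample_every
    burnin_trees = trees_per_run - trees_kept_per_run
    return {
        "trees_per_run": trees_per_run,
        "burnin_trees_per_run": burnin_trees,
        "burnin_generations_per_run": burnin_trees * sample_every,
        "kept_generations_per_run": trees_kept_per_run * sample_every,
        "total_trees_kept": trees_kept_per_run * n_runs,
    }


# Settings stated in the protocol: 4 runs of 5 million generations,
# sampled every 100 generations, last 49,000 trees of each run kept.
summary = mcmc_sample_summary(
    generations=5_000_000,
    sample_every=100,
    n_runs=4,
    trees_kept_per_run=49_000,
)
for key, value in summary.items():
    print(f"{key}: {value:,}")
```

With these settings each run yields 50,000 sampled trees, so keeping the last 49,000 corresponds to discarding the first 1,000 trees (100,000 generations) of each run as burn-in, and the four runs together contribute 196,000 trees.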

Pipeline specifications

Software tools: Phusion, TBLASTN, BLASTP, GENSCAN, MacVector, Sequencher, MrBayes, PhyML
Databases: Pfam
Applications: Phylogenetics, Amino acid sequence alignment
Organisms: Mnemiopsis leidyi