Computational protocol: Two Goose-Type Lysozymes in Mytilus galloprovincialis: Possible Function Diversification and Adaptive Evolution

Protocol publication

[…] Searches for nucleotide and protein sequence similarities were performed with the BLAST algorithm (http://www.ncbi.nlm.nih.gov/blast). Multiple alignments were conducted with the ClustalW program (http://www.ebi.ac.uk/clustalw/). The deduced protein sequences were analyzed with ExPASy (http://www.expasy.org/). The signal peptide was predicted with the SignalP 4.0 server (http://www.cbs.dtu.dk/services/SignalP/). Repeated gene sequences were identified by running the Tandem Repeats Finder program (http://tandem.bu.edu/trf/trf.html). Putative disulfide bonds were predicted using the DiANNA 1.1 web server (http://clavius.bc.edu/~clotelab/DiANNA/). The PSIPRED Protein Structure Prediction Server (http://bioinf.cs.ucl.ac.uk/psipred/) was used to predict the secondary structure. The three-dimensional structures of MGgLYZ1 and MGgLYZ2 were predicted with SWISS-MODEL (http://swissmodel.expasy.org/workspace) based on the crystal structures of g-type lysozyme from Atlantic cod (PDB IDs: 3mgw and 3gxr). The reliability of the modeled structures was validated by Ramachandran plot analyses using PROCHECK (http://nihserver.mbi.ucla.edu/SAVES/). A phylogenetic tree of gLYZs was constructed with MEGA 4.1 software using the neighbor-joining (NJ) method. Bootstrap analysis with 1000 replicates was used to test repeatability. A Bayesian phylogenetic tree of mollusk gLYZs was generated from coding sequences with MrBayes 3.1.2. The optimal model of DNA sequence evolution was selected using the jModelTest 0.1.1 package. […]
For polymorphism detection in the coding regions of MGgLYZ1 and MGgLYZ2, a total of 18 adult M. galloprovincialis collected from three geographic origins (Qingdao, Yantai and Dalian) were used. During the experiment, the mussels were cultured as described above for the acclimatization period. No specific permits were required for the described field studies. The challenge experiment was conducted according to the method described by Schmitt et al.
For each geographic population, 50 µl of live Micrococcus luteus (1×10⁷ CFU mL⁻¹) or V. anguillarum (1×10⁷ CFU mL⁻¹) was injected into the adductor muscle. The hemocytes and hepatopancreas of six mussels from each geographic origin (18 individuals in total) under each treatment were then sampled at 96 h post challenge. Total RNA from each tissue (72 samples in total) was immediately extracted and subjected to reverse transcription and PCR amplification. The PCR amplification was conducted with Pfu DNA polymerase. The primers used to amplify the coding regions were shown in . The PCR products were cloned into the pMD-18T simple vector (Takara, Japan) and transformed into competent cells of E. coli Top 10 F′. One positive clone of each sample was sequenced in both directions on an ABI3730 Automated Sequencer (Applied Biosystems, USA) by the Chinese National Human Genome Center (SinoGenoMax). The coding regions of MGgLYZ1 (72 positive clones) and MGgLYZ2 (70 positive clones; two samples gave no positive amplification) were sequenced. The nucleotide sequences encoding the amino acids of MGgLYZ1 (72 sequences) and MGgLYZ2 (70 sequences) were each used to construct an NJ phylogenetic tree under the Kimura 2-parameter model. The reliability of the interior branches of each phylogeny was assessed with 1000 bootstrap replicates. Each phylogeny was used to estimate the nonsynonymous-to-synonymous rate ratio (ω = dN/dS) by the maximum likelihood (ML) method implemented in the CODEML program of the PAML 4.4 software package. Positive selection can be inferred from a higher rate of nonsynonymous than synonymous substitutions per site (dN/dS > 1).
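As an illustration of the Kimura 2-parameter model named above for the NJ trees, the following minimal sketch computes the pairwise distance d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of sites showing transitions and transversions. It is not the MEGA implementation. The function name `k2p_distance` and the choice to skip any site containing a gap or ambiguous base are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Assumption for this sketch: sites where either sequence has a gap or
    an ambiguous base are skipped (pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue  # gap or ambiguity code: ignore this site
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition;
        # any other mismatch is a transversion.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    arg = (1 - 2 * p - q) * math.sqrt(max(0.0, 1 - 2 * q))
    if arg <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(arg)
```

A distance matrix built from such pairwise values is the input that a neighbor-joining algorithm clusters into a tree.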
Likelihood ratio tests (LRTs) were used to determine whether any codon positions were subject to positive selection, as indicated by ω > 1. To test for heterogeneous selective pressure at amino acid sites, the following site-specific models were compared: M0 (one-ratio) against M3 (discrete), M1a (nearly neutral) against M2a (positive selection), and M7 (beta) against M8 (beta & ω). The M0 (one-ratio) model assumes the same ω value for the entire tree. The M3 (discrete) model uses a general discrete distribution with three site classes, with the proportions p0, p1, and p2 and the ω ratios ω0, ω1, and ω2. The M1a model estimates a single parameter, p0, with ω0 = 0, and the remaining sites with frequency p1 (p1 = 1 − p0) assuming ω1 = 1. The M7 model assumes a beta distribution for the ω values between 0 and 1. M1a and M7 are null models that do not allow for any codons with ω > 1, while M2a and M8 are more general models that do. The LRTs between nested models were conducted by comparing twice the difference of the log-likelihood values (2ΔL) between the two models with the χ² distribution (df = 2). The Naive Empirical Bayes (NEB) method and the Bayes empirical Bayes (BEB) method were used to calculate the posterior probabilities that each codon is from the site class of positive selection under models M3, M2a and M8, respectively. […]
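The LRT described above can be sketched in a few lines: the statistic is twice the log-likelihood difference between the general and the null model, and its p-value is the upper tail of the χ² distribution. The sketch below is not part of PAML, and its function names are hypothetical. It uses the closed-form χ² tail probability that holds for even degrees of freedom, which covers the df = 2 used in the text. Any log-likelihood values passed to it would come from CODEML output.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable, even df only.

    For df = 2m the tail is exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df=2):
    """Return (2*delta_lnL, p-value) for a pair of nested models.

    lnl_null: log-likelihood of the null model (e.g. M1a or M7)
    lnl_alt:  log-likelihood of the general model (e.g. M2a or M8)
    """
    # A negative difference can only arise from optimisation noise,
    # so the statistic is clamped at zero.
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return statistic, chi2_sf_even_df(statistic, df)
```

For df = 2 the p-value reduces to exp(−2ΔL/2), so a call such as `likelihood_ratio_test(lnl_m7, lnl_m8)` rejects the null model at the 5% level whenever 2ΔL exceeds about 5.99.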

Pipeline specifications

Software tools: Clustal W, SignalP, TRF, DiANNA, PSIPRED, PROCHECK, MrBayes, jModelTest, MUSCLE, PAML
Databases: ExPASy
Applications: Phylogenetics, Protein structure analysis, Nucleotide sequence alignment
Organisms: Mytilus galloprovincialis, Pseudomonas putida