Computational protocol: The mitochondrial genome of booklouse, Liposcelis sculptilis (Psocoptera: Liposcelididae) and the evolutionary timescale of Liposcelis

Protocol publication

[…] SeqMan (DNAStar) was used to assemble the four overlapping nucleotide sequences, which were further confirmed by manual inspection. The protein-coding and rRNA genes were identified using ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) and BLAST searches against the GenBank database, respectively. All of these genes were then confirmed by alignment with homologous genes of other booklice and lice species. The transfer RNA genes were identified by their cloverleaf secondary structure using ARWEN with default parameters and tRNAscan-SE 1.21 with Search Mode = EufindtRNA-Cove, Genetic Code = Invertebrate Mito and Cove score cutoff = 0.1. The base composition was analyzed with MEGA 5. Sequences of mt genomes of other booklice and lice were retrieved from GenBank. [...]

Eight species from the Psocoptera and thirteen species from the Phthiraptera were included in our phylogenetic analysis. Two true bugs, Alloeorhynchus bakeri and Halyomorpha halys, were used as outgroups. Sequences of all mt protein-coding genes (PCGs) and rRNA genes except nad4, nad4L, nad2 and atp8 were used in the phylogenetic analysis. nad4L and atp8 were excluded because they are too short to align among the Psocodean species. nad4 and nad2 were excluded because they were not identified in the human pubic louse, Pthirus pubis, or in the elephant louse, Haematomyzus elephantis. Two alignments were used for the phylogenetic analyses: 1) a concatenated nucleotide sequence alignment of nine protein-coding genes and two rRNA genes; 2) a concatenated amino acid sequence alignment of nine protein-coding genes.

Nucleotide sequences of all protein-coding genes and rRNA genes were aligned using the default settings in ClustalW as implemented in MEGA 5. Amino acid sequences of the PCGs were also aligned in ClustalW. All of the alignments were then submitted to the Gblocks server (http://molevol.cmima.csic.es/castresana/Gblocks_server.html) to remove poorly aligned sites.
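The gene selection and concatenation described above can be sketched in a few lines of Python. This is only an illustration, not the authors' workflow: the starting list is the standard set of 13 protein-coding and 2 rRNA genes of an animal mt genome, while the `alignments` layout, the `concatenate` helper and the toy sequences are assumptions introduced here.

```python
# Standard animal mt gene set: 13 protein-coding genes and 2 rRNA genes.
PCGS = ["atp6", "atp8", "cob", "cox1", "cox2", "cox3",
        "nad1", "nad2", "nad3", "nad4", "nad4L", "nad5", "nad6"]
RRNAS = ["rrnL", "rrnS"]

# Genes excluded in the protocol, with the reason given in the text.
EXCLUDED = {
    "nad4L": "too short to align",
    "atp8": "too short to align",
    "nad4": "not identified in Pthirus pubis or Haematomyzus elephantis",
    "nad2": "not identified in Pthirus pubis or Haematomyzus elephantis",
}


def retained(genes):
    """Return the genes that are not on the exclusion list, in input order."""
    return [gene for gene in genes if gene not in EXCLUDED]


def concatenate(alignments, gene_order):
    """Join per-gene alignments into one concatenated alignment.

    alignments: {gene: {taxon: aligned_sequence}} (hypothetical layout).
    Every gene must contain the same taxa, and all sequences of a gene
    must have the same aligned length.
    """
    taxa = sorted(alignments[gene_order[0]])
    matrix = {taxon: "" for taxon in taxa}
    for gene in gene_order:
        block = alignments[gene]
        if sorted(block) != taxa:
            raise ValueError(f"{gene}: taxon set differs from the first gene")
        if len({len(seq) for seq in block.values()}) != 1:
            raise ValueError(f"{gene}: sequences are not aligned")
        for taxon in taxa:
            matrix[taxon] += block[taxon]
    return matrix


nucleotide_genes = retained(PCGS) + retained(RRNAS)  # 9 PCGs + 2 rRNA genes
amino_acid_genes = retained(PCGS)                    # 9 PCGs

# Toy data (made up) showing how two aligned genes are joined per taxon.
toy = {
    "cox1": {"taxonA": "ATG-", "taxonB": "ATGC"},
    "rrnL": {"taxonA": "TT", "taxonB": "T-"},
}
supermatrix = concatenate(toy, ["cox1", "rrnL"])
```

Checking the taxon set and aligned length of each gene before joining catches the most common concatenation mistake, a gene missing from one species that silently shifts every downstream column.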
The Gblocks server was run in ‘codons’, ‘DNA’ and ‘protein’ mode for the PCG nucleotide sequences, rRNA sequences and PCG amino acid sequences, respectively, with all options for a stringent selection chosen.

Subsequent analyses were performed on the combined dataset using maximum likelihood (ML) and Bayesian inference (BI). BI was performed using MrBayes 3.2 and ML was performed using RAxML 7.7.1. For ML, the GTRGAMMA model was selected for the concatenated datasets, with 1000 bootstrap replicates. For BI, the best-fitting nucleotide models were chosen using PartitionFinder V1.1.1 as follows: TIM + I + G: cox1; GTR + I + G: atp6, cob, cox2, cox3, nad1, nad3, nad5; HKY + I + G: nad6; TVM + I + G: rrnL, rrnS. The best-fitting amino acid models were chosen as follows: MtArt + I + G: cox1; MtArt + I + G + F: cox2, cox3, cob, atp6, nad1, nad3, nad5 and nad6. Two independent sets of Markov chains were run, each with one cold and three heated chains for 1 × 10^7 generations, and every 1000th generation was sampled. Convergence was inferred when the standard deviation of split frequencies fell below 0.01. The burninfrac for sump and sumt was set to 25% and contype was set to allcompat. [...]

We performed divergence date analyses based on the combined dataset of 11 mt genes of the Psocodean species. The molecular clock was calibrated using three minimum age constraints based on one fossil and two conclusions (100–145 Ma for the split between lice and Liposcelididae, 94–101 Ma for the split between Rhynchophthirina and Anoplura, and the ancestor of three human lice has been stable for at least 7 Ma). Analyses were performed using a relaxed molecular clock model in the Bayesian phylogenetic software BEAST 1.8.0. Rate variation among branches was modeled using uncorrelated lognormal relaxed clocks. A Yule speciation process was used for the tree prior, and posterior distributions of parameters, including the tree, were estimated using MCMC sampling.
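The partition-to-model assignment reported above for BI can be written as a lookup table, which makes it easy to check that every gene in the two datasets received exactly one model. The sketch below only restates the models listed in the text; the `invert` helper and the variable names are assumptions introduced for illustration, and no claim is made about how these models were entered into MrBayes.

```python
# Best-fitting nucleotide models per partition, as listed in the protocol.
NUCLEOTIDE_MODELS = {
    "TIM+I+G": ["cox1"],
    "GTR+I+G": ["atp6", "cob", "cox2", "cox3", "nad1", "nad3", "nad5"],
    "HKY+I+G": ["nad6"],
    "TVM+I+G": ["rrnL", "rrnS"],
}

# Best-fitting amino acid models per partition, as listed in the protocol.
AMINO_ACID_MODELS = {
    "MtArt+I+G": ["cox1"],
    "MtArt+I+G+F": ["cox2", "cox3", "cob", "atp6",
                    "nad1", "nad3", "nad5", "nad6"],
}


def invert(models):
    """Turn {model: [genes]} into {gene: model}, rejecting duplicate genes."""
    gene_to_model = {}
    for model, genes in models.items():
        for gene in genes:
            if gene in gene_to_model:
                raise ValueError(f"{gene} is assigned to more than one model")
            gene_to_model[gene] = model
    return gene_to_model


nucleotide_by_gene = invert(NUCLEOTIDE_MODELS)  # 9 PCGs + 2 rRNA genes
amino_acid_by_gene = invert(AMINO_ACID_MODELS)  # 9 PCGs

# The amino acid dataset should cover the nucleotide genes minus the rRNAs.
protein_coding = set(nucleotide_by_gene) - {"rrnL", "rrnS"}
assert protein_coding == set(amino_acid_by_gene)
```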
We performed two replicate MCMC runs, with the tree and parameter values sampled every 5000 steps over a total of 50 million generations. A maximum clade credibility tree was obtained using TreeAnnotator within the BEAST software package with a burn-in of 1000 trees. Acceptable sample sizes and convergence to the stationary distribution were checked using Tracer 1.5. […]
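The chain lengths, sampling intervals and burn-in settings quoted for MrBayes and BEAST translate into sample counts as worked through below. This is a back-of-the-envelope sketch: `sampling_summary` is a hypothetical helper, the count ignores the initial state that both programs also log, and the text does not say whether the BEAST burn-in of 1000 trees was applied per run or to combined runs, so the figures are given per run.

```python
def sampling_summary(chain_length, sample_every, burnin_samples):
    """Summarise how many samples one MCMC run yields and how many are kept."""
    samples = chain_length // sample_every
    return {
        "samples_per_run": samples,
        "burnin_samples": burnin_samples,
        "burnin_fraction": burnin_samples / samples,
        "retained_per_run": samples - burnin_samples,
    }


# MrBayes: 1 x 10^7 generations, sampled every 1000th, burninfrac = 25%.
mrbayes_samples = 10**7 // 1000
mrbayes = sampling_summary(10**7, 1000, int(0.25 * mrbayes_samples))

# BEAST: 50 million generations, sampled every 5000 steps, burn-in 1000 trees.
beast = sampling_summary(50_000_000, 5_000, 1_000)

for name, summary in (("MrBayes", mrbayes), ("BEAST", beast)):
    print(name, summary)
```

Under these assumptions each MrBayes run yields 10,000 samples of which 7,500 are retained, and each BEAST run yields 10,000 trees of which 9,000 are retained, so the 1000-tree burn-in corresponds to the first 10% of a run.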

Pipeline specifications

Software tools: Open Reading Frame Finder, ARWEN, tRNAscan-SE, MEGA, Clustal W, Gblocks, MrBayes, RAxML, PartitionFinder, BEAST
Applications: Genome annotation, Phylogenetics
Organisms: Liposcelis entomophila, Liposcelis decolor, Liposcelis paeta