Computational protocol: The Complete Mitochondrial Genome of the Booklouse, Liposcelis decolor: Insights into Gene Arrangement and Genome Organization within the Genus Liposcelis

Protocol publication

[…] SeqMan (DNAStar) was used to assemble the two overlapping nucleotide sequences, and the assembly was confirmed by manual inspection. Protein-coding genes were identified with ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) and rRNA genes by BLAST searches against the GenBank database. All of these genes were then confirmed by alignment with homologous genes from other louse and booklouse species. The transfer RNA genes were identified by their cloverleaf secondary structure using ARWEN with default parameters and tRNAscan-SE 1.21 with Search Mode = “EufindtRNA-Cove”, Genetic Code = “Invertebrate Mito” and Cove score cutoff = 0.1. The stem-loop secondary structures of the putative control regions were predicted using the Mfold Server under the RNA folding option with default parameters. Base composition and codon usage were analyzed with BioEdit (http://bioedit.software.informer.com/) and DAMBE 5.3.9. Sequences of mt genomes of other lice were retrieved from GenBank and MitoZoa. [...]

Phylogenetic analyses were conducted with the 11 Psocodea mt genome sequences currently available in GenBank, including the new booklouse sequence obtained in this study. The mt genome sequence of the fruit fly, Drosophila melanogaster, served as the outgroup. Sequences of atp8, nad4L, and the tRNA genes were too short and too variable to be aligned reliably among the psocodean species; these genes were therefore excluded from the phylogenetic analyses. The nad4 gene was also excluded because it has not been identified in the human pubic louse, P. pubis. The amino acid sequence of each protein-coding gene and the nucleotide sequence of each rRNA gene were aligned with MAFFT v7. The nucleotide sequences of each protein-coding gene were then aligned on the basis of the corresponding amino acid alignments using PAL2NAL to preserve the correct reading frame; poorly aligned sites were removed with GUIDANCE using the default settings.
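The codon-aware alignment step can be illustrated with a minimal Python sketch. It is not PAL2NAL itself: it only shows the underlying idea of threading an unaligned coding sequence onto its aligned protein sequence, so that every alignment gap becomes a three-nucleotide gap and the reading frame is preserved. The function name `thread_codons` and the toy sequences are hypothetical. The sketch assumes the coding sequence starts in frame, and it does not check that each codon actually translates to the aligned residue.

```python
def thread_codons(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Project an unaligned coding sequence onto its aligned protein sequence.

    Each residue in the protein alignment consumes one codon from `cds`;
    each gap in the protein alignment is written as three gap characters.
    Nucleotides left over after the last residue (for example a complete or
    partial stop codon) are ignored.
    """
    residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) < 3 * residues:
        raise ValueError(
            f"coding sequence has {len(cds)} nt but {residues} residues "
            f"need at least {3 * residues} nt"
        )

    pieces = []
    position = 0  # index of the next unread nucleotide in cds
    for aa in aligned_protein:
        if aa == gap:
            pieces.append(gap * 3)
        else:
            pieces.append(cds[position:position + 3])
            position += 3
    return "".join(pieces)


if __name__ == "__main__":
    # Toy data (not from the study): one gap in the protein alignment.
    print(thread_codons("M-KL", "ATGAAATTA"))  # ATG---AAATTA
```

Because gaps are only ever inserted in multiples of three, codon positions stay identifiable in the resulting nucleotide alignment, which matters for any later step that treats first, second and third codon positions separately.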
Then, positions with gaps in more than half of the species were removed. Substitution saturation of the nucleotide sequences was examined using DAMBE 5.3.9 following Xia et al. Whole protein-coding gene (PCG) sequences were carried forward to the next step if Iss (index of substitution saturation) was significantly lower than Iss.c (critical value for a symmetrical tree topology) (P < 0.05). All of the protein-coding and rRNA genes, except nad3 and nad6, passed this test. Consequently, the third codon positions of nad3 and nad6 were excluded from the phylogenetic analyses. The best-fit models for the nucleotide and amino acid alignments were determined using the Akaike Information Criterion in jModelTest 0.1.1 and ProtTest 3, respectively. Specifically, the GTR+I+G model was chosen for the nucleotide sequence dataset and the MtREV+I+G model for the amino acid sequence dataset. Phylogenetic trees were estimated by Bayesian inference (BI) using MrBayes v3.12. Four independent Markov chains were run simultaneously for 2,000,000 generations with a heating scheme (temp = 0.2). Trees were sampled every 100 generations (sample-freq = 100); the first 25% of the generations were discarded as burn-in, and the remaining samples were used to compute the consensus tree. Stationarity was considered to be reached when the average standard deviation of split frequencies was below 0.01. […]
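Two of the bookkeeping steps in this paragraph can be made concrete with a short Python sketch: the gap filter (columns with gaps in more than half of the species are dropped) and the number of sampled trees left after burn-in. The function names, the dictionary-of-strings alignment format and the toy alignment are assumptions made for illustration; the authors' actual tooling for the gap filter is not stated in the text.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.5, gap="-"):
    """Remove columns whose gap fraction exceeds max_gap_fraction.

    `alignment` maps a sequence name to an aligned sequence; all sequences
    must have the same length. A column with gaps in exactly half of the
    sequences is kept, since the rule is "more than half".
    """
    if not alignment:
        return {}
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(alignment)
    n_cols = lengths.pop()
    keep = [
        col for col in range(n_cols)
        if sum(seq[col] == gap for seq in alignment.values()) / n_seqs
        <= max_gap_fraction
    ]
    return {name: "".join(seq[col] for col in keep)
            for name, seq in alignment.items()}


def retained_samples(generations, sample_freq, burnin_fraction):
    """Return (total, discarded, retained) sampled trees for one run.

    Counts samples as generations // sample_freq; the count a program
    reports may differ by one if the starting state is also recorded.
    """
    total = generations // sample_freq
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


if __name__ == "__main__":
    toy = {"sp1": "AC-T", "sp2": "A--T", "sp3": "AG-T", "sp4": "A-GT"}
    print(drop_gappy_columns(toy))  # third column (3 of 4 gaps) is removed
    print(retained_samples(2_000_000, 100, 0.25))  # (20000, 5000, 15000)
```

With the settings quoted above (2,000,000 generations, sampling every 100, 25% burn-in), roughly 20,000 trees are sampled per run, 5,000 are discarded and about 15,000 contribute to the consensus tree.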
