Computational protocol: An Alternative Strategy for Trypanosome Survival in the Mammalian Bloodstream Revealed through Genome and Transcriptome Analysis of the Ubiquitous Bovine Parasite Trypanosoma (Megatrypanum) theileri

Protocol publication

[…] Trypanosoma theileri was isolated from a primary cell culture derived from a cow from the northwest of England. Trypanosoma theileri was cultured in vitro as described previously, and genomic DNA was extracted using the Qiagen DNeasy Blood and Tissue kit. Isolated DNA was sequenced using a five-library Illumina approach at the Beijing Genomics Institute (www.genomics.cn/en/index; last accessed August 23, 2017). The number of reads, read length, and insert size of each library are shown in supplementary table S1, Supplementary Material online. Prior to assembly, reads were subjected to quality filtering using Trimmomatic to remove low-quality bases and read-pairs as well as contaminating adaptor sequences. Sequences were searched for all common Illumina adaptors (the default option), and the settings used for read processing by Trimmomatic were “LEADING:10 TRAILING:10 SLIDINGWINDOW:5:15 MINLEN:50”. The quality-filtered paired-end reads were then assembled using ALLPATHS-LG with the default program settings. The resulting assembly was subjected to 32 rounds of assembly error correction and gap filling using Pilon with the “--fix all” option and the expected ploidy set to diploid. All filtered 91-bp paired-end reads were mapped to this assembly using BWA-MEM, and read-pairs that did not map to the assembly were isolated and assembled separately using SGA with the default parameters. Contigs produced by SGA that were longer than 1,000 bp were added to the original assembly and subjected to iterative scaffolding using SSPACE. This process of identifying unmapped reads, assembling them, and scaffolding was repeated until no further contigs >1,000 bp were produced. The final draft assembly contained 319 sequences with an N50 of 515 kb, a total assembly length of 29.8 Mb, and an average coverage per assembled contig of ∼105×. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession NBCO00000000. The version described in this paper is version NBCO01000000. […]
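The N50 reported for the final assembly is a summary of sequence lengths alone: it is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The sketch below illustrates that definition in Python. The function name `n50` and the toy lengths are hypothetical and are not taken from the T. theileri assembly; the protocol does not state which tool was used to compute the statistic.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    # Walk from the longest sequence downwards, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that takes the running sum to at least
        # half of the total length defines the N50.
        if running >= half_total:
            return length


# Toy example with made-up contig lengths (total 300, half 150).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 80 + 70 = 150 reaches half, so this prints 70
```

Applied to the lengths of the 319 sequences in a FASTA file of the assembly, the same function would be expected to return a value close to the reported 515 kb, assuming the same definition was used.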

Pipeline specifications

Software tools: Trimmomatic, ALLPATHS-LG, Pilon, BWA, SSPACE
Databases: DDBJ
Application: De novo sequencing analysis
Diseases: Infection
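Trimmomatic, listed among the software tools above, was run with the settings LEADING:10 TRAILING:10 SLIDINGWINDOW:5:15 MINLEN:50. As a rough illustration of what those four steps do to a single read, the following Python sketch applies them in that order to a list of Phred quality scores. It is a simplified approximation and not Trimmomatic's own implementation: the function name `trim_read` is hypothetical, adaptor clipping is omitted, and edge cases (such as reads shorter than the window) may be handled differently by the real tool.

```python
def trim_read(quals, leading=10, trailing=10, window=5, window_q=15, minlen=50):
    """Return the (start, end) slice of the read to keep, or None if dropped.

    quals is a list of per-base Phred quality scores for one read.
    """
    start, end = 0, len(quals)

    # LEADING: remove bases from the start while quality is below the threshold.
    while start < end and quals[start] < leading:
        start += 1

    # TRAILING: remove bases from the end while quality is below the threshold.
    while end > start and quals[end - 1] < trailing:
        end -= 1

    # SLIDINGWINDOW: scan from the 5' end and cut the read at the first
    # window whose mean quality falls below window_q.
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < window_q:
            end = i
            break

    # MINLEN: discard reads that are too short after trimming.
    if end - start < minlen:
        return None
    return start, end


# Made-up quality profile: 3 poor leading bases, 60 good bases, 4 poor trailing bases.
example = [2] * 3 + [30] * 60 + [5] * 4
print(trim_read(example))  # (3, 63)
```

With these defaults, a read that drops to uniformly low quality part-way through is cut where the first low-quality window begins, and any read left with fewer than 50 bases is discarded.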