Pipeline publication

[…] se-free DNase I (New England Biolabs) for 30 min at 37°C to remove residual DNA.

Beads with oligo(dT) were used to purify poly(A) mRNA from total RNA. The mRNA was then fragmented using an RNA fragmentation kit (Ambion). First-strand cDNA was synthesized using random hexamer primers and reverse transcriptase (Invitrogen), followed by second-strand cDNA synthesis. The paired-end cDNA library was then prepared in accordance with Illumina's protocols with an insert size of 200 bp and sequenced for 75 bp. The Illumina GA processing pipeline v0.2.2.6 was used for image analysis and base calling.

As no single k-mer length is optimal for all de novo transcriptome assemblies, a multiple k-mer method was used to obtain longer silver carp mRNA sequences, which are very useful in subsequent analysis steps. Our method is based on the modified ‘additive Multi-k’ method described by Yann Surget-Groba. After removing reads containing the sequencing adapter and reads of low quality, paired-end reads were subjected to de novo assembly using ABySS with k-mer lengths of 58, 54, 52, 50, 48, 46, 44, 42, 40, 38 and 34. Reads left unused at higher k-mer lengths were not discarded before running the assembly at a lower k-mer length. The output data set of each k-mer length was scaffolded separately with SSPACE. When all the results were pooled, some contigs and scaffolds appeared in two or more assemblies, causing redundancy. These were removed using CD-HIT-EST, and the longest possible contigs and scaffolds were retained. Finally, the STM+ method was used to perform translation mapping scaffolding with the Danio rerio proteome serving as a reference.

The assembled sequences were searched against the NCBI Nr (non-redundant) protein database and the Swiss-Prot database using BLASTX with an E-value cutoff of 1e−5. To shorten the search time, searches were limited to the first 10 significant hits for each query.
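As a rough illustration of the hit filtering just described, the Python sketch below applies the 1e−5 E-value cutoff and keeps at most the first 10 hits per query. It assumes the hits have already been parsed into records in the order BLASTX reported them. The `Hit` record and the `filter_hits` function are hypothetical names chosen for this example and are not part of BLASTX or of the original pipeline. Treating the cutoff as inclusive (<=) is also an assumption.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One parsed BLASTX hit (hypothetical record layout)."""
    query: str
    subject: str
    evalue: float
    bitscore: float


def filter_hits(hits, max_evalue=1e-5, max_hits=10):
    """Keep hits with E-value <= max_evalue, at most max_hits per query.

    Input order is preserved, so the "first" hits are the first ones
    that appear in the parsed BLAST output.
    """
    kept = defaultdict(list)
    for hit in hits:
        if hit.evalue <= max_evalue and len(kept[hit.query]) < max_hits:
            kept[hit.query].append(hit)
    return dict(kept)


if __name__ == "__main__":
    example = [
        Hit("contig1", "P1", 1e-30, 200.0),
        Hit("contig1", "P2", 1e-3, 40.0),   # fails the E-value cutoff
        Hit("contig2", "P3", 1e-6, 55.0),
    ]
    for query, query_hits in filter_hits(example).items():
        print(query, [h.subject for h in query_hits])
```

Queries with no hit passing the cutoff are simply absent from the returned dictionary, which makes it easy to count unannotated sequences afterwards.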
Gene names were assigned to each sequence according to its best BLAST hit (highest score).

The Blast2GO suite was used for functional annotation of the assembled sequences. Gene ontology (GO) terms were mapped to sequences with BLAST hits at an E-value < 1e−5, and an annotation cut-off > 55 and a GO weight > 5 were used for annotation. Assembled sequences were thus assigned to primary and sub-GO functional categories.

A microsatellite program (MISA) was used to identify and localize microsatellite motifs. We searched for all types of simple sequence repeats (SSRs), from mononucleotides to hexanucleotides, using the following parameters: at least 10 repeats for mono-, 6 repeats for di- and 5 repeats for tri- […]
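The repeat thresholds listed above can be illustrated with a small regular-expression scan. This is a minimal sketch and not MISA itself. It covers only the mono-, di- and trinucleotide thresholds stated in the text (10, 6 and 5 repeats) and leaves the longer motif classes out because their thresholds are not given here. The names `MIN_REPEATS` and `find_ssrs` are hypothetical.

```python
import re

# Minimum repeat counts per motif length, taken from the text above.
# Only the three thresholds that are actually stated are included.
MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def _pattern(unit_len, min_rep):
    """Build a regex matching a motif of unit_len bases repeated >= min_rep times."""
    if unit_len == 1:
        unit = r"([ACGT])"
    else:
        # The negative lookahead rejects motifs made of a single repeated base
        # (e.g. "AA", "AAA"), which are mononucleotide runs, not di/tri repeats.
        unit = r"((?!([ACGT])\2{%d})[ACGT]{%d})" % (unit_len - 1, unit_len)
    return re.compile(unit + r"\1{%d,}" % (min_rep - 1))


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    found = []
    for unit_len, min_rep in MIN_REPEATS.items():
        for match in _pattern(unit_len, min_rep).finditer(seq):
            repeats = len(match.group(0)) // unit_len
            found.append((match.start(), match.group(1), repeats))
    return sorted(found)


if __name__ == "__main__":
    for start, motif, repeats in find_ssrs("TT" + "CAG" * 5 + "TT"):
        print(start, motif, repeats)
```

Start positions are 0-based. The scan reports the first reading frame in which a repeat is found, so the same microsatellite may be reported under a rotated motif (for example GCA instead of CAG) depending on the flanking bases.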

Pipeline specifications

Software tools: STM, ABySS, SSPACE, CD-HIT, BLASTX, Blast2GO