Pipeline publication

[…] tion involved purification of mRNAs with poly-T oligos attached to magnetic beads, followed by double-stranded cDNA synthesis. A bases were added to the 3' ends of the cDNAs, which were then ligated with adapters for subsequent PCR amplification. The amplified cDNAs were sheared to 200 bp in length to make the sequencing library. The processed samples, 10 ng each, were run in parallel on the HiSeq 2000 platform at the Beijing Genomics Institute (BGI). Paired-end reads of 75 nt were collected, up to 1.2 gigabases (Gb) for each sample. Low-quality reads were excluded from the raw data, and the clean data sets (>1 Gb) were passed to the assembly pipeline described below.

The cleaned data of each sample were assembled using the then-latest release of Trinity (trinityrnaseq_r2012-06-08.tgz) with the command line: Trinity.pl --seqType fa --JM 40G --left A.Left --right A.Right --CPU 8. Details on the parameters may be found in the associated manual (http://trinityrnaseq.sourceforge.net/#sample_data). The resulting assemblies were labeled A, B, C, D, E, and F (Table ), corresponding to the samples of Figure . The six samples were further combined to produce a final assembly using TGICL (TGICL-2.0.tar.gz) with the parameters: -p 95 -l 50 -v 6 -O '-h 3 -k 0 -o 50 -p 95'.

We annotated the final scaffolds of the combined data against a reference database containing a total of 2961141 UniRef50 sequences (ftp.ebi.ac.uk/pub/databases/uniprot/, January 2012) using blastx with the parameters -e 1e-5 -b 10. The resulting annotations were then reorganized with our local Perl scripts (Additional file ) to extract information according to their GO (http://www.geneontology.org) and KEGG (http://www.genome.jp/kegg/) classifications. The assembly of each sample was mapped back to the final assembly with Bowtie [] to obtain the distribution of reads among the entries.
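The read distribution among entries described above can be tallied with a few lines of Python. This is only a minimal sketch, not the authors' script: it assumes a tab-delimited alignment file whose third column names the reference entry (as in Bowtie's default output and in SAM), which the text does not state. The function name and the contig names in the usage example are hypothetical.

```python
from collections import Counter


def count_reads_per_entry(alignment_lines, ref_column=2):
    """Count aligned reads per reference entry.

    Assumes one alignment per line, tab-delimited, with the reference
    name in column `ref_column` (0-based; the third column by default).
    """
    counts = Counter()
    for line in alignment_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue  # skip blank lines and SAM-style header lines
        fields = line.split("\t")
        if len(fields) <= ref_column:
            continue  # malformed line: too few columns
        ref = fields[ref_column]
        if ref == "*":
            continue  # SAM marks unaligned reads with "*"
        counts[ref] += 1
    return dict(counts)


# Usage with three made-up alignment records (hypothetical contig names).
example = [
    "read1\t+\tcontig_1\t10\tACGT\tIIII\t0\t",
    "read2\t-\tcontig_1\t55\tACGT\tIIII\t0\t",
    "read3\t+\tcontig_2\t7\tACGT\tIIII\t0\t",
]
print(count_reads_per_entry(example))
```

In practice the lines would be read from the alignment file of each sample, giving one count table per sample over the entries of the final assembly.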
When the read number for an entry was less than two, we considered the result fortuitous and took the expression as zero.

Following the original notation of the TMM (trimmed mean of M values, the log (base 2) expression ratios) method [], we applied the formula below to compute the normalization factor R:

$$
\begin{aligned}
R &= \mathrm{TMM}_k^{r}, \qquad
\log_2 \mathrm{TMM}_k^{r} = \frac{\sum_{g \in G^*} w_{gk}^{r} M_{gk}^{r}}{\sum_{g \in G^*} w_{gk}^{r}}, \\
M_{gk}^{r} &= \frac{\log_2 \left( Y_{gk} / N_k \right)}{\log_2 \left( Y_{gr} / N_r \right)}, \\
w_{gk}^{r} &= \frac{N_k - Y_{gk}}{N_k Y_{gk}} + \frac{N_r - Y_{gr}}{N_r Y_{gr}} \approx \frac{1}{Y_{gk}} + \frac{1}{Y_{gr}}
\end{aligned}
\tag{1}
$$

Here, G* is a trimmed set (de […]
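As a rough illustration of Equation (1), the Python sketch below computes a TMM factor from two lists of per-entry read counts, after applying the rule that entries with fewer than two reads count as zero. Several details are assumptions rather than the authors' procedure. M is taken as the log2 of the ratio of library-scaled counts, following the verbal definition of the M value. The weight term is used exactly as written in the formula. The trimming fractions (`trim_m`, `trim_a`) and the additional trimming by average abundance are illustrative defaults, since the excerpt breaks off before G* is defined. The function names are hypothetical.

```python
import math


def zero_low_counts(counts, min_reads=2):
    """Treat entries supported by fewer than `min_reads` reads as unexpressed."""
    return [c if c >= min_reads else 0 for c in counts]


def tmm_factor(y_k, y_r, trim_m=0.30, trim_a=0.05):
    """Return TMM of sample k relative to reference r (illustrative sketch).

    y_k, y_r: per-entry read counts in the same entry order.
    trim_m, trim_a: fraction trimmed from each tail of the M and A values
    (assumed defaults; the excerpt does not give them).
    """
    n_k, n_r = sum(y_k), sum(y_r)
    rows = []  # (M, A, w) for entries with reads in both samples
    for gk, gr in zip(y_k, y_r):
        if gk == 0 or gr == 0:
            continue  # the log ratio is undefined when either count is zero
        p_k, p_r = gk / n_k, gr / n_r
        m = math.log2(p_k / p_r)        # M value: log2 expression ratio
        a = 0.5 * math.log2(p_k * p_r)  # A value: mean log2 abundance, for trimming
        # Weight term as written in Equation (1).
        w = (n_k - gk) / (n_k * gk) + (n_r - gr) / (n_r * gr)
        rows.append((m, a, w))

    n = len(rows)

    def middle(column, trim):
        """Indices left after trimming both tails of the given column."""
        order = sorted(range(n), key=lambda i: rows[i][column])
        cut = int(n * trim)
        return set(order[cut:n - cut])

    kept = middle(0, trim_m) & middle(1, trim_a)  # the trimmed set G*
    den = sum(rows[i][2] for i in kept)
    if not kept or den == 0:
        raise ValueError("no usable entries left after trimming")
    num = sum(rows[i][2] * rows[i][0] for i in kept)
    return 2.0 ** (num / den)


# Usage: identical samples should give a factor of 1.
reference = zero_low_counts([100 + i for i in range(20)])
print(tmm_factor(reference, reference))
```

Some implementations of TMM, edgeR among them, use the reciprocal of this term as the weight; replacing `w` with `1 / w` in the sketch follows that convention.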

Pipeline specifications

Software tools: Trinity, TGICL, BLASTX, Bowtie
Databases: UniRef