Computational protocol: Investigating host dependence of xylose utilization in recombinant Saccharomyces cerevisiae strains using RNA-seq analysis

[…] The sequence files in FASTQ format were analyzed using the Galaxy software. Briefly, the files were groomed to ensure that the quality-score lines use Sanger-scaled quality values with ASCII offset 33. The paired-end RNA-seq reads were mapped with TopHat, using S. cerevisiae (sacCer3, UCSC) as the reference genome. The transcripts were assembled and their FPKM (fragments per kilobase of exon per million fragments mapped) values were estimated using Cufflinks with the default parameter settings, and the assembled transcripts were then merged using Cuffmerge. The assembled transcripts of the control group and the experimental group were compared using Cuffdiff, with the p-value cutoff set to 0.05. Transcripts whose FPKM values differed significantly between the control group and the experimental group were selected and searched in the genome browser of the BioCyc database to identify the specific genes they contain. Cluster analysis of the transcripts was performed using ‘clustergram’ in MATLAB (MathWorks, Natick, MA, USA). The gene ontology analysis was performed using the Generic GO Term Mapper developed at Princeton University. […]
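
Two quantities in the protocol above can be made concrete with a short sketch: the re-encoding of quality scores to ASCII offset 33, and the definition of FPKM. The code below is illustrative only and is not the Galaxy or Cufflinks implementation. The function names and the default source offset of 64 are assumptions introduced here; the protocol does not state how the input files were originally encoded. Cufflinks estimates FPKM with a statistical model that accounts for reads shared between isoforms, so the second function shows only the plain definition spelled out in the acronym.

```python
def to_sanger_quality(quality_line: str, source_offset: int = 64) -> str:
    """Re-encode one FASTQ quality line to Sanger scale (ASCII offset 33).

    Assumes the input holds Phred-scaled scores stored with `source_offset`
    (64 is an assumed default, not a value taken from the protocol).
    Solexa-scaled scores would need an additional score conversion.
    """
    phred_scores = [ord(ch) - source_offset for ch in quality_line]
    if any(score < 0 for score in phred_scores):
        raise ValueError("quality line does not match the assumed offset")
    return "".join(chr(score + 33) for score in phred_scores)


def fpkm(fragments: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon per million fragments mapped."""
    kilobases = exon_length_bp / 1_000
    millions_mapped = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions_mapped)


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the functions.
    print(to_sanger_quality("hhh"))
    print(fpkm(fragments=500, exon_length_bp=2_000, total_mapped_fragments=10_000_000))
```

The normalization by both transcript length and sequencing depth is what allows FPKM values to be compared across transcripts and across the control and experimental libraries.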

Pipeline specifications

Software tools: Galaxy, TopHat, Cufflinks
Application: RNA-seq analysis
Organisms: Saccharomyces cerevisiae
Diseases: Substance-Related Disorders
Chemicals: Ethanol, Xylose