Computational protocol: Structural and functional adaptation of Haloferax volcanii TFEα/β

Protocol publication

[…] Sequencing reads were trimmed to remove adaptor sequences using Trimmomatic, and the quality of the sequencing data was assessed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). Reads were mapped to the Hvo genome using TopHat with the parameters --no-novel-juncs --no-mixed --library-type fr-firststrand; reads mapping to tRNA and rRNA genes and reads that did not map uniquely were filtered out. Principal component analysis (PCA) was used to assess biological replicate concordance. RPKM values (reads per kilobase per million mapped reads) were calculated using HTSeq-count. Data were analyzed using DESeq2 to identify significantly differentially expressed genes (Padj < 0.01).

To calculate transcript abundance, we combined an available transcriptome map of Hvo with recently published transcription start site (TSS) mapping data to create an updated map of the Hvo mRNA transcriptome. For 1331 transcription units (TUs), the 5′-end coordinates were adjusted based on mapped primary TSSs. For these cases, the median adjustment was 5 bp, whereas in 117 cases an adjustment of >100 nt was required, in the vast majority leading to an extension of the TU. Six annotated polycistronic TUs harboured additional internal primary TSSs; in these cases the TU annotation was revised and shortened to the primary TSS position. Sixty-two protein-encoding genes with assigned TSSs were not included in the existing TU map. This updated map was used to estimate expression levels using RSEM in paired-end mode with --strandedness reverse for libraries prepared with the dUTP method.

The same TSS mapping data were used for promoter analysis. DNA sequences were extracted using BEDTools getfasta for positions -50 to +10 relative to the TSS and analyzed using the MEME software version 4.11.4 for motif identification (0 or 1 occurrence per sequence, 4–16 bp width, searching the given strand only). A 0-order background model based on the composition of the combined H. volcanii DS2 genome was employed. Aligned promoter sequences were depicted using WebLogo3 with correction for the genome composition. Synonymous codon usage data for Hvo genes have been described previously. AT% of genes was calculated using BEDTools nuc. Data were plotted using R (http://www.R-project.org) with the LSD package v3.0. […]
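
The RPKM normalisation mentioned above can be illustrated with a minimal sketch. It assumes raw per-gene read counts (for example, as written by htseq-count) and the standard definition RPKM = count × 10^9 / (gene length in bp × total mapped reads). The function name `rpkm` and the example numbers are hypothetical and are not taken from the protocol.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads.

    Assumption: read_count is a raw count for one gene and
    total_mapped_reads is the library-wide number of mapped reads
    that survived filtering (the protocol does not spell this out).
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    # count / (length in kb) / (library size in millions)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 1 kb gene in a library of 2 million reads.
example_value = rpkm(500, 1000, 2_000_000)
print(example_value)
```

Because both the gene length and the library size appear in the denominator, the value is comparable across genes of different lengths and across libraries of different depths, which is the purpose of this normalisation.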
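
The promoter window used for motif searching (-50 to +10 relative to the TSS) has to be converted into strand-aware BED intervals before sequences can be extracted with BEDTools getfasta. The sketch below shows one way to do that conversion. It assumes that TSS coordinates are 1-based, that the TSS itself is position +1 with no position 0 (so the window is 60 nt long), and that the output is 0-based, half-open BED6. None of these conventions is stated in the protocol, and the function name `promoter_window` and the contig name `chr_example` are hypothetical.

```python
def promoter_window(chrom, tss, strand, upstream=50, downstream=10,
                    chrom_length=None):
    """Return a BED6 tuple covering -upstream..+downstream around a TSS.

    Assumptions (not from the protocol): tss is 1-based, the TSS is
    position +1, and BED coordinates are 0-based, half-open.
    """
    if strand == "+":
        start = tss - 1 - upstream    # 0-based start of position -upstream
        end = tss - 1 + downstream    # exclusive end after position +downstream
    elif strand == "-":
        start = tss - downstream      # position +downstream lies to the left
        end = tss + upstream          # position -upstream lies to the right
    else:
        raise ValueError("strand must be '+' or '-'")
    start = max(start, 0)             # clamp at the contig start
    if chrom_length is not None:
        end = min(end, chrom_length)  # clamp at the contig end
    name = f"{chrom}:{tss}({strand})"
    return (chrom, start, end, name, 0, strand)


# Hypothetical TSSs on a placeholder contig; print tab-separated BED lines.
for tss, strand in [(100, "+"), (100, "-")]:
    print("\t".join(str(field) for field in promoter_window("chr_example", tss, strand)))
```

Writing these lines to a file and passing it to getfasta with the -s option would return the reverse complement for minus-strand entries, so that all extracted sequences read 5′ to 3′ in the direction of transcription. This matters when the motif search is restricted to the given strand.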

Pipeline specifications

Software tools: Trimmomatic, FastQC, TopHat, HTSeq, DESeq2, RSEM, BEDTools, WebLogo
Applications: RNA-seq analysis, Genome data visualization
Organisms: Haloferax volcanii
Chemicals: Zinc