Computational protocol: Transcription-coupled DNA supercoiling dictates the chromosomal arrangement of bacterial genes

Protocol publication

[…] To investigate the conservation of TCDS effects, we focused on the clade of Enterobacteriaceae. Species in this clade have not diverged too much, so a large set of conserved orthologous genes can be investigated. Orthologous groups were determined using the software proteinortho (v5.07) with standard settings (E-value = 1e-5, blastp, α = 0.5, coverage = 50%) and the synteny option to separate co-orthologs. The set of Enterobacteriaceae was taken from the DoriC database, with each species present once in the dataset. Protein sequences and annotation files were obtained from the NCBI ftp site (ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/). The resulting core genome of all Enterobacteriaceae listed in both NCBI and DoriC comprised more than half of the E. coli genome (see Supplementary list). Restricting the analysis to the core genes ensured that the same number of genes was investigated for every species. From the orthologous genes we determined triads consisting of a target gene and the two neighbors with the strongest positive and the strongest negative TCDS contribution to the target gene TCDS level. To investigate conservation of TCDS-dependent gene regulation, we selected the triads in which, in E. coli, TCDS and temporal expression of the target gene showed either a strongly positive correlation (Pearson coefficient ≥ 0.9) or a strongly negative correlation (Pearson coefficient ≤ −0.9), and in which the target gene had at least a moderate average TCDS bias in time (|TCDS| ≥ 0.1). In addition, only triads in which all three genes remained within 10 kb of each other were included in the statistic. This ensured a plausible TCDS impact on the target gene. For each conserved gene triad we derived the conservation frequency of the TCDS sign (positive or negative TCDS) on the target gene; that is, we tested whether the qualitative impact of the neighboring genes on the target remained the same. A positive TCDS impact can be realized by an upstream tandem or a downstream convergent orientation of the neighbor toward the target.
In contrast, a neighbor with a divergent upstream or tandem downstream orientation makes a negative TCDS contribution to the target gene TCDS level. Subsequently, for each triad we separately counted, within the Enterobacteriaceae, the number of occurrences in which the impact of each of the two other genes on the target gene was conserved. The two groups of positive and negative TCDS contribution within the triad were further split by positive and negative impact on target transcription in E. coli, yielding four groups. The type of impact on target transcription depends on the type of TCDS impact on the target gene and on the response of target gene expression to the TCDS level (Figure ). For instance, the impact on transcription was called positive if a gene with a positive impact on target gene TCDS in E. coli (DNA relaxation) acted on a target gene that prefers relaxation, that is, one showing a positive correlation of TCDS and transcription. If the majority (>50%) of orthologs showed conservation of the TCDS impact, we counted the gene as conserved for the respective group, otherwise as not conserved. Summing these counts over all investigated gene triads, we obtained a statistic of the number of conserved TCDS effects on orthologous genes in Enterobacteriaceae. […]
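The selection and counting rules described above can be illustrated with a minimal Python sketch. It is not the code used in the study. The `Gene` record, all function names, and the coordinate conventions are hypothetical. Three readings of the text are assumptions: the 10 kb criterion is taken as the total span of the triad on a linear coordinate axis (chromosome circularity is ignored), the TCDS bias is taken as the absolute value of the time-averaged TCDS, and upstream or downstream position is decided from gene midpoints relative to the transcription direction of the target.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene record (not from the original pipeline)."""
    name: str
    start: int   # leftmost coordinate in bp
    end: int     # rightmost coordinate in bp
    strand: int  # +1 (forward) or -1 (reverse)


def passes_ecoli_filter(pearson_r: float, mean_tcds: float) -> bool:
    """Thresholds from the text: |Pearson r| >= 0.9 and |TCDS| >= 0.1.

    Assumption: mean_tcds is the time-averaged TCDS of the target gene.
    """
    return abs(pearson_r) >= 0.9 and abs(mean_tcds) >= 0.1


def within_10kb(triad: Sequence[Gene], max_span: int = 10_000) -> bool:
    """Assumption: 'within 10 kb' means the whole triad spans <= 10,000 bp."""
    span = max(g.end for g in triad) - min(g.start for g in triad)
    return span <= max_span


def tcds_sign(target: Gene, neighbor: Gene) -> int:
    """Sign of the neighbor's TCDS contribution to the target.

    +1: upstream tandem or downstream convergent neighbor.
    -1: upstream divergent or downstream tandem neighbor.
    """
    neighbor_is_left = (neighbor.start + neighbor.end) < (target.start + target.end)
    # 'Upstream' is defined relative to the target's transcription direction.
    upstream = neighbor_is_left if target.strand == 1 else not neighbor_is_left
    same_strand = neighbor.strand == target.strand
    if upstream:
        return 1 if same_strand else -1   # tandem vs. divergent
    return -1 if same_strand else 1       # tandem vs. convergent


def is_conserved(ecoli_sign: int, ortholog_signs: Sequence[int]) -> bool:
    """Conserved if a strict majority (>50%) of orthologs keep the E. coli sign."""
    if not ortholog_signs:
        return False
    matches = sum(1 for s in ortholog_signs if s == ecoli_sign)
    return matches / len(ortholog_signs) > 0.5
```

In such a sketch, `tcds_sign` would be evaluated once for E. coli and once for each orthologous triad in the other species, and `is_conserved` would then be applied separately to the positive and the negative neighbor of each target gene.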

Pipeline specifications

Software tools: Proteinortho, BLASTP
Applications: Genome annotation, Phylogenetics
Organisms: Escherichia coli, Streptococcus pneumoniae