Computational protocol: Refining the Ambush Hypothesis: Evidence That GC- and AT-Rich Bacteria Employ Different Frameshift Defence Strategies

Protocol publication

[…] All analyses were performed using custom Python 3.6 scripts with the standard NumPy 1.8.0, SciPy 0.13, and Biopython 1.66 libraries. Statistical analyses and data visualizations were performed using R 3.3.3. Scripts can be found at https://github.com/la466/oscs. [...]

Bacterial codon use is nonrandom. Highly expressed genes often prefer codons that are decoded by the most abundant tRNAs. The Codon Adaptation Index (CAI) quantifies codon bias, with high CAI values correlating with high expression in several organisms, including E. coli. CAI is therefore used as a proxy for gene expression.

For each genome, a reference set of 20 genes drawn from rplA/1–rplF/6, rplI/9–rplU/21 and rpsB/2–rpsU/21 was identified as highly expressed. The first 30 nucleotides were removed from each CDS (the 5′ end of the CDS is biased to facilitate ribosome binding), and the first half of each CDS in this highly expressed set was used to calculate CAI indices using CodonW v1.4.4 (https://sourceforge.net/projects/codonw/; last accessed March 22, 2016) with the arguments “-coa_cu -coa_num 100%” to include all sequences in calculating the indices. CAI values for the first half (minus the first 30 nucleotides) of each remaining CDS in the genome were calculated with the “-all_indices” argument using the generated fop_file, cai_file, and cbi_file. OSC densities were subsequently calculated using the second half of each CDS, so that the two codon-usage-dependent measures were never computed from the same sequence, maximizing the independence of the data. […]
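The sequence partitioning described above can be sketched in Python as follows. This is an illustrative sketch only, not the authors' scripts and not CodonW's implementation. It rests on three assumptions that the text does not state: the midpoint is taken at a codon boundary of the full CDS, weights are the standard relative-adaptiveness values (codon count divided by the count of the most used synonymous codon in the reference set), and unseen codons receive a small weight floor. The function names and the `floor` parameter are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code in TCAG order (codon-to-amino-acid assignments
# are the same in the bacterial table 11).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_cds(cds, skip_nt=30):
    """Return (cai_part, osc_part) for one CDS.

    cai_part: first half of the CDS minus the first `skip_nt` nucleotides.
    osc_part: second half of the CDS.
    Assumption: the midpoint is rounded down to a codon boundary.
    """
    n_codons = len(cds) // 3
    mid = (n_codons // 2) * 3
    return cds[skip_nt:mid], cds[mid:n_codons * 3]


def to_codons(seq):
    """Split an in-frame sequence into complete codons."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def relative_adaptiveness(reference_seqs, floor=0.01):
    """Codon weights from a reference set of highly expressed sequences.

    Assumption: `floor` is a hypothetical minimum weight for codons that
    are absent from the reference set.
    """
    counts = Counter(
        c for s in reference_seqs for c in to_codons(s) if c in CODON_TO_AA
    )
    families = defaultdict(list)
    for codon, aa in CODON_TO_AA.items():
        families[aa].append(codon)

    weights = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue  # stop codons, Met and Trp carry no codon-choice signal
        top = max(counts[c] for c in family)
        if top == 0:
            continue  # amino acid never seen in the reference set
        for c in family:
            weights[c] = max(counts[c] / top, floor)
    return weights


def cai(seq, weights):
    """Geometric mean of the codon weights over the scored codons of `seq`."""
    logs = [math.log(weights[c]) for c in to_codons(seq) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")
```

In use, `split_cds` would be applied to every CDS. The first element of each pair from the reference genes feeds `relative_adaptiveness`, the first element from every other gene is scored with `cai`, and the second element is kept aside for the OSC density calculation, so no sequence contributes to both measures.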

Pipeline specifications

Software tools: NumPy, Biopython, CodonW
Applications: Synthetic biology, Miscellaneous
Organisms: Bacteria
Diseases: Ataxia Telangiectasia
Chemicals: Amino Acids, Nucleotides