Translated open reading frame detection software tools | Ribosome profiling data analysis
Ribosome profiling via high-throughput sequencing (ribo-seq) is a promising technique for characterizing the occupancy of ribosomes on messenger RNA (mRNA) at nucleotide resolution. The ribosome translates mRNA into protein, so information about its occupancy offers a detailed view of ribosome density and position, which can be used, among other things, to discover new translated open reading frames (ORFs).
A rigorous statistical approach that identifies translated regions on the basis of the characteristic three-nucleotide periodicity of Ribo-seq data. We used RiboTaper with deep Ribo-seq data from HEK293 cells to derive an extensive map of translation that covered open reading frame (ORF) annotations for more than 11,000 protein-coding genes. We also found distinct ribosomal signatures for several hundred upstream ORFs and ORFs in annotated noncoding genes (ncORFs). Mass spectrometry data confirmed that RiboTaper achieved excellent coverage of the cellular proteome. Although dozens of novel peptide products were validated in this manner, few of the currently annotated long noncoding RNAs appeared to encode stable polypeptides. RiboTaper is a powerful method for comprehensive de novo identification of actively used ORFs from Ribo-seq data.
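The three-nucleotide periodicity that this and several other tools rely on can be illustrated with a plain discrete Fourier transform. The sketch below is not RiboTaper's multitaper procedure; it assumes a hypothetical list `psite_counts` holding one P-site count per nucleotide of a candidate ORF, and reports what share of the non-constant spectral power sits at a period of exactly 3 nt.

```python
import cmath
import math


def period3_power_fraction(psite_counts):
    """Share of non-DC spectral power found at a period of 3 nt.

    psite_counts is a hypothetical per-nucleotide list of P-site counts.
    A plain DFT is used here for illustration only.
    """
    n = len(psite_counts) - len(psite_counts) % 3  # trim to whole codons
    if n < 6:
        return 0.0
    signal = psite_counts[:n]

    def power(k):
        # Squared magnitude of the k-th DFT coefficient.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        return abs(coeff) ** 2

    # Positive frequencies only; k = 0 (the mean level) is excluded.
    powers = {k: power(k) for k in range(1, n // 2 + 1)}
    total = sum(powers.values())
    dc_power = float(sum(signal)) ** 2
    # Guard against a flat signal, where only rounding noise remains.
    if total <= 1e-9 * max(1.0, dc_power):
        return 0.0
    # With n divisible by 3, index n // 3 is exactly one cycle per codon.
    return powers[n // 3] / total


translated_like = [10, 1, 1] * 20   # strong frame preference
period_two = [5, 1] * 30            # periodic, but not codon-periodic
print(period3_power_fraction(translated_like))
print(period3_power_fraction(period_two))
```

A value near 1 means the signal is dominated by codon periodicity; real tools add a significance test on top of such a statistic rather than thresholding it directly.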
Allows users to detect open reading frames (ORFs) from ribosome profiling experiments within a complete data analysis pipeline. PRICE includes a statistical model of the ribo-seq experiment and infers actively translated codons by maximum likelihood across all reads. The pipeline implements all steps necessary to identify and score codons and ORFs from ribo-seq experiments. It also includes modules to pre-process and map sequencing reads.
Allows users to identify and quantify translation from coding sequences (CDSs) regardless of start codon. ORF-RATER assumes that translated ORFs display a pattern of ribosome occupancy that mimics that of annotated genes. The tool is based on linear regression, which naturally integrates multiple lines of evidence simultaneously. It also enables each open reading frame (ORF) to be evaluated in the context of nearby and overlapping ORFs.
Represents an unsupervised Bayesian approach. Rp-Bp allows users to identify translated open reading frames (ORFs) based on ribosome profiles. It detects translated ORFs whose profiles exhibit the periodic footprint pattern expected of translation. The tool works in two steps: (1) it constructs a profile for each ORF from ribo-seq reads, and (2) it predicts ORF translation from the profiles using a variant of a two-component mixture model.
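Step (1), building a per-ORF profile from reads, can be sketched as follows. This is an illustration under stated assumptions rather than Rp-Bp's own code: reads are given as hypothetical `(five_prime_pos, length)` pairs in 0-based transcript coordinates, and `offsets` is an assumed table mapping each usable read length to the distance from the read's 5' end to its P-site.

```python
def build_psite_profile(reads, offsets, orf_start, orf_end):
    """Count P-sites per nucleotide inside the half-open span [orf_start, orf_end).

    reads   : iterable of (five_prime_pos, length) tuples (hypothetical input format)
    offsets : dict mapping read length -> P-site offset (assumed, calibrated elsewhere)
    """
    profile = [0] * (orf_end - orf_start)
    for five_prime_pos, length in reads:
        offset = offsets.get(length)
        if offset is None:
            continue  # read length without a calibrated offset is ignored
        psite = five_prime_pos + offset
        if orf_start <= psite < orf_end:
            profile[psite - orf_start] += 1
    return profile


# Illustrative values only; real offsets are estimated from the data.
example_offsets = {28: 12, 29: 13}
example_reads = [(0, 28), (3, 28), (3, 29), (100, 28), (5, 35)]
print(build_psite_profile(example_reads, example_offsets, 12, 21))
```

Dropping read lengths that lack an offset keeps non-periodic fragments out of the profile, which is the reason most pipelines calibrate offsets per read length before any ORF-level scoring.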
Deduces intrinsic characteristics from the data and can thus be applied to Ribo-seq data targeting elongating ribosomes. REPARATION is an application that trains an ensemble classifier to learn Ribo-seq patterns from a set of confident protein-coding open reading frames (ORFs) for the de novo delineation of translated ORFs in bacterial genomes. It is also able to identify a multitude of putative coding ORFs corresponding to previously annotated protein-coding regions, alongside ORFs residing in so-called non-protein-coding regions.
Allows the assessment of non-canonical translation events such as overlapping open reading frames (ORFs). RiboCode is a statistical method for the de novo annotation of the full translatome by quantitatively assessing 3-nt periodicity. Its workflow is composed of 3 steps: 1) preparing the transcriptome for the search of candidate ORFs, 2) determining the length range of the ribosome-protected RNA fragment (RPF) reads that are most likely to have resulted from active translation, and identifying the P-site positions in these reads, and 3) assessing active translation via statistical comparisons among the 3 vectors representing the RPF read densities in and out of the reading frame along each candidate ORF.
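Step 3 can be illustrated by splitting the P-site counts of a candidate ORF into three per-frame vectors and testing whether the in-frame vector dominates the other two. The sketch below does not reproduce RiboCode's own test; it substitutes a simple one-sided sign test so that it runs with the standard library alone, and `psite_counts` is a hypothetical per-nucleotide input.

```python
from math import comb


def split_frames(psite_counts):
    """Return three vectors with the counts at codon positions 0, 1 and 2."""
    n = len(psite_counts) - len(psite_counts) % 3  # trim to whole codons
    return psite_counts[0:n:3], psite_counts[1:n:3], psite_counts[2:n:3]


def sign_test_greater(a, b):
    """One-sided sign test p-value for 'a tends to exceed b' (ties dropped)."""
    wins = sum(1 for x, y in zip(a, b) if x > y)
    losses = sum(1 for x, y in zip(a, b) if x < y)
    n = wins + losses
    if n == 0:
        return 1.0
    # Probability of at least `wins` successes in n fair coin flips.
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


def frame_pvalues(psite_counts):
    """P-values for frame 0 versus frame 1 and frame 0 versus frame 2."""
    f0, f1, f2 = split_frames(psite_counts)
    return sign_test_greater(f0, f1), sign_test_greater(f0, f2)


print(frame_pvalues([9, 1, 2] * 10))  # in-frame signal dominates
print(frame_pvalues([3] * 30))        # no frame preference
```

A paired rank-based test is generally more powerful than the sign test because it also uses the size of the per-codon differences; the sign test is used here only to keep the example dependency-free.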
Allows users to detect actively translated small open reading frames (smORFs). ORFscore quantifies the biased distribution of ribosome-protected fragments (RPFs) toward the first frame of a given conserved coding sequence (CDS). It counts the RPFs in each frame and determines whether they are uniformly distributed or preferentially accumulate in one frame.
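The frame-bias idea can be written as a chi-squared-style statistic on the three per-frame RPF counts. The formula in this sketch, a log2-transformed deviation from the uniform expectation that is made negative when the first frame is not the dominant one, is stated here as an assumption about how such a score is typically defined, not as a verified transcription of the tool's code.

```python
import math


def orfscore(frame_counts):
    """Frame-bias score from RPF counts in frames 1, 2 and 3.

    Assumed form: log2(sum((F_i - mean)^2 / mean) + 1), with the sign
    flipped when frame 1 holds fewer reads than frame 2 or frame 3.
    """
    f1, f2, f3 = frame_counts
    mean = (f1 + f2 + f3) / 3
    if mean == 0:
        return 0.0  # no reads, no evidence either way
    deviation = sum((f - mean) ** 2 / mean for f in frame_counts)
    score = math.log2(deviation + 1)
    if f1 < f2 or f1 < f3:
        score = -score
    return score


print(orfscore((90, 5, 5)))    # reads concentrated in frame 1
print(orfscore((10, 10, 10)))  # uniform across frames
print(orfscore((5, 90, 5)))    # reads concentrated in an off frame
```

Uniform counts give a score of zero, while a strong bias gives a large magnitude whose sign tells whether the bias favors the annotated frame.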
Measures the magnitude of disagreement between two fragment length distributions, that of the footprints on a transcript of interest and that of the footprints on annotated coding sequences, with lower scores reflecting higher similarity. FLOSS is able to identify the effect of the small number of lncRNAs that yield substantial non-ribosome-associated fragments. It can analyze and distinguish individual annotated coding sequences and non-coding transcripts. This method permits users to predict the results of ribosome affinity purification, which separates true footprints from background RNA by physical rather than computational means.
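A disagreement score of this kind can be sketched as half the summed absolute difference between two normalized read-length histograms, one for the transcript under study and one for a reference set of coding sequences. The exact formula and the 26 to 34 nt length window used below are assumptions made for illustration, and both function names are hypothetical.

```python
def length_distribution(lengths, min_len=26, max_len=34):
    """Normalized histogram of read lengths within [min_len, max_len]."""
    counts = {length: 0 for length in range(min_len, max_len + 1)}
    for length in lengths:
        if length in counts:
            counts[length] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads within the length window")
    return {length: c / total for length, c in counts.items()}


def floss(transcript_lengths, reference_lengths, min_len=26, max_len=34):
    """Half the L1 distance between two length distributions (0 = identical)."""
    f = length_distribution(transcript_lengths, min_len, max_len)
    g = length_distribution(reference_lengths, min_len, max_len)
    return 0.5 * sum(abs(f[length] - g[length]) for length in f)


reference = [28] * 50 + [29] * 30 + [30] * 20
print(floss(reference, reference))     # identical distributions
print(floss([28] * 10, [31] * 10))     # completely disjoint distributions
```

The score ranges from 0 for identical histograms to 1 for histograms with no overlap, so transcripts whose fragments do not look like ribosome footprints stand out with high values.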
Allows users to identify translated coding sequences (CDSs) by leveraging both the total abundance and the codon periodicity structure of ribosome-protected RNA fragments (RPFs). riboHMM uses ribosome footprint data to deduce translated sequences. This method is particularly useful for identifying CDSs in the transcriptome of human lymphoblastoid cell lines (LCLs) or for detecting high-confidence translated CDSs.
A spectral coherence-based classification algorithm to identify regions of active translation using aligned ribosome profiling sequence reads. SPECtre leverages a key feature of ribosome profiling: sequence reads aligned to a reference transcriptome track the tri-nucleotide periodicity characteristic of transcripts as they are translated by ribosomes. The end user can customize the step size between windows, the size of the windows analyzed, and the false discovery rate and abundance cutoffs used to differentiate translated from non-translated distributions. A comparison of SPECtre against current methods on existing and new data shows a marked improvement in accuracy for detecting active translation, with overall high sensitivity at a low false discovery rate.
Offers a collection of tools for Ribo-seq data analysis. RiboProfiling provides a straightforward R implementation of a ribosome profiling pipeline, from BAM input through P-site calibration to quantification of reads on sequence features and codon coverage. The package's graphical features offer quality assessment and result representation across the analyses. Following the overview of Ribo-seq experiments with RiboProfiling, the output tables can easily be integrated into more specialized downstream analyses.
Assists users in the identification of functional upstream open reading frames (uORFs). uORF-Tools is a program that uses ribosome profiling data to determine experiment-specific translation-regulatory uORFs. It supports the analysis of ribosome profiling data as well as the de novo annotation of actively translated uORFs. Furthermore, it can be used to detect changes in the translation of uORFs and their associated main ORFs.
Exploits the subcodon ribosome-protected fragment (RPF) periodicity signature to identify mRNA transcripts with putative reading frame transitions. CSCPD allows dually coded regions to be detected only if the alternative frame has RPF coverage comparable to, or higher than, that of the standard frame. The tool's predictive power depends on the number of RPFs that can be aligned to an mRNA sequence.
Allows users to analyze ribosome profiling data and identify translated open reading frames (ORFs). RibORF detects ORFs by combining the alignment of ribosomal A-sites, 3-nt periodicity, and uniformity across codons. It is able to distinguish in-frame ORFs from overlapping off-frame ORFs, and it recognizes reads arising from RNAs that are not associated with ribosomes.
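Uniformity across codons can be quantified in several ways. One plausible choice, used here purely as an assumption and not as RibORF's documented definition, is a Shannon entropy of the per-codon read distribution normalized by its maximum possible value. The input `site_counts` is a hypothetical per-nucleotide list of assigned ribosomal site counts along the candidate ORF.

```python
import math


def codon_uniformity(site_counts):
    """Normalized Shannon entropy of reads across codons (1 = perfectly even).

    Low values indicate reads piled onto a few codons, which is more
    typical of non-ribosomal fragments than of a translating ribosome.
    """
    n = len(site_counts) - len(site_counts) % 3  # trim to whole codons
    codon_totals = [sum(site_counts[i:i + 3]) for i in range(0, n, 3)]
    total = sum(codon_totals)
    if total == 0 or len(codon_totals) < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total)
                   for c in codon_totals if c > 0)
    return entropy / math.log(len(codon_totals))


evenly_covered = [1, 0, 0] * 10        # one read on every codon
single_pileup = [30, 0, 0] + [0] * 27  # all reads on the first codon
print(codon_uniformity(evenly_covered))
print(codon_uniformity(single_pileup))
```

Combined with a frame-periodicity measure, such a uniformity value helps separate ORFs covered along their whole length from transcripts where a single localized peak accounts for most reads.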