Gene fusion identification software tools | RNA sequencing data analysis
Gene fusions arising from chromosomal translocations have been implicated in cancer. RNA-seq has the potential to discover rearrangements that generate functional chimeric (fusion) proteins. Recently, many methods for chimera detection have been published.
Maps short sequencer reads to reference genomes. Segemehl is a read aligner that can detect mismatches, insertions and deletions. The software implements a matching strategy based on enhanced suffix arrays (ESA): it aims to find the best-scoring seed for each suffix of a read. The tool lack, which rescues unmapped RNA-seq reads and works in conjunction with segemehl and many other frequently used split-read aligners, is distributed together with it.
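The per-suffix seed search described above can be illustrated with a much simpler structure than an ESA. The sketch below is not segemehl's implementation: it assumes a plain suffix array, exact matching only (no mismatches or indels inside a seed), and hypothetical function names. For each suffix of a read it reports the longest exact prefix match in the reference.

```python
def build_suffix_array(text):
    """Return suffix start positions sorted lexicographically (toy O(n^2 log n) build)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_len(a, b):
    """Length of the longest common prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_prefix_match(text, sa, pattern):
    """Return (length, position) of the longest prefix of `pattern` found in `text`."""
    lo, hi = 0, len(sa)
    while lo < hi:  # binary search for the insertion point of `pattern`
        mid = (lo + hi) // 2
        if text[sa[mid]:] < pattern:
            lo = mid + 1
        else:
            hi = mid
    best = (0, -1)
    # The suffix sharing the longest prefix is a neighbour of the insertion point.
    for idx in (lo - 1, lo):
        if 0 <= idx < len(sa):
            n = common_prefix_len(text[sa[idx]:], pattern)
            if n > best[0]:
                best = (n, sa[idx])
    return best


def seeds_for_read(text, sa, read, min_len=4):
    """One exact seed (read_offset, ref_pos, length) per read suffix, if long enough."""
    seeds = []
    for start in range(len(read)):
        length, pos = longest_prefix_match(text, sa, read[start:])
        if length >= min_len:
            seeds.append((start, pos, length))
    return seeds
```

With a reference such as `ACGTTGCAAGCT` and a read such as `GTTGCATT`, `seeds_for_read` returns one seed per read offset whose exact match reaches `min_len`; a real aligner would then extend and score those seeds while tolerating errors.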
Assists users in mapping reads to a reference genome. Subread offers a suite of programs for processing next-generation sequencing read data. This package includes Subread (an aligner), Subjunc (an aligner), Sublong (a long-read aligner), Subindel (a long indel detection program), featureCounts (a read quantification program), exactSNP (an SNP calling program) and other utility programs.
A computational framework for the discovery of gene fusions in paired-end RNA-seq data. It can generate synthetic gene fusions with the EricScript simulator and, with EricScript CalcStats, calculate a number of statistical measures for evaluating the performance of gene fusion detection methods.
Finds fusion events by aligning the relatively short reads from next-generation sequencers. TopHat-Fusion proceeds by running unspliced alignment software and finding paired reads that map to either side of a fusion boundary. It can discover individual reads that span a fusion event, as well as events involving novel splice variants and entirely novel genes. This tool is able to retrieve intra-chromosomal rearrangements while excluding most, but not all, read-through transcripts.
An RNA-Seq mapping software tool that includes the discovery of transcriptomic and genomic variants, such as splice junctions, chimeric junctions, SNVs and indels, in a single analysis step, using a built-in error detection method that enables high precision and sensitivity. CRAC is not a pipeline, but a single program that can replace a combination of Bowtie, SAMtools, and TopHat/TopHat-fusion, and can be viewed as an effort to simplify NGS analysis.
Infers candidate fusion transcripts by analyzing paired-end RNA-Seq data. FusionSeq is standalone software that allows users to detect known and novel fusions and to prioritize the candidates using several statistics. The application includes tools to summarize and integrate the results into a web browser for visualization, as well as a function for determining exact sequences at breakpoint junctions.
Allows users to discover fusion transcripts and the underlying complex genomic rearrangements (CGRs), for example in the breast cancer cell line HCC1954 and the primary prostate tumor sample 963. nFuse identifies two types of CGRs: closed chains of breakage and rejoining (CCBRs) and polyfusions/complex breakpoints. This tool can also be used to detect the single breakpoints underlying fusion transcripts caused by simpler rearrangements.
Allows identification of gene fusions. INTEGRATE uses both RNA-seq and whole genome sequencing (WGS) encompassing and spanning reads to focus on the discovery of expressed gene fusions. It utilizes mapped and unmapped RNA-seq reads followed by analysis of WGS reads from tumor. This tool utilizes discordant RNA-seq reads to construct a gene fusion graph connecting genes involved in a putative fusion event.
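The gene fusion graph mentioned above can be pictured as genes connected by edges weighted with discordant read support. The following is a minimal sketch under stated assumptions, not INTEGRATE's actual data structure: the graph is undirected, each discordant read pair is already reduced to a pair of gene names, and `build_fusion_graph`, `candidate_edges` and `min_reads` are hypothetical names introduced for illustration.

```python
from collections import defaultdict


def build_fusion_graph(discordant_pairs):
    """Build an undirected graph: gene -> neighbouring gene -> supporting read pairs."""
    graph = defaultdict(lambda: defaultdict(int))
    for gene_a, gene_b in discordant_pairs:
        if gene_a == gene_b:
            continue  # both mates in the same gene: not evidence for a fusion
        graph[gene_a][gene_b] += 1
        graph[gene_b][gene_a] += 1
    return graph


def candidate_edges(graph, min_reads=2):
    """List (gene_a, gene_b, support) edges with enough support, strongest first."""
    edges = []
    for gene_a, neighbours in graph.items():
        for gene_b, support in neighbours.items():
            # Report each undirected edge once, using name order to deduplicate.
            if gene_a < gene_b and support >= min_reads:
                edges.append((gene_a, gene_b, support))
    return sorted(edges, key=lambda e: (-e[2], e[0], e[1]))
```

Edges that survive the support threshold are the putative fusion events; in a combined RNA-seq and WGS workflow such as the one described above, these candidates would then be checked against the genomic reads.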
A pipeline designed for analyzing RNA-seq data from tumor samples. As a unified pipeline, TRUP is designed to sensitively and accurately dissect the complexity of the cancer transcriptome by analyzing RNA-seq data obtained from tumor tissues. The current functionalities of TRUP include: 1) identification of fusion transcripts; 2) RNA-seq quality assessment; 3) gene-read counting. The fusion detection module in TRUP combines split-read/read-pair mapping with regional de novo assembly to achieve a balance between sensitivity and precision.
Predicts transcriptomic structural variants (TSVs) from RNA-seq data. SQUID is a computational tool that divides the reference genome into segments and builds a genome segment graph from both concordant and discordant RNA-seq read alignments. It can detect both fusion-gene events and TSVs that incorporate previously non-transcribed regions into transcripts. Using an integer linear program, it rearranges the segments of the reference genome so that as many read alignments as possible are concordant with the rearranged sequence.
Uses paired-end transcriptome sequencing data to recognize fusion transcripts. SnowShoes-FTD integrates several subroutines to build template regions for polymerase chain reaction (PCR) primer design, which simplifies quick PCR validation. This tool is also able to construct the amino acid sequences of the putative in-frame fusion gene products, to ease predictions concerning the functional significance of the fusion events.
Inspects the k-mer contents of RNA-Seq paired-end reads for fusion transcript detection. ChimeRScope is an alignment-free method designed for large-scale fusion transcript data analysis. It could lead to the discovery of potentially novel and physiologically relevant drug targets for cancer treatment, or biomarkers for effective diagnosis and prognosis in precision medicine. This tool can either be set up as standalone software or installed on genomic research platforms such as a Galaxy server.
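To make the alignment-free idea concrete, here is a toy sketch of k-mer based fusion evidence. It is not ChimeRScope's scoring scheme. It assumes that each mate is assigned to the gene with the most gene-specific k-mer hits, that both mates are already oriented on the transcript strand, and that a pair supports a fusion when its mates are assigned to different genes. All function names are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Yield all overlapping k-mers of a sequence."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_index(genes, k):
    """Map each k-mer to the set of gene names whose sequence contains it."""
    index = {}
    for name, seq in genes.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(name)
    return index


def assign_gene(read, index, k):
    """Return the gene with the most gene-specific k-mer hits, or None."""
    votes = Counter()
    for km in kmers(read, k):
        hits = index.get(km, ())
        if len(hits) == 1:  # skip k-mers shared by several genes
            votes[next(iter(hits))] += 1
    return votes.most_common(1)[0][0] if votes else None


def fusion_candidates(read_pairs, index, k):
    """Count read pairs whose two mates are assigned to different genes."""
    counts = Counter()
    for mate1, mate2 in read_pairs:
        g1, g2 = assign_gene(mate1, index, k), assign_gene(mate2, index, k)
        if g1 and g2 and g1 != g2:
            counts[tuple(sorted((g1, g2)))] += 1
    return counts
```

Because no alignment is computed, the cost per read is a handful of dictionary lookups, which is what makes this style of method attractive for large datasets.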
A meta-caller algorithm that combines top-performing methods to reprioritize candidate fusion transcripts with high confidence, so that they can be followed by experimental validation. Top-performing methods likely have complementary advantages in accurately detecting different types of fusion events. First, fusion events detected by at least a certain number of tools are selected. Next, the fusion events detected by each method are ranked by the number of supporting reads. Finally, rank sums of the selected fusion events are calculated and the fusion events are reprioritized accordingly.
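The three steps above (select by tool agreement, rank within each tool, sum the ranks) can be sketched as follows. Several details are assumptions not stated in the description: an event that a tool did not report receives a penalty rank one worse than that tool's last call, ties are broken alphabetically, and the names `rank_sum_meta_caller` and `min_tools` are hypothetical.

```python
from collections import defaultdict


def rank_sum_meta_caller(calls, min_tools=2):
    """Reprioritize fusion events.

    calls: dict mapping tool name -> dict mapping fusion event -> supporting reads.
    Returns the selected events ordered by increasing rank sum (best first).
    """
    # Step 1: keep events reported by at least `min_tools` tools.
    n_tools = defaultdict(int)
    for fusions in calls.values():
        for fusion in fusions:
            n_tools[fusion] += 1
    selected = {f for f, n in n_tools.items() if n >= min_tools}

    # Step 2: rank each tool's calls by supporting reads (rank 1 = most reads).
    rank_sums = defaultdict(int)
    for fusions in calls.values():
        ordered = sorted(fusions, key=lambda f: (-fusions[f], f))
        ranks = {f: i + 1 for i, f in enumerate(ordered)}
        penalty = len(ordered) + 1  # assumed rank for events this tool missed
        # Step 3: accumulate the rank sum of every selected event.
        for fusion in selected:
            rank_sums[fusion] += ranks.get(fusion, penalty)

    return sorted(selected, key=lambda f: (rank_sums[f], f))
```

A lower rank sum means an event was both widely detected and well supported across tools, which is why the output is sorted in ascending order.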
An innovative hybrid sequencing approach to detect fusion genes, determine fusion sites, and identify and quantify fusion isoforms. IDP-fusion is the first method to study gene fusion events by integrating third-generation sequencing long reads and second-generation sequencing short reads.
A highly accurate method for detecting intragenic and intergenic non-co-linear (NCL) transcripts. NCLscan utilizes a stepwise alignment strategy to almost completely eliminate false calls (>98% precision) without sacrificing true positives. NCLscan outperforms 18 other publicly available tools (including fusion- and circular-RNA-detecting tools) in terms of sensitivity and precision, regardless of the generation strategy of the simulated dataset, the type of intragenic or intergenic NCL event, the read depth of coverage, the read length or the expression level of the NCL transcript. NCLscan promises to facilitate the comprehensive characterization of various types of NCL transcripts on a transcriptome-wide scale.
A package that uses paired reads mapping to different genes (bridge reads) to build a data set of candidate fusion events. FusionAnalyser is a graphical, event-driven tool that makes use of paired-end short-read transcriptome sequences in human cancer to first detect and annotate fusion rearrangements and then identify the potential driver events.
Identifies transposon insertions from splicing events between endogenous genes and the transposon. IM-Fusion identifies exactly which gene(s) are affected by a transposon insertion and how the transposon is incorporated into the resulting gene transcript. This approach will significantly enhance the accuracy of cancer gene discovery in forward genetic screens and prioritization of the identified candidate cancer genes for functional validation studies.
Allows users to detect gene fusions in human cancers in paired-end RNA sequencing (RNA-Seq) datasets. chimerascan is open source software, developed in Python, that aims to support the investigation of various RNA-Seq data collections, including those containing long paired-end reads. The application includes functions to process ambiguously mapping reads and to identify reads spanning a fusion junction. Moreover, results can be summarized in an HTML report.