Transcriptome annotation software tools | RNA sequencing data analysis
RNA-Seq has effectively portrayed the transcriptional complexity of eukaryotes, demonstrating the widespread transcription of lncRNAs in a diverse group of organisms. However, annotation of de novo transcriptomes remains a complex task that requires efficient management of large datasets.
Permits functional annotation, management, and data mining of novel sequence data. Blast2GO is based on a common controlled vocabulary schema, the Gene Ontology (GO). It takes into consideration similarity, the extent of the homology, the database of choice, the GO hierarchy, and the quality of the original annotations. This tool is suitable for plant genomics research. It generates functional annotations and helps users assess the functional meaning of their experimental results.
Offers a component-based architecture that allows users to add new functionality in the form of plug-in modules. geWorkbench includes many computational resources that remove many steps otherwise requiring programming skills. It simplifies the use of multi-module analysis pipelines. The tool’s modules consist of wrapped versions of pre-existing third-party software tools.
Offers a platform for the detection of genomic features in transcripts from next-generation RNA sequencing data. RNA-eXpress provides a graphical user interface (GUI) dedicated to the identification of splice variants, transcription start sites, UTRs, introns, and non-coding RNA features. Users can run feature annotation, comparison, sequence extraction, and read counting. The application can supply results as summary statistics, histograms, or pie charts.
An easy-to-use application for microarray, RNA-Seq and metabolomics analysis. For splicing-sensitive platforms (RNA-Seq or Affymetrix Exon, Gene and Junction arrays), AltAnalyze assesses alternative exon (known and novel) expression along with protein isoforms, domain composition and microRNA targeting. Beyond splicing-sensitive platforms, AltAnalyze provides comprehensive methods for the analysis of other data (RMA summarization, batch-effect removal, QC, statistics, annotation, clustering, network creation, lineage characterization, alternative exon visualization, gene-set enrichment and more).
Allows users to process transcriptomes from animals, plants, fungi, and bacteria. TRAPID is a web-based, high-throughput analysis application that uses predefined reference databases. It offers an online interface to characterize assembled transcript sequences and to initiate comparative genomics analyses. It enables scientists with a biological background to explore their non-model transcriptome data. The analysis process includes automatically identifying coding sequences in transcripts, correcting frameshifts, assigning coding sequences to multi-species gene families, performing transcript quality control, and generating functional annotations.
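As a rough illustration of the first step in that list, the identification of coding sequences, the sketch below finds the longest open reading frame in a transcript. It is not TRAPID's implementation: the function name `longest_orf`, the restriction to the three forward-strand frames, and the requirement of an ATG start and an in-frame stop codon are all assumptions made to keep the example short.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_orf(seq):
    """Return (start, end, frame) of the longest ATG..stop ORF on the forward strand.

    Coordinates are 0-based and end-exclusive, and the span includes the stop codon.
    Returns None when no complete ORF is found.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon of a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end, frame)
                start = None  # look for the next ORF in this frame
    return best

# Hypothetical transcript used only for demonstration.
transcript = "CCATGAAATTTGGGTAACCATGCCCTGA"
print(longest_orf(transcript))  # (2, 17, 2), i.e. ATGAAATTTGGGTAA
```

A production tool would also scan the reverse strand and handle partial ORFs that run off the end of an incomplete transcript.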
A heuristic sequence alignment tool for comparing a cDNA sequence with a genomic sequence containing a homolog of the gene in another species. sim4cc is built on the foundation of sim4 and incorporates several techniques that make it suitable for cross-species comparisons.
Processes RNA-Seq data. easyRNASeq is a program that combines the necessary packages in a single wrapper that checks the consistency of the provided data and information. It also helps users avoid common RNA-Seq processing pitfalls. Moreover, it introduces functionality to handle data produced by recent next-generation sequencing (NGS) protocols.
Aims to ease high-throughput sequencing (HTS) data analysis through distributed computation. Eoulsan is a framework able to perform its tasks on distributed computers. The application includes batch analyses and a fully automated process that manages external file locations and the distributed file system. It can be run in three modes: standalone, local cluster, or cloud computing on Amazon Elastic MapReduce.
Combines the annotation of protein-coding transcripts with the prediction of putative lncRNAs in whole transcriptomes. Annocript downloads and indexes the needed databases, runs the analysis, and produces human-readable and standard outputs together with summary statistics of the whole analysis.
Allows users to analyze cross-platform and cross-species microarray data. iArray employs a meta-analysis approach to derive expression patterns from individual microarray datasets and to discover patterns that occur frequently across multiple datasets. It can be used to identify conserved expression patterns across different species. Furthermore, this tool includes a data preprocessing module, a co-expression analysis module, a differential expression analysis module, a functional and transcriptional annotation module, and a graphical visualization module.
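The idea of looking for patterns that recur across datasets can be sketched as follows. This is a minimal illustration and not iArray's algorithm: it assumes each dataset is a dictionary mapping gene names to expression vectors, treats a "pattern" as a gene pair whose Pearson correlation reaches a threshold, and keeps the pairs seen in a minimum number of datasets. The names `recurrent_pairs`, `r_min`, and `min_datasets` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def recurrent_pairs(datasets, r_min=0.9, min_datasets=2):
    """Return gene pairs co-expressed (r >= r_min) in at least min_datasets datasets."""
    counts = Counter()
    for expr in datasets:
        # Sorting the gene names gives each pair one canonical orientation.
        for g1, g2 in combinations(sorted(expr), 2):
            if pearson(expr[g1], expr[g2]) >= r_min:
                counts[(g1, g2)] += 1
    return {pair for pair, n in counts.items() if n >= min_datasets}

# Two toy datasets with different numbers of samples (made-up values).
datasets = [
    {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8], "geneC": [4, 1, 3, 2]},
    {"geneA": [5, 6, 7], "geneB": [1, 2, 3], "geneC": [3, 1, 2]},
]
print(recurrent_pairs(datasets))  # {('geneA', 'geneB')}
```

Because each dataset is correlated separately, the datasets do not need to share platforms or sample counts, only gene identifiers.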
A tool designed for the organization and analysis of large sequence datasets, and especially suited for transcriptome projects of non-model organisms with little or no a priori sequence information. T-ACE offers a Tcl-based interface, which accesses a PostgreSQL database via a PHP script. Within T-ACE, information belonging to single sequences or contigs, such as annotation or read coverage, is linked to the respective sequence and immediately accessible. Sequences and assigned information can be searched by keyword or by BLAST. Additionally, T-ACE provides within- and between-transcriptome analysis modules at the level of expression, GO terms, KEGG pathways, and protein domains.
Aims to characterize non-model organisms. IDP-denovo allows users to handle transcriptome data without a reference genome. It offers a platform with three main features: (i) it assembles hybrid sequencing data; (ii) it annotates gene isoform structures and alternative splice sites; and (iii) it quantifies isoform abundance by using both short reads (SRs) and long reads (LRs). The software is suited to incorporation into automated pipelines.
Processes and analyzes the raw files generated with the Spatial Transcriptomics (ST) method. ST Pipeline enables demultiplexing of spatially resolved RNA-seq data, robust quality filtering, and identification of unique molecules. It is highly customizable, with numerous parameter settings. The tool is designed to be robust and efficient and to scale well to arrays with higher density. It filters the data, aligns it to a genome, annotates it against a reference, demultiplexes it by array coordinates, and then aggregates counts of non-duplicate reads using Unique Molecular Identifiers (UMIs).
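The final aggregation step, counting non-duplicate reads per array coordinate with UMIs, can be sketched as below. This is an illustration under stated assumptions rather than ST Pipeline's code: the record layout `(x, y, gene, umi)` and the function name `count_unique_molecules` are hypothetical, and UMIs are compared by exact match, which is a simplification because sequencing errors can make one molecule appear under slightly different UMIs.

```python
from collections import defaultdict

def count_unique_molecules(reads):
    """Count distinct UMIs per (x, y, gene).

    reads: iterable of (x, y, gene, umi) tuples, where (x, y) is the array
    coordinate the read was demultiplexed to (hypothetical record layout).
    """
    umis = defaultdict(set)
    for x, y, gene, umi in reads:
        # Reads sharing spot, gene and UMI are treated as duplicates of one molecule.
        umis[(x, y, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}

# Made-up reads for demonstration.
reads = [
    (3, 7, "Actb", "AACGT"),
    (3, 7, "Actb", "AACGT"),  # duplicate of the read above
    (3, 7, "Actb", "GGTCA"),
    (4, 7, "Actb", "AACGT"),  # same UMI, different spot: counted separately
]
print(count_unique_molecules(reads))
# {(3, 7, 'Actb'): 2, (4, 7, 'Actb'): 1}
```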
Offers functions for parameter optimization and transcriptome prediction. FIT achieves comparable or better prediction performance within a shorter computational time than the previous method. It will facilitate the study of environmental effects on transcriptomic variation in field conditions. The tool can be applied to data collected by other sampling strategies, such as multiple treatments at a single time point, simply by preparing meteorological data of sufficient length.
Allows users to annotate de novo transcriptomes. annotatingTranscriptomes is a custom Perl script that annotates transcripts with gene names, Gene Ontology (GO) terms, EuKaryotic Orthologous Groups (KOG) terms, and KEGG terms. The software also provides ways to assess transcriptome quality metrics.
Enables access to and manipulation of gene models and other annotations. GenomicFeatures is a set of tools and methods for creating and manipulating transcript-centric annotations. The software provides an automated mechanism for constructing a TranscriptDb object from tracks defined in the UCSC Genome Browser, BioMart, or GTF/GFF files.
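To make "transcript-centric" concrete, the sketch below groups the exon records of a GTF file by transcript. GenomicFeatures itself is an R/Bioconductor package, so this Python code is not its API and does not build a TranscriptDb object; the function name `exons_by_transcript` is hypothetical, and the parser assumes the standard nine tab-separated GTF columns with a `transcript_id "..."` attribute.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')

def exons_by_transcript(gtf_lines):
    """Map each transcript ID to its exons as (seqname, start, end, strand),
    sorted by start coordinate. Non-exon features and comments are skipped."""
    transcripts = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            exon = (fields[0], int(fields[3]), int(fields[4]), fields[6])
            transcripts[match.group(1)].append(exon)
    return {tx: sorted(exons, key=lambda e: e[1]) for tx, exons in transcripts.items()}

# Made-up GTF records for demonstration.
gtf = [
    "# example annotation",
    'chr1\tdemo\texon\t500\t600\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tdemo\tCDS\t120\t200\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
    'chr1\tdemo\texon\t100\t250\t.\t+\t.\tgene_id "g1"; transcript_id "t2";',
]
for tx, exons in exons_by_transcript(gtf).items():
    print(tx, exons)
```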
Sorts multi-FASTA sequence files produced by de novo assembly. Bag2D was built to create a transcriptome database for the analysis of RNA-Seq data from a lipid-rich microalgal mutant and to characterize that mutant at the transcriptomic level. This tool can assign functions to unknown contigs. It can be applied to any organism that lacks complete transcriptome information.
Provides comprehensive functional annotations. FunctionAnnotator is an annotation web server that covers Gene Ontology (GO) terms, enzyme identification, domain detection, lipoprotein recognition, transmembrane domain discovery, and subcellular localization. It also reports the distribution of species from best hits at different taxonomic levels. For metatranscriptomic data, it reveals the species distribution, transcript functions, and activated pathways.
Supports the validation of experiments by annotating variants and prioritizing cases. KisSplice2RefGenome was developed to map KisSplice paths to the reference genome using STAR. For each bubble, this method provides the gene name, the alternative splicing (AS) event type, the genomic coordinates, and the list of splice sites used (novel or annotated).
Assigns functional annotations. FastAnnotator efficiently annotates sequences with their gene functions, enzyme functions, or domains. The tool integrates several well-developed annotation tools to provide annotations for query sequences. It is useful in transcriptome studies, especially those focusing on non-model organisms, metatranscriptomes, or sequences derived from less well-studied organisms or environmental samples.