Analyzes two types of RNA-seq data: single cell and bulk. URSM corrects for dropout events in single cell data and simultaneously performs deconvolution of bulk data. The single cell and bulk data do not need to be measured on the same subjects. It can (1) obtain reliable estimates of cell type specific gene expression profiles; (2) infer the dropout entries in single cell data; and (3) infer the mixing proportions of different cell types in bulk samples.
Performs differential expression (DE) detection for RNA-Seq experiments with a small number of replicates or non-replicated samples in each class. LPEseq evaluates the baseline error distribution for each of the compared experimental conditions. It can be used on datasets containing replicates and is also efficient for non-replicated datasets. This tool is able to remove outliers derived from the replicates assumption between classes.
Infers relative poly(A) site usage in terminal exons from RNA sequencing data and can be used together with KAPAC. PAQR is composed of three modules: (1) a script to deduce transcript integrity values, (2) a script to create the coverage profiles for all considered terminal exons, and (3) a script to obtain the relative usage together with the estimated expression of poly(A) sites with sufficient evidence of usage. The software enables evaluation of 3′ end processing in data sets such as those from The Cancer Genome Atlas (TCGA).
Identifies large-scale copy-number variants (CNVs) in scRNA-seq. CONICS provides a method to separate neoplastic cells for downstream analysis. It includes algorithms to triage cells from a scRNA-seq assay, based on the presence of CNVs detected in an orthogonal DNA sequencing experiment. It integrates tumor-normal fold-changes with the minor-allele frequencies of point mutations to estimate false-discovery rates (FDRs) in CNV classification. Additionally, it includes routines to perform downstream phylogeny assessment and gene co-expression analysis.
Characterizes circRNA candidates. FUCHS provides the user with directions for further steps to investigate a circRNA’s function and biogenesis. FUCHS is able to identify alternative exon usage within the same circle boundaries, summarize the different circles emerging from the same host gene, quantify double-breakpoint fragments as an indicator of circularity, and visualize a circRNA’s read coverage profile independently of any genome browser.
Estimates 3’ untranslated region (UTR) landscape from RNA-seq. GETUTR has three steps: (1) preprocessing for the extraction of all reads in RNA-seq data, (2) smoothing via algorithms and (3) normalization applied for all genes. Three smoothing algorithms that were tested on their average lengths of 3’ UTR and on the prediction of polyadenylation cleavage site (PCS) are available through this software.
Searches nucleotide databases using a nucleotide query. BLASTN key features are searching with short sequences and cross-species comparison. Users can select an optimization according to: (i) highly similar sequences, (ii) more dissimilar sequences or (iii) somewhat similar sequences. This web application proceeds by searching sets in NCBI data sources.
Assists users in analyzing DNA and protein sequence data from different species and populations. MEGA is composed of several tools allowing researchers to work on phylogenomics and phylomedicine. It includes features for determining gene duplication events in gene family trees. Moreover, this tool is available through a graphical user interface (GUI) and a command line interface.
Searches a protein database using a translated nucleotide query. BLASTX is a BLAST search application that compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database. This application can also work in Blast2Sequences mode and can send BLAST searches over the network to the public NCBI server if desired.
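The six-frame conceptual translation mentioned above can be illustrated with a minimal Python sketch. It assumes the standard genetic code (NCBI translation table 1) and an unambiguous DNA query; the function names are hypothetical and this is not BLASTX's own code, which also handles other genetic codes and ambiguity symbols.

```python
from typing import Dict

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> Dict[str, str]:
    """Translate three forward and three reverse-strand reading frames."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames
```

Each of the six protein strings would then be compared against the protein database; stop codons appear as `*` in the translations.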
Detects head-to-tail spliced (back-spliced) sequencing reads, indicative of circular RNA (circRNA) in RNA-seq data. find_circ is a pipeline that can find circRNAs in any genomic region. It takes advantage of long (approximately 100 nucleotides) reads, and predicts the acceptor and donor splice sites used to link the ends of the RNAs. This method provides evidence that circRNAs form an important class of post-transcriptional regulators.
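The idea of a head-to-tail junction can be sketched as follows, assuming that short anchors taken from the two ends of a read have already been aligned independently to the genome. The `AnchorHit` type and `classify_anchor_pair` function are hypothetical names for illustration only; the actual pipeline additionally extends the anchors and checks splice-site signals, which is omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnchorHit:
    """Alignment of one read-end anchor (hypothetical, simplified record)."""
    chrom: str
    start: int   # 0-based leftmost genomic coordinate of the aligned anchor
    strand: str  # "+" or "-", strand on which the read aligns


def classify_anchor_pair(head: AnchorHit, tail: AnchorHit) -> str:
    """Classify a read by the relative genomic order of its two end anchors.

    head is the anchor from the start of the read, tail the anchor from its end.
    For linear splicing the anchors keep the read's order along the transcript;
    a reversed order is the signature of a back-splice (head-to-tail) junction.
    """
    if head.chrom != tail.chrom or head.strand != tail.strand:
        return "discordant"
    if head.start == tail.start:
        return "discordant"
    if head.strand == "+":
        in_order = head.start < tail.start
    else:
        # Reads aligned to the minus strand run from high to low coordinates.
        in_order = head.start > tail.start
    return "linear" if in_order else "back-spliced"
```

In this simplified view, a plus-strand read whose first anchor maps downstream of its last anchor would be reported as a back-splice candidate.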
Assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. Cufflinks assembles individual transcripts from RNA-seq reads that have been aligned to the genome. This software is able to infer the splicing structure of each gene because reads from multiple splice variants for a given gene can be found in a sample. Transcript abundances can also be quantified against a supplied reference annotation instead of assembling the reads.
Focuses on variant discovery and genotyping. GATK is a toolkit, developed at the Broad Institute, composed of several tools and able to support projects of any size. The application bundles an assortment of command line tools for analyzing high-throughput sequencing (HTS) data in various formats such as SAM, BAM, CRAM or VCF. The website includes extensive documentation to guide users.
Aligns single cells from differentiation systems with bifurcating branches. Wishbone pinpoints bifurcation points and labels each cell as pre-bifurcation or as one of two post-bifurcation cell fates to order cells according to their developmental progression. It is generalizable to additional lineages, as it was demonstrated by applying it to mouse myeloid differentiation. The tool outperforms methods developed specifically for single cell RNA-seq data.
Finds regions of sequence similarity. PSI-BLAST is a protein database search program. The software is able to assess the probable substitutions at each sequence position using the results of a previous Gapped-BLAST search, an algorithm that scores alignments with an amino acid substitution matrix. It can combine search results with robust statistics to build and apply profiles, also known as position-specific scoring matrices (PSSMs). A modified version of PSI-BLAST, PSI-BLASTexB, which addresses limitations of the sequence weighting scheme, was also developed.
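To make the notion of a position-specific scoring matrix concrete, here is a minimal sketch that derives log-odds scores from the columns of a small ungapped alignment. It assumes a uniform background frequency and a flat pseudocount, and the function names are hypothetical; PSI-BLAST itself uses sequence weighting, observed background frequencies and a more elaborate pseudocount scheme, none of which is reproduced here.

```python
import math
from collections import Counter
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment: List[str], pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Build one log2-odds score table per alignment column."""
    if not alignment or len({len(s) for s in alignment}) != 1:
        raise ValueError("alignment must contain equal-length sequences")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for column in zip(*alignment):
        # Gap characters and unknown residues are simply ignored.
        counts = Counter(res for res in column if res in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm: List[Dict[str, float]], seq: str) -> float:
    """Sum the column scores for an ungapped sequence placed on the profile."""
    return sum(col.get(res, 0.0) for col, res in zip(pssm, seq))
```

A residue seen often in a column receives a positive score at that position and an unseen residue a negative one, which is what lets a profile capture position-dependent substitution preferences.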
Enables the study of spatial patterning of gene expression at the single-cell level. Seurat is an R package that enables quality control (QC), analysis, and exploration of single cell RNA-seq data. The software includes three computational methods: (1) unsupervised clustering and discovery of cell types and states, (2) spatial reconstruction of single cell data, and (3) integrated analysis of single cell RNA-seq across conditions, technologies, and species. It can also localize rare subpopulations, and map both spatially restricted and scattered groups.
Performs factor analysis on suitable sets of control genes or samples. RUVSeq furnishes estimations of expression fold-changes. This package implements the remove unwanted variation (RUV) methods for the normalization of RNA-Seq read counts between samples.
Allows users to quantify abundances of transcripts from RNA-Seq data and, more generally, of target sequences using high-throughput sequencing (HTS) reads. kallisto is based on the concept of pseudoalignment, which determines the compatibility of reads with targets. In tests, this tool was able to process over 30 million human reads using the read sequences and a transcriptome index.
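The compatibility idea behind pseudoalignment can be sketched with a plain k-mer hash index: a read is assigned the set of targets shared by all of its indexed k-mers, without computing base-level alignments. This is a simplified illustration with hypothetical function names; kallisto itself works on a de Bruijn graph of the transcriptome, and strand handling and quantification are left out.

```python
from typing import Dict, Set


def build_kmer_index(transcripts: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map every k-mer to the set of transcripts that contain it."""
    index: Dict[str, Set[str]] = {}
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index.setdefault(seq[i:i + k], set()).add(name)
    return index


def pseudoalign(read: str, index: Dict[str, Set[str]], k: int) -> Set[str]:
    """Return the transcripts compatible with all indexed k-mers of the read."""
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k])
        if hits is None:
            continue  # simplification: k-mers absent from the index are ignored
        compatible = set(hits) if compatible is None else compatible & hits
        if not compatible:
            break  # no transcript can explain all k-mers seen so far
    return compatible or set()
```

For example, with two transcripts that share a prefix, a read drawn from the shared region is compatible with both, while a read overlapping a distinguishing k-mer is compatible with only one.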
Allows users to measure changes in mature RNA and pre-mRNA reads across different experimental conditions to quantify transcriptional and post-transcriptional regulation of gene expression. EISA reveals both transcriptional and post-transcriptional contributions to expression changes, aiming to increase information that can be gained from RNA-seq data sets. Moreover, this tool can be used for studying transcriptome changes.
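The separation of the two contributions can be sketched as a comparison of log2 fold changes: exonic reads stand in for mature RNA and intronic reads for pre-mRNA, so the intronic change approximates the transcriptional contribution and the difference between exonic and intronic changes the post-transcriptional one. The function below is a hypothetical illustration that assumes counts already normalized for library size; the pseudocount value is an arbitrary choice, not one taken from the tool.

```python
import math
from typing import Dict


def eisa_changes(
    exon_ctrl: float,
    exon_treat: float,
    intron_ctrl: float,
    intron_treat: float,
    pseudocount: float = 1.0,
) -> Dict[str, float]:
    """Split an expression change into exonic, intronic and residual parts.

    Inputs are normalized read counts for one gene in two conditions.
    """
    delta_exon = math.log2(exon_treat + pseudocount) - math.log2(exon_ctrl + pseudocount)
    delta_intron = math.log2(intron_treat + pseudocount) - math.log2(intron_ctrl + pseudocount)
    return {
        "delta_exon": delta_exon,      # overall change in mature RNA
        "delta_intron": delta_intron,  # proxy for the transcriptional change
        "post_transcriptional": delta_exon - delta_intron,
    }
```

A gene whose exonic signal rises fourfold while its intronic signal only doubles would therefore show a post-transcriptional component of about one log2 unit.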
Performs peak finding and downstream data analysis for next-generation sequencing data. HOMER provides several tools and methods to make use of ChIP-Seq, GRO-Seq, RNA-Seq, DNase-Seq, Hi-C and other types of functional genomics sequencing data sets. This software supports UCSC visualization, peak annotation, quantification of transcripts and repeats, and analysis of differential features, enrichment and expression.
Consists of a terminology designed for RNA sequencing. ORNASEQ is based on the ontology for biomedical investigations (OBI). It supplies a list of about 160 terms, some drawn from several existing ontologies, along with more than 20 terms that have been added to OBI. This ontology is useful for the annotation of RNA-based next-generation sequencing and DNA-based next-generation sequencing data.
Provides access to the genomic alignments of public ribo-seq reads in conjunction with mRNA-seq reads along with relevant annotation tracks. GWIPS-viz is a specialized ribo-seq browser that lets researchers look for ribo-seq evidence supporting alternative proteoforms inferred from phylogenetic analysis or detected by proteomics or other experimental techniques. It can be used as a support tool for predictions based on other approaches and for generating hypotheses that can be tested using methods other than ribo-seq.
Gathers human long poly-adenylated RNA transcripts derived from computational analysis of high-throughput RNA sequencing (RNA-Seq) data. MiTranscriptome provides a set of about 6,500 libraries including datasets from human tissues and samples from cell lines. The tissue libraries originate from primary tumor specimens, metastases, and normal or benign adjacent tissues.
Offers a reference sequence of chromosome 3B. Wheat3BMine is useful to delineate structural and functional features along a chromosome and to establish correlations between recombination intensity, gene density, gene expression, and evolution rate. It provides genomic annotation information of the wheat 3B survey such as gene, mRNA, polypeptide or repeat region. This database is searchable by names, identifiers or keywords related to genes, mRNA, repeat region or marker.
Provides resources to decode pan-cancer and interaction networks of lncRNAs, miRNAs, competing endogenous RNAs (ceRNAs), RNA-binding proteins (RBPs) and mRNAs from large-scale CLIP-Seq data and tumor samples. starBase deciphers protein-RNA and miRNA-target interactions, such as protein-lncRNA, protein-sncRNA, protein-mRNA, protein-pseudogene, miRNA-lncRNA, miRNA-mRNA, miRNA-circRNA, miRNA-pseudogene and miRNA-sncRNA interactions, as well as ceRNA networks, from 108 CLIP-Seq datasets.
Hosts multiple datasets dealing with human and mouse skeletal muscle cells and tissue. SKmDB offers features for querying, visualizing and downloading more than 100 different datasets. The repository gives access to information dealing with CLIP-seq, miRNA-seq, small RNA-seq, single cell RNA-seq, RNA-seq, ChIP-seq, AIMS-seq, DNase-seq, ATAC-seq, MNase-seq as well as Bisulfite-seq data, gene expression, co-expression subnetwork or hotspot regions.
Provides a comprehensive and tissue-specific plant circular RNA database. AtCircDB is an online resource hosting predicted and validated Arabidopsis circular RNA candidates identified from large-scale sequencing data. This database currently hosts four categories of information: (i) circular RNA information, (ii) potential miRNA–circular RNA interaction, (iii) super circular region and (iv) tissue information.
A comprehensive portal for blood-brain barrier transcriptomics data, obtained by sequencing mRNA (mRNA-seq) and microRNA (miRNA-seq) of polarized hCMEC/D3 cell monolayers. These data encompass coding (gene expression, alternate splice forms, expressed single nucleotide variants or eSNVs) and non-coding (microRNA, lincRNA, circular RNA) counts that are easily accessible through the BBBomics hub database. The RNA-seq coding data are also superimposed on 285 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, which include canonical, non-canonical, and/or atypical pathways retrievable using BBBomics hub.
Provides RNA-RNA interactions (RRIs) identified through high-throughput sequencing technologies. RISE is a comprehensive database of RNA interactome from sequencing experiments. It includes (i) comprehensive curation of RRIs, (ii) a large dataset of RRIs among mRNAs and lncRNAs, (iii) details of the interacting sites and (iv) extensive annotations for each RRI. It assists researchers looking for interaction and other functional information on individual RNAs, or analyzing RRI networks of specific pathways or systems.
Gathers expression correlations between microRNAs and mRNAs by using total RNA sequencing (RNA-seq) experiments from NCBI’s Sequence Read Archive (SRA). mirCoX is an online repository integrating sequence-based miRNA target predictions from the miRanda and TargetScan databases together with RNA-seq derived expression correlations. Its online interface allows users to browse by gene name or microRNA name and download information about miRNAs, genes and correlations.
A web-based repository of RNA-Seq gene expression profiles and query tools. The website offers open and easy access to RNA-Seq gene expression profiles and tools to both compare tissues and find genes with specific expression patterns. To enlarge the scope of the RNA-Seq Atlas, the data were linked to common functional and genetic databases, in particular offering information on the respective gene, signaling pathway analysis and evaluation of biological functions by means of gene ontologies. Additionally, data were linked to several microarray gene profiles, including BioGPS normal tissue profiles and NCI60 cancer cell line expression data. The data search interface allows an integrative, detailed comparison between the RNA-Seq data and the microarray information.
Provides a comprehensive high-quality reference transcript dataset about Arabidopsis transcripts. AtRTD contains more than 82 190 unique transcript models. It was generated by integration of transcript assemblies of ca. 8.5 billion pairs of reads from 285 RNA-seq data sets obtained from 129 RNA-seq libraries. The database contains 37 137 events and those which occurred at least 50 times made up 95.24% of all alternative splicing events.
Gathers long RNA species derived from RNA-seq data analyses of human blood exosomes. exoRBase is a manually curated database which allows integration and visualization of RNA expression profiles spanning normal individuals and patients with different diseases. Users can also extract RNAs of interest through customized browsing options. The database includes about 15000 lncRNAs, 18000 mRNAs and 58000 circRNAs.
Provides access to processed and curated NGS experiments, including ChIP-Seq (transcription factors and histones), RNA-Seq and DNase-Seq. The current focus of this database is to unify NGS data for the haematopoietic system and ES cells. It encompasses two specialized compendia: one focused on blood cells (HAEMCODE), and a second focused on data from embryonic stem (ES) cells (ESCODE).
Aims to characterize the regulatory networks between RNA binding proteins (RBPs) and various RNA transcript classes by integrating large amounts of CLIP-seq (including HITS-CLIP, PAR-CLIP and iCLIP as variations) data sets. CLIPdb 1.0 consistently annotated the CLIP-seq data sets and RBPs, and provides a user-friendly interface for quick navigation of the CLIP-seq data.
Contains sperm-borne RNA expression profiling data for mouse, rat, rabbit, and human. SpermBase provides large and small RNA expression data for total sperm and sperm heads. It will be expanded to other species such as plants. The database was built from RNA-Seq analyses. The utility of SpermBase was shown by comparing the sperm RNA-Seq data and identifying highly conserved mammalian sperm-borne RNAs among the four mammalian species.
Aims to generate comprehensive RNA-seq data from a wide variety of non-human primates (NHPs), from lemurs to hominids. This resource will continue to host additional RNA-Seq data, alignments and assemblies as they are generated over the coming years and provide a key resource for the annotation of NHP genomes as well as informing primate studies on evolution, reproduction, infection, immunity and pharmacology.
Aligns ribo-seq data to the transcriptome. HRPDviewer is a database that collects more than 600 published human ribo-seq datasets from several studies. It provides visualization of the ribo-seq data on the selected mRNA transcripts. Users can compare and visualize the ribo-seq data mapped on different mRNA transcripts under different physiological conditions.
Offers access to over 2000 human samples. IRBase is an online database of intron retention (IR). This resource makes it possible to assess a specified intron retention event within a tissue/cell type by supplying a gene symbol and a tissue/cell type, or to assess genome-wide intron retention by supplying an RNA-seq dataset. IRBase is part of the toolbox developed by the CNRS to study the impact of intron retention on gene regulation.
Provides a manually curated database of mouse RNA-Seq datasets. RBPMetaDB is a resource that includes the metadata of perturbed RNA-binding proteins (RBPs). It allows users to access all the key information related to the curated RNA-Seq datasets, including the GEO/ArrayExpress accession numbers, dataset titles, numbers of samples, associated RNA-binding proteins (RBPs), perturbation types, and PubMed IDs.