Leng Han and Chunjiang He are the creators of the tissue-specific circular RNA (circRNA) database TSCD. They performed the first global analysis of tissue-specific circRNAs and collected these data in a comprehensive database. Here, they talk about their work and how the TSCD database could help researchers explore their RNA sequencing data.
A repository of more than 300,000 tissue-specific circRNAs
As circRNAs are attracting more attention in transcriptome research, we explored the global features of tissue-specific circRNAs in development and organ differentiation.
To identify tissue-specific circRNAs, three algorithms (CIRI, circRNA_finder and find_circ) were applied to RNA-seq data collected from the ENCODE project and the NCBI GEO database.
Based on the major types of circRNA, we identified more than 300,000 tissue-specific circRNAs in different tissues. Our analysis indicated that tissue-specific circRNAs were mainly derived from exons, but may also be derived from introns or intergenic regions. The majority are generated from protein-coding genes, which suggests that these circRNAs may be associated with mRNA translation or serve as a backup of mRNA.
Among all circRNAs, 10.4% of human circRNAs and 34.3% of mouse circRNAs are tissue-specific, which suggests an association with tissue development. We also observed an uneven distribution of tissue-specific circRNAs across tissues. The brain expresses a particularly large number of tissue-specific circRNAs (89,137 were identified in fetal brain), which might be owing to the complexity of neuronal activity in the brain.
Abundance of tissue-specific (TS) circRNAs across different tissues:
16 adult human tissues (A), 15 fetal human tissues (B) and 9 mouse tissues (C). Values are in log2 of SRPTM: number of circular reads / number of mapped reads (in units of trillion) / read length.
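The SRPTM definition in the caption can be turned into a short calculation. The sketch below is one reading of that formula, assuming the divisions apply left to right and that "units in trillion" means the mapped-read count is expressed in trillions. The function names and the example read counts are hypothetical and are not taken from TSCD.

```python
import math


def srptm(circular_reads: int, mapped_reads: int, read_length: int) -> float:
    """SRPTM as read from the caption (assumed grouping):
    circular reads / mapped reads (in trillions) / read length."""
    if mapped_reads <= 0 or read_length <= 0:
        raise ValueError("mapped_reads and read_length must be positive")
    mapped_in_trillions = mapped_reads / 1e12
    return circular_reads / mapped_in_trillions / read_length


def log2_srptm(circular_reads: int, mapped_reads: int, read_length: int) -> float:
    """log2 of SRPTM, the scale used in the abundance figure."""
    value = srptm(circular_reads, mapped_reads, read_length)
    if value <= 0:
        raise ValueError("log2 is undefined for a circRNA with no circular reads")
    return math.log2(value)


# Made-up example: 25 backsplice reads, 50 million mapped reads, 100 nt reads.
example_srptm = srptm(25, 50_000_000, 100)
example_log2 = log2_srptm(25, 50_000_000, 100)
print(f"SRPTM = {example_srptm:.1f}, log2(SRPTM) = {example_log2:.2f}")
```

Normalizing by both sequencing depth and read length makes values from libraries of different sizes roughly comparable, which matters when data are pooled from many ENCODE and GEO experiments.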
Functional enrichment analysis revealed that tissue-specific circRNAs are largely associated with tissue development and differentiation. To understand the potential functions of tissue-specific circRNAs, we identified a significant number of miRNA binding elements (MREs) and RNA-binding protein (RBP) binding sites.
Finding a tissue-specific circRNA in TSCD
Users can easily browse TSCD content through the browser page.
Users can view tissue-specific circRNAs by selecting:
- Human adult or fetal tissue, or mouse
- And one of the 26 individual tissues, including adipose, adrenal, blood vessel, brain, esophagogastric, esophagus, eye, female gonad, heart, intestine, kidney, liver, lung, mammary gland, pancreas, skeletal muscle, skin, spleen, stomach, testis, thymus, thyroid gland, tibial nerve, tongue, umbilical cord, and uterus.
Data organization and visualization on TSCD web interface
All of the data are organized into a set of relational MySQL tables. Customized Java and PHP scripts were used to construct the database interface. The visualization page displays the coordinates of each circRNA.
The index page allows the user to easily query information on TS circRNAs by chromosome, start and end site, junction reads, conservation, genomic location, etc.
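To make the idea of querying a relational table by chromosome, coordinates and junction reads concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name `ts_circrna`, its columns, and every row are hypothetical illustrations rather than the actual TSCD schema or data.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL backend (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ts_circrna (
        circ_id        TEXT PRIMARY KEY,
        tissue         TEXT NOT NULL,
        chrom          TEXT NOT NULL,
        start_pos      INTEGER NOT NULL,
        end_pos        INTEGER NOT NULL,
        strand         TEXT NOT NULL,
        junction_reads INTEGER NOT NULL,
        region         TEXT NOT NULL   -- exon, intron or intergenic
    )
    """
)

# Made-up demonstration rows, not real TSCD entries.
demo_rows = [
    ("demo_circ_1", "brain", "chr1", 1000, 5000, "+", 12, "exon"),
    ("demo_circ_2", "liver", "chr1", 20000, 26000, "-", 3, "intron"),
    ("demo_circ_3", "brain", "chr2", 1500, 4200, "+", 7, "exon"),
    ("demo_circ_4", "heart", "chr1", 3000, 9000, "+", 1, "intergenic"),
]
conn.executemany("INSERT INTO ts_circrna VALUES (?, ?, ?, ?, ?, ?, ?, ?)", demo_rows)


def query_region(conn, chrom, start, end, min_junction_reads=1):
    """Return circRNAs fully contained in chrom:start-end with enough junction reads."""
    sql = (
        "SELECT circ_id, tissue, start_pos, end_pos, junction_reads "
        "FROM ts_circrna "
        "WHERE chrom = ? AND start_pos >= ? AND end_pos <= ? "
        "AND junction_reads >= ? "
        "ORDER BY start_pos"
    )
    return conn.execute(sql, (chrom, start, end, min_junction_reads)).fetchall()


hits = query_region(conn, "chr1", 0, 10000, min_junction_reads=2)
for row in hits:
    print(row)
```

Parameterized queries (the `?` placeholders) keep user-supplied search fields out of the SQL string itself, which is a sensible default for any web-facing search form.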
Web interface of TSCD
- Users can view comprehensive information, such as tissue category, circRNA ID, coordinates of backsplice sites, genomic locations, junction reads, strand information, genomic spanning length, gene annotation and MRE/RBP sites.
- More importantly, users can visualize the details of a tissue-specific circRNA through the gene symbol link. Backsplices of circRNAs are represented by arcs: black arcs for non-specific circRNAs, red arcs for tissue-specific circRNAs.
- The annotated exons and introns of reference transcripts are displayed in the following panel. If the reference gene has multiple transcripts, all the transcripts are displayed. If the circRNA is generated from multiple genes, the exon structures of all related genes are displayed to better illustrate the biogenesis of the circRNA. TSCD also provides tables with the precise coordinates of each circRNA backsplice across different tissues.
Exploring tissue-specific circRNAs with TSCD
TSCD provides several pages to benefit the research community:
1) The Browser-hg38|mm10 page, which displays coordinates for each circRNA based on the latest genome versions, GRCh38 and mm10.
2) The Comparison page, which allows users to compare circRNAs among different tissues.
3) The Download page, which allows users to batch download tissue-specific circRNAs from all tissues, as well as a customized Perl script to identify tissue-specific circRNAs from their own RNA-seq data.
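For readers who want a feel for what comparing circRNAs among tissues involves, the sketch below works on downloaded backsplice coordinates with plain set operations. It is an illustration only: the tissue names are real, but the coordinates are made up, the function names are hypothetical, and the "detected in exactly one tissue" rule is a simplified stand-in, not the criterion used by TSCD or its Perl script.

```python
from collections import defaultdict

# Hypothetical backsplice calls per tissue, keyed as (chrom, start, end, strand).
calls = {
    "brain": {("chr1", 1000, 5000, "+"), ("chr2", 1500, 4200, "+"), ("chr3", 700, 900, "-")},
    "liver": {("chr1", 1000, 5000, "+"), ("chr4", 100, 800, "+")},
    "heart": {("chr1", 1000, 5000, "+"), ("chr3", 700, 900, "-")},
}


def shared_between(calls, tissue_a, tissue_b):
    """Backsplices detected in both tissues."""
    return calls[tissue_a] & calls[tissue_b]


def unique_to_each_tissue(calls):
    """Backsplices detected in exactly one tissue (simplified stand-in criterion)."""
    seen_in = defaultdict(set)
    for tissue, circs in calls.items():
        for circ in circs:
            seen_in[circ].add(tissue)
    unique = {tissue: set() for tissue in calls}
    for circ, tissues in seen_in.items():
        if len(tissues) == 1:
            (only_tissue,) = tissues
            unique[only_tissue].add(circ)
    return unique


shared = shared_between(calls, "brain", "heart")
unique = unique_to_each_tissue(calls)
for tissue in sorted(unique):
    print(tissue, sorted(unique[tissue]))
```

Keying each circRNA by its exact backsplice coordinates and strand assumes that all tissues were processed against the same genome build, which is why the hg38 and mm10 coordinates matter when mixing in one's own data.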
References
(Xia et al., 2016) Comprehensive characterization of tissue-specific circular RNAs in the human and mouse genomes. Brief Bioinform.