The sequence pattern around A-to-I RNA editing sites in human and mouse was delineated in two steps: 1) extracting the upstream and downstream sequences (15 bases on each side) flanking the editing sites using bedtools getfasta []; 2) visualizing the sequence context around the RNA editing sites using WebLogo 3 (weblogo -A dna -c classic --units probability --first-index -10) [].

Gene expression levels in RPKM (Reads Per Kilobase per Million mapped reads) were estimated from the RNA-seq mapping results as described []. Briefly, HISAT2 [] was used to map reads to the reference genomes, HTSeq [] was used to count mapped reads for expressed genes, and edgeR [] was used to perform differential gene expression analysis. Differentially expressed genes were defined as those with a fold-change greater than 2 and a false discovery rate (FDR) smaller than 0.05. All RNA-seq reads were first trimmed by Trimmomatic-0.32 [] with parameters: HEADCROP = 10, SLIDINGWINDOW = 4:20 and MINLEN = 36. In addition, duplicated reads were removed by Picard.

The R package VennDiagram [] was used to calculate and draw the overlap between our identified RNA editing sites and those included in the databases DARNED, RADAR and REDIportal. The R package ggplot2 [] was used for plotting the other figures. For statistical testing of the distribution of A-to-I RNA editing data, a nonparametric test, the Kruskal-Wallis rank sum test, was performed. For correlation analysis between ADAR gene expression and normalized RNA editing levels, Spearman's rank correlation coefficient was computed with R.
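The numeric definitions used above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study (which relied on bedtools, HTSeq and edgeR); the function names `flank_interval`, `rpkm` and `is_differentially_expressed` are hypothetical. Two assumptions are made that the text does not state: editing-site positions are 0-based, as in BED files, and the fold-change threshold applies in both directions (up- or down-regulation).

```python
import math


def flank_interval(pos0, flank=15):
    """Return a 0-based, half-open (start, end) interval covering an editing
    site and `flank` bases on each side, clipped at the chromosome start.
    Assumes `pos0` is a 0-based coordinate (BED convention)."""
    start = max(0, pos0 - flank)
    end = pos0 + flank + 1
    return start, end


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase per Million mapped reads:
    count / (gene length in kb * total mapped reads in millions)."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def is_differentially_expressed(log2_fc, fdr, min_fold=2.0, max_fdr=0.05):
    """Apply the thresholds from the text: fold-change > 2 and FDR < 0.05.
    Assumption: the fold-change cutoff is symmetric, so a gene passes if it
    is more than 2-fold up OR more than 2-fold down."""
    return abs(log2_fc) > math.log2(min_fold) and fdr < max_fdr


if __name__ == "__main__":
    # Site at 0-based position 100: window spans 31 bases in total.
    print(flank_interval(100))
    # 500 reads on a 2 kb gene in a library of 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))
    # log2 fold-change of -1.5 with FDR 0.01 passes both thresholds.
    print(is_differentially_expressed(-1.5, 0.01))
```

The half-open interval matches the coordinate convention that bedtools getfasta expects for BED input, so a site with 15 flanking bases on each side yields a 31-base sequence.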

Software tools: edgeR, Trimmomatic, Picard
Databases: REDIportal
Organisms: Candida albicans, Homo sapiens, Mus musculus
Diseases: Candidiasis, Infection, Virus Diseases, Sexually Transmitted Diseases, Viral, Retroviridae Infections, HIV Infections
Chemicals: Ribonucleosides, Purine Nucleosides, Heterocyclic Compounds, 2-Ring