Pipeline publication

[…] Prior to miRNA prediction, we masked genomic regions that are unlikely to encode miRNAs. These include protein-coding regions and regions that generate other non-coding RNAs, namely rRNAs, tRNAs, snoRNAs and lncRNAs. Specifically, coding sequences and repeat regions were masked first. Coding regions were obtained from the predicted gene models in Aniseed, and repeat regions were identified with RepeatMasker and RepeatModeler (http://repeatmasker.org). Tandem repeats were masked with Tandem Repeats Finder. Other RNAs, including rRNAs, tRNAs, snRNAs and lncRNAs, were filtered against the Rfam database (Release 11.0) using cmsearch in INFERNAL (E-value threshold: 1e-3). Potential tRNAs were also screened with tRNAscan-SE.

To reduce the search space, we first queried all metazoan mature miRNA seeds deposited in the miRBase database (Release 21) against the masked H. roretzi genome sequences using blastn with a word size of 7 and an E-value of 10. The 110 bp sequences of the matched genomic regions were retrieved separately and extended by an additional 20 bp at both the 5′ and 3′ ends. Potential miRNAs with no more than two mismatches to known metazoan miRNAs were identified with patscan, and the candidates were subsequently folded with RNAfold. Unmatched sequences were trimmed. If more than one miRNA mapped to overlapping positions, only the one with the lowest minimum folding free energy (MFE) was kept. In addition, the Rfam (Release 11.0) covariance models (CMs) were used to search for conserved miRNA structures. Low-complexity sequences were removed. We retained sequences whose MFE and stem-loop structure passed the filtration criteria described below.

To detect potential miRNA precursors, srnaloop was employed to identify hairpin structures in the masked H. roretzi genome sequence. We used the parameter “ […]
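Two of the steps in the excerpt, the 20 bp extension of each matched region and the rule of keeping only the lowest-MFE candidate among overlapping hits, might be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the `Candidate` record, its field names, the 0-based half-open coordinates, and the choice to merge hits into a locus whenever they overlap transitively are all hypothetical, and the MFE values are assumed to have been computed beforehand (for example by RNAfold).

```python
from dataclasses import dataclass
from typing import List, Tuple

FLANK = 20  # bp added at each end, as stated in the excerpt


@dataclass
class Candidate:
    # Hypothetical record; field names are illustrative only.
    seq_id: str   # scaffold or contig name
    start: int    # 0-based, inclusive
    end: int      # 0-based, exclusive
    mfe: float    # minimum folding free energy (kcal/mol), computed elsewhere


def extend(start: int, end: int, seq_len: int, flank: int = FLANK) -> Tuple[int, int]:
    """Extend a window by `flank` bp on both sides, clamped to the sequence."""
    return max(0, start - flank), min(seq_len, end + flank)


def keep_best_per_locus(cands: List[Candidate]) -> List[Candidate]:
    """Among overlapping candidates on one sequence, keep the lowest-MFE one.

    Assumption: candidates are grouped by transitive overlap, so a chain of
    pairwise-overlapping hits is treated as a single locus.
    """
    best: List[Candidate] = []
    cluster: List[Candidate] = []
    cluster_end = -1
    for c in sorted(cands, key=lambda x: (x.seq_id, x.start, x.end)):
        if cluster and c.seq_id == cluster[0].seq_id and c.start < cluster_end:
            # Overlaps the current locus: add it and widen the locus if needed.
            cluster.append(c)
            cluster_end = max(cluster_end, c.end)
        else:
            # New locus: emit the winner of the previous one, then start over.
            if cluster:
                best.append(min(cluster, key=lambda x: x.mfe))
            cluster = [c]
            cluster_end = c.end
    if cluster:
        best.append(min(cluster, key=lambda x: x.mfe))
    return best
```

A lower (more negative) MFE indicates a more stable predicted fold, which is why `min` selects the retained candidate. Clamping in `extend` only matters for matches near the ends of a scaffold.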

Pipeline specifications

Software tools: tRNAscan-SE, BLASTN, PatScan
Databases: miRBase