Computational protocol: YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs

[…] The numbers of YAMAT-seq raw reads obtained from total RNAs of MCF-7, SK-BR-3 and BT-20, each in technical triplicate, are shown in Figure and ; the reads are publicly available at NCBI's Sequence Read Archive (accession no. SRP096584). After a quality check, SHRiMP2 was used to map the reads to a set of 632 tRNA-reference genes, comprising 610 nuclear-encoded cytoplasmic (cyto) tRNAs from the GRCh37 assembly, as listed in GtRNAdb, and 22 known mitochondrial (mt) tRNAs from tRNAdb. The 610 entries from GtRNAdb included 508 true tRNAs and 102 pseudo-tRNAs. We excluded tRNAs that mapped to contigs that are not part of the major chromosome assembly. We allowed non-unique mappings with a mismatch rate of up to 10%, penalizing each mismatch and gap extension equally. Before mapping, we removed any introns from the tRNA-reference genes and added CCA to their 3′-ends. To be conservative, we also mapped the reads (minus the CCA) to the full GRCh37 assembly using the same parameters, and excluded any read that mapped equally well or better to non-tRNA space than to the tRNA-reference genes. We confirmed that almost all reads that mapped to the tRNA reference also mapped to tRNAs in the full-genome mapping. Lastly, we kept only reads that were 60–87 nt in length (inclusive) and ended in CCA. The 100-nt reads yielded by Illumina sequencing contain 13 nt of 3′-terminal sequence from the Y-5′-AD adapter, so 87 nt is the maximum read length available for tRNA sequence. As a result, the CCA sequence was not found in the reads of some long tRNAs (e.g. cyto tRNA-SeC(UCA), 90 nt in length). These long tRNA reads were retained as tRNA reads despite lacking the CCA sequence. Statistical analysis was performed using R. Heatmaps were built with the heatmap.2 function of the R package gplots. Dendrograms were constructed with the amap package and visualized with the dendextend package, using Euclidean distance as the metric for hierarchical clustering. […]
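
The final read-level filter described above (60–87 nt inclusive, ending in CCA, with an exception for long tRNAs) can be sketched in Python as follows. This is a minimal illustration, not the authors' code. The function name `keep_read` and the parameter `ref_len_with_cca` are hypothetical, and treating a read as a "long tRNA read" whenever its mapped reference exceeds 87 nt is an assumption, since the protocol does not state how such reads were identified.

```python
READ_LENGTH = 100       # Illumina read length stated in the protocol
ADAPTER_3P_LENGTH = 13  # 3'-terminal Y-5'-AD adapter bases contained in each read
MAX_TRNA_READ = READ_LENGTH - ADAPTER_3P_LENGTH  # 87 nt left for tRNA sequence
MIN_TRNA_READ = 60      # lower length bound stated in the protocol


def keep_read(seq: str, ref_len_with_cca: int) -> bool:
    """Return True if a mapped read passes the length/CCA filter.

    seq              -- read sequence after adapter removal
    ref_len_with_cca -- hypothetical: length of the tRNA reference the read
                        mapped to, including the appended CCA
    """
    # Length window applies to every read (inclusive bounds).
    if not (MIN_TRNA_READ <= len(seq) <= MAX_TRNA_READ):
        return False
    # Assumption: a reference longer than 87 nt cannot show its 3'-CCA
    # within the read, so the CCA requirement is waived for it.
    if ref_len_with_cca > MAX_TRNA_READ:
        return True
    # All other reads must end in CCA.
    return seq.upper().endswith("CCA")


if __name__ == "__main__":
    print(keep_read("G" * 70 + "CCA", 73))  # 73-nt read ending in CCA
    print(keep_read("G" * 73, 73))          # same length, no terminal CCA
    print(keep_read("G" * 87, 90))          # truncated read of a 90-nt tRNA
```

Under these assumptions, the first and third calls keep the read and the second discards it. The upper bound is derived from the read and adapter lengths rather than hard-coded, so the filter would adjust if a different read length were used.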

Software tools SHRiMP, gplots, dendextend
Databases SRA, GtRNAdb
Application Phylogenetics