Computational protocol: Clustering RNA structural motifs in ribosomal RNAs using secondary structural alignment

Protocol publication

[…] The resolved ribosomal RNA subunit structures (1S72 and 1J5E) were downloaded from the PDB. Base pairs were annotated with RNAVIEW and MC-Annotate, and the final annotation was the union of the annotations from both tools. Conflicting predictions (different edge or orientation annotations for the same base pair) were resolved in favor of MC-Annotate. All non-canonical base pairs were temporarily discarded to reveal the general sketch of the A-form helices in the structures. Pseudoknots were then removed using the K2N web server. Lone pairs were further removed to avoid accidental destruction of potential motifs. Finally, regions corresponding to hairpin loops, internal loops, bulge loops or junction loops were identified from the resulting nested structures, and all base pairs within these regions were recovered to construct candidate motif instances (similar to LENCS). Candidate instances that contained no non-canonical base pair were removed.

Candidate motif instances from 5S, 16S and 23S rRNAs were compiled into two data sets, one for hairpin loops and the other for internal loops, bulge loops and junction loops (referred to below as the internal loop data set). Because sequence conservation in hairpin loop motifs is also very important in defining their function, a higher sequence weight should be applied to the hairpin loop data set. The hairpin loop data set contains 33 candidate instances and the internal loop data set contains 157 candidate instances. To account for different concatenation orders of the strands, the symmetric counterpart of each motif instance in the internal loop data set is also included. [...]

We applied RNAMotifScan to measure the structural similarity between two candidate motif instances. RNAMotifScan matches two motif instances using a dynamic programming approach that takes base pair isostericity into account.
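The annotation clean-up described above (taking the union of the two annotations, letting MC-Annotate win on conflicts, discarding non-canonical pairs, and dropping lone pairs) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the dictionary representation, the function names, and the use of the Leontis-Westhof style tag `"cWW"` as a stand-in for "canonical" are all hypothetical, and neither tool's real output format is parsed here. Pseudoknot removal (done with the K2N web server in the protocol) is not covered.

```python
from typing import Dict, Set, Tuple

# A base pair is (i, j) with i < j; the value is an edge/orientation label.
# Both the layout and the labels are assumptions made for this sketch.
Pair = Tuple[int, int]
Annotation = Dict[Pair, str]


def merge_annotations(rnaview: Annotation, mc_annotate: Annotation) -> Annotation:
    """Union of both annotations; MC-Annotate's label wins on conflicts."""
    merged = dict(rnaview)
    merged.update(mc_annotate)  # overwrites any conflicting RNAVIEW label
    return merged


def canonical_only(pairs: Annotation, canonical_label: str = "cWW") -> Set[Pair]:
    """Keep pairs carrying the canonical label.

    Simplification: a real pipeline would also check nucleotide identities,
    since not every cis Watson-Crick/Watson-Crick pair is canonical.
    """
    return {pair for pair, label in pairs.items() if label == canonical_label}


def remove_lone_pairs(pairs: Set[Pair]) -> Set[Pair]:
    """Drop pairs with no directly stacked neighbour (single pass)."""
    return {
        (i, j)
        for (i, j) in pairs
        if (i - 1, j + 1) in pairs or (i + 1, j - 1) in pairs
    }


# Toy data, invented for illustration only.
rnaview = {(1, 10): "cWW", (2, 9): "cWW", (4, 7): "cWW", (20, 30): "cWW"}
mc_annotate = {(2, 9): "cWW", (4, 7): "tHS"}

merged = merge_annotations(rnaview, mc_annotate)
helices = remove_lone_pairs(canonical_only(merged))
print(sorted(helices))  # [(1, 10), (2, 9)]
```

In the toy data, the pair (4, 7) is annotated differently by the two tools, so the MC-Annotate label is kept and the pair is then set aside as non-canonical. The isolated pair (20, 30) is removed as a lone pair. Whether the original pipeline removed lone pairs in a single pass or iteratively is not stated in the text.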
For the internal loop data set, the sequence weight was set to 0.2 and the structure weight was set to 0.8, while for the hairpin loop data set we raised the sequence weight to 0.4 and lowered the structure weight to 0.6. Because hairpin loop motifs are usually defined by their lengths (e.g. tetraloop and hexaloop), we also doubled the default gap penalty for hairpin loop clustering. Other parameters were left at their defaults. […]
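The two parameter settings above can be recorded side by side as a small configuration sketch. The class and field names below are hypothetical and do not correspond to actual RNAMotifScan options. The gap penalty is stored as a multiplier because the default value is not given in the text. The `combined_score` function assumes that the two weights enter as a simple linear combination, which the text does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanParams:
    """Per-data-set settings; names are illustrative, not tool options."""
    sequence_weight: float
    structure_weight: float
    gap_penalty_scale: float = 1.0  # multiplier on the tool's default gap penalty


# Values taken from the protocol text.
INTERNAL_LOOP = ScanParams(sequence_weight=0.2, structure_weight=0.8)
HAIRPIN_LOOP = ScanParams(sequence_weight=0.4, structure_weight=0.6,
                          gap_penalty_scale=2.0)


def combined_score(seq_score: float, struct_score: float,
                   params: ScanParams) -> float:
    """Assumed linear weighting of sequence and structure similarity."""
    return (params.sequence_weight * seq_score
            + params.structure_weight * struct_score)


# Same hypothetical match scored under both settings.
for name, params in (("internal", INTERNAL_LOOP), ("hairpin", HAIRPIN_LOOP)):
    print(name, combined_score(seq_score=1.0, struct_score=0.5, params=params))
```

Under this assumed weighting, a match with strong sequence similarity but weaker structural similarity scores higher with the hairpin loop settings than with the internal loop settings, which is the effect the raised sequence weight is meant to have.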

Pipeline specifications