Computational protocol: The Evolution of Tyrosine-Recombinase Elements in Nematoda

Protocol publication

[…] To find YREs in the assemblies, we used a strategy modified from Piednoël et al. First, we searched for YR domains in each whole-genome assembly. YR matches were extended by 10 kb in each direction or to the contig end, whichever was encountered first. We then searched the resulting sequences for RT and MT domains and for direct and inverted repeats. This approach streamlined the homology searches while including only RT and MT domains that are likely to belong to YREs. The homology searches were conducted using PSITBLASTN with an expected value threshold of 0.01. The query models for these searches were seeded with the alignments from Piednoël et al. and were extended by adding protein sequences from the reference dataset through a PSIBLASTP search. Direct and inverted repeats on the extended YR fragments were detected with the BLAST-based program UGENE, with only identical repeats at least 20 bp long allowed. These values represent the minimal repeat sequence in the results of Piednoël et al. Each annotated fragment was subsequently given a preliminary classification programmatically, based on its similarity to the structures illustrated in the original publication. [...]

For the inference of phylogenetic relationships among YRE clades, we considered only YRE matches that had at least YR and RT domains as well as terminal repeats. The RT domain may have had a different history from that of the YR domain, as published YR and RT trees do not seem to be congruent. Therefore, a reciprocal AU test for partition homogeneity was conducted in CONSEL 0.2, using RT, YR and combined datasets with identical YRE representation. Since the results indicated incongruence between the partitions (see the original publication), and since preliminary analysis revealed better SH-like support values in the tree reconstructed from the RT dataset, the RT domain was chosen for the phylogenetic reconstruction of YRE relationships.
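The window-extension step described above (each YR match extended by 10 kb in each direction or to the contig end, whichever comes first) can be illustrated with a minimal Python sketch. This is not the authors' code: the `Hit` class, the `extend_hit` function and the 0-based, end-exclusive coordinate convention are assumptions made only for illustration.

```python
from dataclasses import dataclass

FLANK_BP = 10_000  # 10 kb on each side, as stated in the protocol


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for a YR domain match on a contig.

    Coordinates are assumed to be 0-based and end-exclusive.
    """
    contig: str
    start: int
    end: int


def extend_hit(hit: Hit, contig_length: int, flank: int = FLANK_BP) -> Hit:
    """Extend a match by `flank` bp in each direction, clipped to the contig ends."""
    new_start = max(0, hit.start - flank)            # stop at the contig start
    new_end = min(contig_length, hit.end + flank)    # stop at the contig end
    return Hit(hit.contig, new_start, new_end)
```

For a match near the start of a contig, the upstream flank is truncated at position 0 while the downstream flank keeps its full 10 kb; the extended interval would then be the region searched for RT and MT domains and for repeats.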
Gypsy, Copia and BEL sequences from Repbase were added to the RT dataset prior to the analysis. The RT sequences were aligned with MAFFT 7 using default settings and then trimmed with trimAl 1.2 to remove positions with a gap proportion above 0.3. The tree was reconstructed using FastTree 2.1.7 with a gamma distribution of among-site rate variation and the JTT matrix of substitution rates. SH-like values were used as branch support, as they have been found to be highly correlated with bootstrap approaches and are rapidly calculated (see the original publication for the exact command line parameters used). […]
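The trimming criterion above (dropping alignment positions whose gap proportion exceeds 0.3) can be sketched in plain Python. This only illustrates the filter itself; it is not trimAl's implementation, and the function name `trim_gappy_columns`, the dictionary representation of the alignment and the use of "-" as the only gap character are assumptions.

```python
from typing import Dict


def trim_gappy_columns(alignment: Dict[str, str],
                       max_gap_fraction: float = 0.3) -> Dict[str, str]:
    """Remove alignment columns whose gap proportion exceeds `max_gap_fraction`.

    `alignment` maps sequence names to aligned sequences of equal length.
    Only "-" is counted as a gap (an assumption of this sketch).
    """
    if not alignment:
        return {}
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(seqs)
    # Keep a column when the fraction of gapped sequences is at most the threshold.
    keep = [
        i for i in range(length)
        if sum(s[i] == "-" for s in seqs) / n_seqs <= max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}
```

With four sequences, a column containing one gap (proportion 0.25) is kept, whereas a column containing three gaps (proportion 0.75) is removed.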

Pipeline specifications

Software tools: Unipro UGENE, CONSEL, MAFFT, trimAl, FastTree
Applications: Phylogenetics, Sanger sequencing
Organisms: Caenorhabditis elegans
Diseases: Nematode Infections
Chemicals: Tyrosine