Protocol publication

[…] iminating power than calculating correlations between DDH and GGD. However, because the programs and settings that were optimal with respect to Kendall's tau were also optimal with respect to the error ratio, correlation can be used to select the best programs and parameters. The situation is more complicated for the GBDP distance function, for which the optimum under correlation is not necessarily the optimum under the error ratio. Therefore, a prediction of whether a DDH value would be at least 70% can be based on the GGD thresholds that result in the lowest error ratios.

Interestingly, the more sensitive programs or settings are not necessarily those with the highest correlation. For instance, MUMmer performs best with moderate minimum match lengths. Likewise, BLAT shows a higher correlation with the default tile size of 11 bp than with the supposedly more sensitive 8 bp. Moreover, WU-BLASTN, which usually yields larger sets of HSPs, including much shorter HSPs (personal observation), is outperformed by NCBI-BLASTN and works better when filtering is applied. These results may be caused by the loss of information inherent in the DDH approach itself, which would explain why a corresponding loss of information caused by less sensitive settings for GGD calculation increases the correlation. The conclusion that DDH is imprecise is supported by the comparison with the 16S rRNA data; otherwise it would be hard to explain why GGD (and ANI) show a significantly higher correlation with 16S rRNA distances than DDH does. Accordingly, there may be inherent difficulties in obtaining a perfect correspondence between GGD and DDH because of the imprecision of the latter. Thus, the high corre […]
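The two selection criteria discussed in this excerpt, rank correlation and error ratio, can be illustrated with a minimal Python sketch. Several things here are assumptions rather than the authors' procedure: the excerpt does not define the error ratio, so it is taken to be the fraction of genome pairs for which the rule "GGD at or below a threshold implies DDH of at least 70%" disagrees with the measured DDH; Kendall's tau is computed in its simple tau-a form, with tied pairs counted as neither concordant nor discordant; and the function names and toy values are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 is a tie and counts as neither (tau-a convention)
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def error_ratio(ddh, ggd, ggd_threshold, ddh_cutoff=70.0):
    """Assumed definition: share of pairs where the GGD rule misclassifies DDH."""
    errors = sum(
        (d >= ddh_cutoff) != (g <= ggd_threshold) for d, g in zip(ddh, ggd)
    )
    return errors / len(ddh)


def best_threshold(ddh, ggd, ddh_cutoff=70.0):
    """Scan the observed GGD values; return the one with the lowest error ratio.

    Ties are broken in favour of the smaller distance threshold.
    """
    candidates = sorted(set(ggd))
    return min(candidates, key=lambda t: (error_ratio(ddh, ggd, t, ddh_cutoff), t))


# Hypothetical toy data: DDH similarities in percent, GGD as distances.
ddh_toy = [95.0, 82.0, 71.0, 65.0, 40.0, 20.0]
ggd_toy = [0.01, 0.03, 0.06, 0.05, 0.15, 0.30]

if __name__ == "__main__":
    print("Kendall tau-a:", kendall_tau_a(ddh_toy, ggd_toy))
    t = best_threshold(ddh_toy, ggd_toy)
    print("best GGD threshold:", t, "error ratio:", error_ratio(ddh_toy, ggd_toy, t))
```

Because DDH is a similarity and GGD a distance, a strong agreement shows up as a tau close to -1 in this sketch. The toy data also show how the two criteria can diverge: a single pair that is ranked in the "wrong" order lowers the correlation slightly, while the error ratio only changes if that pair straddles the 70% cutoff.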

Pipeline specifications

Software tools: MUMmer, BLAT, BLASTN