Computational protocol: Hidden Markov Models for Evolution and Comparative Genomics Analysis

Protocol publication

[…] Protein sequences of Gram-negative bacteria were downloaded from GenBank release 175 (ftp://ftp.ncbi.nih.gov/genomes/Bacteria/). Orthologous protein groups were downloaded from the January 2010 release of the NCBI Protein Clusters database (ftp://ftp.ncbi.nih.gov/genomes/CLUSTERS/). The data set contained 717,455 proteins from 593 genomes. Multiple alignments were constructed with MUSCLE. Protein phylogenetic trees were created using the protdist and neighbor programs in the PHYLIP package. Signal peptide scores were calculated with SignalP 3.0-NN. In the evolutionary analysis, we considered the subset of orthologous clusters in which different discrete signal peptide predictions were present. […]
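The cluster selection described in the last sentence can be illustrated with a minimal Python sketch. It is not the authors' code: the record layout (cluster ID, protein ID, boolean signal peptide call), the function names, and the example identifiers are all hypothetical, and it assumes the SignalP scores have already been converted to discrete yes/no predictions by some cutoff that the excerpt does not state.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical record layout: (cluster_id, protein_id, has_signal_peptide).
Record = Tuple[str, str, bool]


def group_predictions(records: Iterable[Record]) -> Dict[str, Set[bool]]:
    """Collect the set of distinct discrete predictions seen in each cluster."""
    calls_by_cluster: Dict[str, Set[bool]] = defaultdict(set)
    for cluster_id, _protein_id, has_signal_peptide in records:
        calls_by_cluster[cluster_id].add(has_signal_peptide)
    return calls_by_cluster


def mixed_clusters(records: Iterable[Record]) -> List[str]:
    """Return the sorted IDs of clusters whose members disagree on the prediction."""
    calls_by_cluster = group_predictions(records)
    return sorted(cid for cid, calls in calls_by_cluster.items() if len(calls) > 1)


# Made-up example data; the identifiers do not refer to real database entries.
example_records: List[Record] = [
    ("CLS_A", "prot1", True),
    ("CLS_A", "prot2", True),
    ("CLS_B", "prot3", True),
    ("CLS_B", "prot4", False),
    ("CLS_C", "prot5", False),
]

if __name__ == "__main__":
    print(mixed_clusters(example_records))
```

Only CLS_B is kept in this example, because it is the only cluster containing both a positive and a negative prediction. Clusters whose members all agree carry no information about gain or loss of the signal peptide and are dropped.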

Pipeline specifications