TM-Aligner / Transmembrane Membrane proteins-Aligner

Assists users in aligning transmembrane proteins. TM-Aligner is a protein sequence alignment tool that provides instant alignment results. It can perform multiple sequence alignment (MSA) of an unlimited number of transmembrane proteins of any length. It lets users visualize the MSA in different color schemes and with a variety of options. It also provides an option to select and delete sequences from the final alignment.


MSAProbs-MPI

A state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. MSAProbs can achieve high alignment accuracy at the expense of relatively long runtimes for large-scale input datasets. MSAProbs-MPI is a distributed-memory parallel version of the multithreaded MSAProbs tool that is able to reduce runtimes by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on a cluster with 32 nodes (each containing two Intel Haswell processors) shows reductions in execution time of over one order of magnitude for typical input datasets. Furthermore, MSAProbs-MPI using eight nodes is faster than the GPU-accelerated QuickProbs running on a Tesla K20.

PRALINE / Profile ALIgNmEnt

A toolkit for multiple sequence alignment. PRALINE provides various alignment optimization strategies to address the different situations that call for protein multiple sequence alignment: global profile preprocessing, homology-extended alignment, secondary structure-guided alignment, and transmembrane-aware alignment. The software can also represent the aligned sequences in a dendrogram that shows their mutual relationships according to the alignment.

SMS / STING Millennium Suite

Provides a variety of algorithms and validated data, wrapped in a user-friendly web interface. STING Millennium Suite (SMS) is a web-based suite of programs and databases providing visualization and complex analysis of molecular sequence and structure for the data deposited at the Protein Data Bank (PDB). It brings together a number of protein analysis tools at a single web server. SMS enables a quick estimate of how strongly each amino acid is engaged within its own protein chain and, more importantly from a functional standpoint, in the mechanism of binding to a substrate and/or inhibitor.


PSI/TM-Coffee

Performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency-based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is that it allows databases of reduced complexity to be used for rapid homology extension. The server can also use transmembrane protein (TMP) reference databases for even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs a topological prediction of TMPs using the HMMTOP algorithm. The PSI/TM-Coffee web server is part of the T-Coffee web platform; access is free and unrestricted, with no login procedure.

PVS / Protein Variability Server

Provides absolute sequence variability estimates ‘per site’ in a multiple protein-sequence alignment (MSA). PVS returns the selected reference sequence with the variable positions masked, as well as the sequence fragments containing only non-variable residues. It aims to facilitate structure-function studies and de novo epitope discovery. This tool returns results that facilitate the design of vaccines driven by epitope discovery against pathogenic organisms.
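The idea of scoring variability per site and masking variable positions in a reference sequence can be sketched as follows. This is only an illustration, not PVS code: the choice of Shannon entropy as the variability measure, the threshold value, the mask character and all function names are assumptions made for the example.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def mask_variable_positions(msa, reference_index=0, threshold=0.5, mask_char="."):
    """Return per-column entropies and the reference with variable sites masked.

    `threshold` and `mask_char` are hypothetical defaults chosen for this sketch.
    """
    entropies = [column_entropy(col) for col in zip(*msa)]
    reference = msa[reference_index]
    masked = "".join(
        mask_char if h > threshold else r for r, h in zip(reference, entropies)
    )
    return entropies, masked


msa = ["MKTAY", "MKSAY", "MRTAY", "MKTGY"]
entropies, masked = mask_variable_positions(msa)
print(masked)  # M...Y
```

Runs of unmasked characters in the output correspond to the fragments that contain only non-variable residues.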

PASTASpark / Practical Alignments using SATé and TrAnsitivity

Improves the performance of the alignment phase of Practical Alignments using SATé and TrAnsitivity (PASTA). PASTASpark executes PASTA on a distributed-memory cluster using Apache Spark, a big data engine, which speeds up PASTA's alignment phase. It guarantees scalability and fault tolerance, and helps obtain multiple sequence alignments (MSAs) from very large datasets in reasonable time.

MP-T / Membrane Protein Threader

Performs sequence-to-structure alignment for the homology modelling of membrane proteins. The Membrane Protein Threader is a progressive multiple alignment method designed specifically for membrane protein sequences. The accuracy of this method is derived from its effective use of information about accessible surface area, membrane positioning and secondary structure. Incorporation of environment awareness into this aligner may yield even larger improvements in the quality of membrane protein models.

MIToS.jl / Mutual Information Tools for protein Sequence analysis in the Julia language

An environment for mutual information analysis. MIToS is also a framework for managing protein multiple sequence alignments (MSAs) and protein structures in the Julia language. It integrates sequence and structural information through SIFTS, making the analysis of Pfam MSAs straightforward. MIToS streamlines the implementation of any measure calculated from residue contingency tables, and its optimization and testing in terms of protein contact prediction.
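As an illustration of a measure computed from a residue contingency table, the mutual information between two alignment columns can be sketched as below. The sketch is written in Python for brevity and does not use or mirror the MIToS API; the function name and the gap handling (pairs with a gap are skipped) are assumptions.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Sequence pairs with a gap in either column are skipped (an assumption
    of this sketch; other gap treatments are possible).
    """
    pairs = [(a, b) for a, b in zip(col_a, col_b) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                 # contingency table of residue pairs
    marg_a = Counter(a for a, _ in pairs)  # marginal counts, first column
    marg_b = Counter(b for _, b in pairs)  # marginal counts, second column
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) rewritten with raw counts
        mi += p_ab * math.log2(count * n / (marg_a[a] * marg_b[b]))
    return mi


print(mutual_information("AAGG", "CCTT"))  # 1.0 (columns covary perfectly)
print(mutual_information("AAGG", "CTCT"))  # 0.0 (columns are independent)
```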

KMAD / Knowledge based Multiple sequence Alignment for intrinsically Disordered proteins

Aligns and annotates intrinsically disordered proteins (IDPs). KMAD augments the substitution matrix with knowledge about posttranslational modifications, functional domains, and short linear motifs. MSAs produced with KMAD describe well conserved features among IDPs, tend to agree well with biological intuition, and are a good basis for designing new experiments to shed light on this large, understudied class of proteins.

MAPGAPS / Multiply-Aligned Profiles for Global Alignment of Protein Sequences

Uses a multiple-profile alignment to ‘map the gaps’ (i.e. the insertions and deletions, both large and small) between distantly related proteins. The multiple-profile alignment serves both as a query for detecting and classifying related sequences and as a template for globally aligning the sequences to each other. Creating and maintaining multiple-profile alignments and searching with them in this way has several advantages. In particular, this facilitates rapid detection and accurate alignment of up to a million or more related protein sequences, yet is equally useful and accurate for alignment of small sequence sets.

PsychoProt / Physical CHemistry Of Protein variability

A web app and a package for studying protein modeling, evolution and design. PsychoProt helps to (i) unveil protein structure-function relationships from experiments and from alignments that are consistent with structures according to coevolution analysis, (ii) recall global information about structural and functional features and identify hitherto unknown constraints to variation in alignments, and (iii) point at different sources of variation among related experimental datasets or between experimental and alignment-based data.


GlycoSiteAlign

Aligns amino acid sequences of variable length surrounding glycosylation sites depending on the knowledge of glycan structure. GlycoSiteAlign is an exploratory resource intended for the identification of characteristic amino acid patterns of unique glycan-protein interactions. It uses data from the UniCarbKB and UniProtKB databases and it is hosted on ExPASy, the Swiss Institute of Bioinformatics resource portal. The tool can recognize amino acid patterns and/or residues usually “diluted” or masked in alignments that take into consideration only the glycosylation type.


SigniSite

An online application for subgroup-free residue-level genotype-phenotype correlation. In contrast to similar methods, SigniSite does not require any pre-definition of subgroups or binary classification. Input is a set of protein sequences where each sequence has an associated real number, quantifying a given phenotype. SigniSite will then identify which amino acid residues are significantly associated with the data set phenotype. As output, SigniSite displays a sequence logo, depicting the strength of the phenotype association of each residue and a heat-map identifying 'hot' or 'cold' regions.

HANDL / Homology Assessment across Networks using Diffusion and Landmarks

Furnishes a method for incorporating proteins from different species into a shared vector space. HANDL is a standalone software based on a diffusion kernel algorithm. It aims to facilitate the detection of functional similarity across species and provides an alternative to standard sequence homology. The application can be used in conjunction with other kernels to link proteins or for evaluating network properties.


SIBIS

The prediction of protein coding genes is a major challenge that depends on the quality of genome sequencing, the accuracy of the model used to elucidate the exonic structure of the genes, and the complexity of the gene splicing process leading to different protein variants. As a consequence, today's protein databases contain a huge amount of inconsistency, due to both natural variants and sequence prediction errors. SIBIS is designed to detect such inconsistencies based on the evolutionary information in multiple sequence alignments. A Bayesian framework, combined with Dirichlet mixture models, is used to estimate the probability of observing specific amino acids and to detect inconsistent or erroneous sequence segments.

aSVARAP / amino acid Sequence VARiability Analysis Program

Analyses the variability along multiple amino acid sequence alignments. aSVARAP is derived from SVARAP (Sequence VARiability Analysis Program) and is dedicated to amino acid sequences. It combines several advantages: (i) easy handling and interpretation of results, which means quick training of new users; (ii) brief hands-on work (<15 min); (iii) visual interpretation of results that are plotted in graphical windows; (iv) quantification of variability, which allows statistical analysis; (v) versatility, with various targets, such as bacterial or viral genomes, and various purposes, mainly primer or probe design for polymerase chain reaction (PCR) assays or study of natural and drug-selected polymorphisms.


MSAIndelFR

An efficient algorithm for multiple protein sequence alignment incorporating a new variable gap penalty function. The algorithm incorporates the information on the predicted locations of indel flanking regions (IndelFRs) and the computed average log-loss values obtained from IndelFR predictors, each of which is designed for a different protein fold. A new variable gap penalty function has been proposed to make the gap placement more accurate in the protein alignment, wherein the gap opening penalty is position-specific and the gap extension penalty is region-specific. It is shown that the performance of MSAIndelFR is superior to that of the most widely used alignment algorithms, Clustal W2, Clustal Omega, Kalign2, MSAProbs, MAFFT, MUSCLE, ProbCons and Probalign, in terms of both the sum-of-pairs and total column metrics.
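The sum-of-pairs (SP) and total column (TC) metrics mentioned above compare a test alignment against a reference alignment of the same sequences. A minimal sketch is given below, assuming the common definitions: SP is the fraction of residue pairs aligned in the reference that are also aligned in the test alignment, and TC is the fraction of reference columns reproduced exactly. Benchmark suites differ in details (for example, scoring only core columns), so the function names and conventions here are assumptions.

```python
def column_indices(msa):
    """For each column, a tuple of each sequence's residue index (None at a gap)."""
    counters = [0] * len(msa)
    columns = []
    for col in zip(*msa):
        entry = []
        for s, ch in enumerate(col):
            if ch == "-":
                entry.append(None)
            else:
                entry.append(counters[s])
                counters[s] += 1
        columns.append(tuple(entry))
    return columns


def aligned_pairs(columns):
    """Set of residue pairs ((seq, index), (seq, index)) placed in the same column."""
    pairs = set()
    for col in columns:
        present = [(s, i) for s, i in enumerate(col) if i is not None]
        for x in range(len(present)):
            for y in range(x + 1, len(present)):
                pairs.add((present[x], present[y]))
    return pairs


def sp_and_tc(test_msa, reference_msa):
    """Return (SP, TC) of test_msa measured against reference_msa."""
    ref_cols = column_indices(reference_msa)
    test_cols = column_indices(test_msa)
    ref_pairs = aligned_pairs(ref_cols)
    test_pairs = aligned_pairs(test_cols)
    sp = len(ref_pairs & test_pairs) / len(ref_pairs) if ref_pairs else 0.0
    # Only reference columns that align at least two residues are scored for TC.
    scored = [c for c in ref_cols if sum(i is not None for i in c) >= 2]
    test_set = set(test_cols)
    tc = sum(c in test_set for c in scored) / len(scored) if scored else 0.0
    return sp, tc


reference = ["AC-GT", "ACTGT"]
test = ["ACG-T", "ACTGT"]
print(sp_and_tc(test, reference))  # (0.75, 0.75)
```

With only two sequences the two scores coincide; with more sequences TC is the stricter of the two, because a single misplaced residue invalidates the whole column.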

Facet / feature-based accuracy estimator

Outperforms the best prior approaches to assessing alignment quality. Facet is a package that computes a single estimate of accuracy as a linear combination of efficiently computable feature functions. It yields a parameter advisor that, on the hardest benchmarks, provides more than a 27% improvement in accuracy over the best default parameter choice. The Facet distribution includes the accuracy estimator written in Java as well as a driver script, a wrapper for the PSIPRED secondary structure predictor, and scripts for using Facet for aligner advising.
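The general shape of an accuracy estimator built as a linear combination of feature functions, and of an advisor that picks the candidate alignment with the highest estimate, can be sketched as follows. The two feature functions and their weights are hypothetical placeholders chosen for the example; they are not Facet's features, and in practice the weights would be fitted on benchmark alignments rather than set by hand.

```python
from collections import Counter


def gap_free_fraction(msa):
    """Fraction of columns that contain no gap (hypothetical feature)."""
    cols = list(zip(*msa))
    return sum("-" not in c for c in cols) / len(cols)


def mean_column_identity(msa):
    """Average share of rows holding the most common residue (hypothetical feature)."""
    cols = list(zip(*msa))
    total = 0.0
    for c in cols:
        residues = [x for x in c if x != "-"]
        if residues:
            total += Counter(residues).most_common(1)[0][1] / len(c)
    return total / len(cols)


# (feature function, weight) pairs; the weights are made up for illustration.
FEATURES = [(gap_free_fraction, 0.4), (mean_column_identity, 0.6)]


def estimate_accuracy(msa):
    """Single accuracy estimate as a weighted sum of feature values."""
    return sum(weight * feature(msa) for feature, weight in FEATURES)


def advise(candidates):
    """Return the name of the candidate alignment with the highest estimate."""
    return max(candidates, key=lambda name: estimate_accuracy(candidates[name]))


# Two alignments of the same pair of sequences under different parameter choices.
candidates = {
    "default": ["ACGT-", "A-GTT"],
    "tuned": ["ACGT", "AGTT"],
}
print(advise(candidates))  # tuned
```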

BaCoCa / BAse COmposition CAlculator

Identifies biases in aligned sequence data which potentially mislead phylogenetic reconstructions. BaCoCa allows a parallel determination of a suite of different statistical properties of alignments for complete concatenated amino acid and nucleotide data sets as well as for user-defined gene partitions and taxon subsets in one single process run prior to any tree reconstruction. Its results can be easily used for further analyses in programs like Excel or statistical packages like R.
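A compositional check of this kind, run per taxon over a user-defined partition, might look like the sketch below. The deviation measure (summed absolute difference from the mean composition) is a simple stand-in chosen for the example and is not claimed to be one of BaCoCa's statistics; the function names and the `(start, end)` partition format are likewise assumptions.

```python
from collections import Counter


def composition(seq, alphabet):
    """Relative frequency of each alphabet character, ignoring gaps and ambiguity codes."""
    residues = [c for c in seq if c in alphabet]
    counts = Counter(residues)
    n = len(residues)
    return {a: (counts[a] / n if n else 0.0) for a in alphabet}


def composition_deviation(alignment, partition, alphabet="ACGT"):
    """Per-taxon composition within a partition and its deviation from the mean.

    `alignment` maps taxon name to aligned sequence; `partition` is a
    (start, end) column range, end exclusive.
    """
    start, end = partition
    per_taxon = {
        name: composition(seq[start:end], alphabet)
        for name, seq in alignment.items()
    }
    mean = {
        a: sum(c[a] for c in per_taxon.values()) / len(per_taxon)
        for a in alphabet
    }
    deviation = {
        name: sum(abs(c[a] - mean[a]) for a in alphabet)
        for name, c in per_taxon.items()
    }
    return per_taxon, deviation


alignment = {"t1": "AACC", "t2": "AACC", "t3": "GGGG"}
per_taxon, deviation = composition_deviation(alignment, (0, 4))
# t3 has the largest deviation: its composition differs most from the others.
print(max(deviation, key=deviation.get))  # t3
```

The resulting dictionaries can be written out as tab-separated rows for further analysis in a spreadsheet or in R.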

GaussDCA / Gaussian Direct Coupling Analysis

A multivariate Gaussian modeling approach as a variant of direct-coupling analysis. With GaussDCA, the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to the one achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis.


QuickProbs

A fast multiple sequence alignment algorithm. QuickProbs is a variant of MSAProbs customised for graphics processors. QuickProbs uses new, graphics processor-specific, intra-task parallel algorithms for both posterior matrix calculation and consistency transformation. Additionally, we parallelised on the CPU the alignment construction and refinement stage, which in MSAProbs is performed serially. As a result, our package is several times faster than MSAProbs. This allows users to process larger datasets in a reasonable time without sacrificing alignment quality. The GPU part of QuickProbs was implemented in OpenCL, so the package is suitable for graphics processors produced by all major vendors.

Seqmol / Sequences and molecules

Identifies stable or weak complexes by calculating Kd for formation of interfaces. SEQMOL is a Protein Data Bank (PDB) structure analysis suite. It can be used to align multiple protein and DNA sequences, to compute evolutionary attributes of multiple sequence alignments (such as sequence conservation, hydrophobicity conservation, conformational flexibility conservation, physical covariation, protein-protein interface, protein-RNA interface and protein-DNA interface propensity, and conservations thereof), and to map these features onto PDB files.

PicXAA / ProbabilistIC maXimum Accuracy Alignment

Finds the multiple sequence alignment (MSA) with the maximum expected accuracy (MEA). PicXAA is a probabilistic non-progressive alignment algorithm which takes a greedy approach to probabilistically build up the MSA, by starting from confidently alignable regions (with high similarities) and proceeding toward less confident regions (with lower similarities). By building up the MSA from the confidently alignable regions, the software reduces the risk of propagating the alignment errors made at the early stage to the final alignment.