With advances in structural biology and the structural genomics initiatives, the structural repertoire in the Protein Data Bank (PDB) is growing rapidly. The total number of solved protein structures in the PDB is >80 000, double the number of entries in 2006. Nevertheless, the biological functions of many of these proteins remain largely unknown. Since proteins perform their biological functions by interacting with other molecules, establishing the interactions between proteins and ligand molecules is an important step toward understanding those functions. In particular, experimentally solved structures of ligand–protein complexes are often used as templates to infer ligand–protein docking and functional annotation for other, uncharacterized proteins. About one fourth of the entries deposited in the PDB represent proteins in complex with small molecules. The number of these ligands, referred to in the PDB as heterogeneous compounds, is currently 14 000. Binding specificity is achieved through a network of interactions between the protein and the ligand, which depends on the shape and physicochemical nature of the amino acids forming the binding pocket, as well as on the structural flexibility of both ligand and protein.
(Yang et al., 2013) BioLiP: a semi-manually curated database for biologically relevant ligand-protein interactions. Nucleic Acids Res.
(Gallina et al., 2013) PLI: a web-based tool for the comparison of protein-ligand interactions observed on PDB structures. Bioinformatics.