Enables users to match a single chemical compound name, or a whole list of names, against reference databases, even when the notations differ. Matching is done solely on the name string, without identifying the exact chemical structure of the molecule described. Each input name is normalized to a unique form by a set of transformation rules. These rules include, among others, reordering of substituent descriptions in the name and replacement of synonymous name constituents (e.g. equivalent trivial names), as well as simpler rules dealing with different spellings, spaces, hyphens, etc. The resulting normalized term does not necessarily represent a valid or even systematic notation for a given compound; it is intended only for matching two names that were normalized by the same method.
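The normalize-then-match idea can be illustrated with a minimal Python sketch. This is not the tool's actual rule set, which the description does not spell out: the synonym table, the function names (normalize, build_index, match_names) and the token sorting used as a stand-in for substituent reordering are all assumptions made for illustration. The token sort in particular is far cruder than real reordering rules and can conflate names that differ only in where their locants sit.

```python
import re

# Hypothetical synonym table (assumption): maps spelling variants to one form.
SYNONYMS = {
    "sulphate": "sulfate",
    "sulphur": "sulfur",
    "aluminium": "aluminum",
}


def normalize(name: str) -> str:
    """Reduce a compound name to a matching key (not a valid chemical name)."""
    s = name.lower()
    # Replace synonymous name constituents with a single canonical spelling.
    for variant, canonical in SYNONYMS.items():
        s = s.replace(variant, canonical)
    # Split at spaces, hyphens and commas so these differences are ignored.
    tokens = [t for t in re.split(r"[\s\-,]+", s) if t]
    # Toy stand-in for reordering: sort tokens so their order does not matter.
    return "".join(sorted(tokens))


def build_index(reference_names):
    """Map each normalized key to the reference names that produce it."""
    index = {}
    for ref in reference_names:
        index.setdefault(normalize(ref), []).append(ref)
    return index


def match_names(names, index):
    """Look up each input name by its normalized key; no match gives []."""
    return {name: index.get(normalize(name), []) for name in names}


if __name__ == "__main__":
    index = build_index(["sodium sulfate", "acetic acid"])
    print(match_names(["Sodium-Sulphate", "benzene"], index))
```

Because both the reference names and the query names pass through the same normalize function, the key only has to be consistent, not chemically meaningful, which mirrors the point made in the description.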


A freely available compound identifier mapping service on the internet, designed to optimize the efficiency with which structure-based hyperlinks can be built and maintained between chemistry-based resources. In the past, creating and maintaining such links at EMBL-EBI, where several chemistry-based resources exist, required independent efforts by each of the separate teams. These efforts were complicated by the different data models, release schedules, and business rules for compound normalization and identifier nomenclature that exist across the organization. UniChem, a large-scale, non-redundant database of Standard InChIs with pointers between these structures and chemical identifiers from all the separate chemistry resources, was developed as a means of efficiently sharing the maintenance overhead of creating these links.
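The core idea, one non-redundant structure key with pointers to identifiers from each resource, can be sketched in a few lines of Python. This is an in-memory illustration only, not UniChem's schema or API: the class name CrossRefStore, its methods, the source names and the placeholder InChI strings are all hypothetical.

```python
from collections import defaultdict


class CrossRefStore:
    """Toy cross-reference store keyed on a structure string (assumption)."""

    def __init__(self):
        # structure key -> set of (source, identifier) pairs
        self._by_inchi = defaultdict(set)
        # (source, identifier) -> structure key
        self._by_id = {}

    def add(self, source, identifier, inchi):
        """Register an identifier; re-adding it moves it to the new structure."""
        key = (source, identifier)
        old = self._by_id.get(key)
        if old is not None:
            self._by_inchi[old].discard(key)
        self._by_inchi[inchi].add(key)
        self._by_id[key] = inchi

    def cross_references(self, source, identifier):
        """Return identifiers from other entries that share the same structure."""
        key = (source, identifier)
        inchi = self._by_id.get(key)
        if inchi is None:
            return []
        return sorted(k for k in self._by_inchi[inchi] if k != key)


if __name__ == "__main__":
    store = CrossRefStore()
    # Placeholder values, not real InChIs or real database identifiers.
    store.add("source_a", "A-001", "InChI=1S/PLACEHOLDER-1")
    store.add("source_b", "B-042", "InChI=1S/PLACEHOLDER-1")
    store.add("source_b", "B-043", "InChI=1S/PLACEHOLDER-2")
    print(store.cross_references("source_a", "A-001"))
```

Each resource only has to supply its own (identifier, structure) pairs; links between any two resources then fall out of the shared structure key rather than being maintained pairwise by each team.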