BioThesaurus specifications


Unique identifier: OMICS_26486
Name: BioThesaurus
Restrictions to use: None
Community driven: No
Data access: File download, Browse
User data submission: Not allowed
Version: 7.0
Maintained: Yes


  • Hongfang Liu

Publication for BioThesaurus

BioThesaurus citations


A method for integrating and ranking the evidence for biochemical pathways by mining reactions from text

PMCID: 3694679
PMID: 23813008
DOI: 10.1093/bioinformatics/btt227

[…] A+ indexes concepts from the abstracts including genes, proteins, diseases, symptoms, drugs, enzymes and simple chemical compounds, identified using biological databases and thesauri such as UniProt, BioThesaurus, Unified Medical Language System (UMLS), KEGG and DrugBank. The user interacts with the system by issuing queries in the form of a word, a concept identifier or any Boolean combination of […]


Novel semantic similarity measure improves an integrative approach to predicting gene functional associations

BMC Syst Biol
PMCID: 3663825
PMID: 23497449
DOI: 10.1186/1752-0509-7-22

[…] d drugs), and presents them in a tabular format ranked based on co-occurrence statistics. The concept IDs and their names and synonyms are collected from several biomedical databases such as UniProt, BioThesaurus, UMLS, and DrugBank. We queried FACTA with all human protein-coding genes and stored for each query gene the set of related gene IDs, UMLS and DrugBank terms. GoPubMed, developed by the T […]


The BioLexicon: a large scale terminological resource for biomedical text mining

BMC Bioinformatics
PMCID: 3228855
PMID: 21992002
DOI: 10.1186/1471-2105-12-397

[…] tatistical machine learning method. The dictionary consists of 266,000 entries for general English words extracted from WordNet, together with 1.3 million entries for protein names, extracted from BioThesaurus. The tool is available at […] mining techniques are typically evaluated against 'gold standards'. We evaluated the NER tool […]


Moara: a Java library for extracting and normalizing gene and protein mentions

BMC Bioinformatics
PMCID: 2851609
PMID: 20346105
DOI: 10.1186/1471-2105-11-157

[…] ext and the synonyms in the dictionaries. It is flexible because the mention and the synonyms are previously pre-processed by dividing the token according to punctuations, numbers, Greek letters, and BioThesaurus terms, and finally ordering the parts of the token alphabetically. The initial lists of synonyms for the four organisms were available in the two editions of the BioCreative challenge: Bi […]
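
The pre-processing described in this excerpt (splitting a mention at punctuation, numbers, Greek letters and known terms, then ordering the parts alphabetically) can be illustrated with a minimal Python sketch. It is not Moara's actual Java implementation: the function names, the lowercasing step, and the tiny `GREEK_LETTERS` and `KNOWN_TERMS` lists (a stand-in for BioThesaurus terms) are assumptions made for illustration only.

```python
import re

# Hypothetical stand-in vocabularies; the real resources are far larger.
GREEK_LETTERS = {"alpha", "beta", "gamma", "delta", "kappa"}
KNOWN_TERMS = {"protein", "receptor", "kinase"}  # stand-in for BioThesaurus terms

# Longest entries first so longer vocabulary items win over shorter ones.
_VOCAB_RE = re.compile(
    "(" + "|".join(sorted(GREEK_LETTERS | KNOWN_TERMS, key=len, reverse=True)) + ")"
)


def split_on_vocab(piece: str) -> list[str]:
    """Split a run of letters around any embedded vocabulary entry (naive substring match)."""
    return [part for part in _VOCAB_RE.split(piece) if part]


def normalize_mention(mention: str) -> str:
    """Return an order-independent key for a gene or protein mention."""
    text = mention.lower()  # assumption: case is ignored
    # Split at punctuation and whitespace (anything that is not a-z or 0-9).
    # Only spelled-out Greek letters are handled; non-ASCII symbols are dropped.
    chunks = [chunk for chunk in re.split(r"[^a-z0-9]+", text) if chunk]
    parts: list[str] = []
    for chunk in chunks:
        # Separate digit runs from letter runs, then split on vocabulary entries.
        for piece in re.findall(r"[a-z]+|[0-9]+", chunk):
            parts.extend(split_on_vocab(piece))
    # Ordering the parts alphabetically makes the key independent of word order.
    return " ".join(sorted(parts))


if __name__ == "__main__":
    print(normalize_mention("TNF-alpha 2"))
    print(normalize_mention("alpha2TNF"))
```

Under these assumptions, a mention and a dictionary synonym that differ only in punctuation, spacing, or the order of their parts map to the same key, which is what makes this style of matching flexible.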


Incorporating rich background knowledge for gene named entity classification and recognition

BMC Bioinformatics
PMCID: 2725142
PMID: 19615051
DOI: 10.1186/1471-2105-10-223

[…] The dictionary was built from two resources: BioThesaurus 2.0 and ABGene lexicon. To improve the dictionary coverage, we converted all the dictionary entries to lowercase, removed hyphens and tokenized the terms from non-letter and non-di […]


Integrating protein-protein interactions and text mining for protein function prediction

BMC Bioinformatics
PMCID: 2500093
PMID: 18673526
DOI: 10.1186/1471-2105-9-S8-S2

[…] e, we enriched this retrieval by additional references that are contained in the BioLexicon (work in progress). The BioLexicon is a new data resource that combines the protein term repository called "BioThesaurus" with other terminological resources (e.g., NCBI taxonomy, ChEBI) and adds linguistically relevant information. We used terms from the BioLexicon to retrieve additional references to M […]


BioThesaurus institution(s)
Department of Information Systems, University of Maryland at Baltimore County, MD, USA; Department of Biochemistry and Molecular Biology, Georgetown University Medical Center, Washington, DC, USA
BioThesaurus funding source(s)
Supported by grant IIS-0430743 from the National Science Foundation and in part by grant U01-HG02712 from the National Institutes of Health (for UniProt) and grant DBI-0138188 from the National Science Foundation (for iProClass).
