NCBI disease corpus statistics


NCBI disease corpus specifications


Unique identifier: OMICS_09128
Name: NCBI disease corpus
Restrictions to use: None
Maintained: Yes


  • Primates
    • Homo sapiens


  • Zhiyong Lu

Publication for NCBI disease corpus

NCBI disease corpus citations


Disease named entity recognition from biomedical literature using a novel convolutional neural network

BMC Med Genomics
PMCID: 5751782
PMID: 29297367
DOI: 10.1186/s12920-017-0316-8

[…] g cost about 1.5 h for the NCBI corpus and 2 h for the CDR corpus. We validated the effectiveness of MCNN by applying it to two corpora containing both mention-level and concept-level annotations: the NCBI Disease corpus [] and the BioCreative V Chemical Disease Relation task (CDR) corpus []. Overall statistics for each dataset are provided in Table . The NCBI Disease corpus consists of 793 Medline […]


A method for named entity normalization in biomedical articles: application to diseases and plants

BMC Bioinformatics
PMCID: 5640957
PMID: 29029598
DOI: 10.1186/s12859-017-1857-8

[…] For diseases, the NCBI disease corpus [] was used in the present study. This corpus consists of 793 PubMed abstracts, 6,892 disease mentions, and 790 unique disease concepts using disease terms in MEDIC []. Preannotati […]


Semantic annotation in biomedicine: the current landscape

J Biomed Semantics
PMCID: 5610427
PMID: 28938912
DOI: 10.1186/s13326-017-0153-x

[…] eported in []. The study included five contemporary annotators - Whatizit, MetaMap, Neji, Cocoa, and BANNER, which were compared on three manually annotated corpora of biomedical publications, namely NCBI Disease corpus, CRAFT, and AnEM (see Table ). Evaluation on the CRAFT corpus considered 6 different biomedical entity types (e.g. species, cell, cellular component, gene and proteins), while on t […]


A neural network multi task learning approach to biomedical named entity recognition

BMC Bioinformatics
PMCID: 5558737
PMID: 28810903
DOI: 10.1186/s12859-017-1776-8

[…] The NCBI disease corpus was introduced for disease name recognition and normalization and has been applied in numerous studies of this task []. For this corpus, we select as our benchmark a result from th […]


Mapping Phenotypic Information in Heterogeneous Textual Sources to a Domain Specific Terminological Resource

PLoS One
PMCID: 5028053
PMID: 27643689
DOI: 10.1371/journal.pone.0162287

[…] The NCBI disease corpus [] consists of 793 PubMed abstracts, annotated for 6,892 disease mentions. In contrast to the other corpora compared, normalisation does not involve mapping entity mentions to conc […]


Argo: enabling the development of bespoke workflows and services for disease annotation

Database (Oxford)
PMCID: 4869796
PMID: 27189607
DOI: 10.1093/database/baw066

[…] on this method, we proposed another approach which is based on the incorporation of further semantics. Firstly, two corpora, namely the official CDR corpus () provided by the track organisers and the NCBI Disease Corpus (), were used as sources of variants actually used in scientific literature which were added to our MeSH dictionary by cross-referencing provided gold standard identifiers. For exa […]



NCBI disease corpus institution(s)
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA; Department of Computer Science and Engineering, Arizona State University, USA
