EMU specifications


Unique identifier OMICS_09112
Name EMU
Alternative name Extractor of MUtations
Software type Package/Module
Interface Command line interface
Restrictions to use None
Operating system Unix/Linux
Computer skills Advanced
Stability Stable
Maintained Yes


The SNPcurator: literature mining of enriched SNP disease associations

PMCID: 5844215
PMID: 29688369
DOI: 10.1093/database/bay020

[…] also extracts mutations based on conditional random fields. Open Mutation Miner (OMM) uses MF to recognize single mutations and extends its regular expression set to detect mutation series. The extractor of mutations (EMUs) detects mutations in text and links them to genes, proteins and diseases. The SNP Extraction Tool for Human Variations (SETH) implements an Extended Backus–Naur For […]
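
The excerpt above mentions regular-expression-based mutation recognition. As a rough illustration of that general technique only, here is a minimal Python sketch that finds simple protein substitutions such as V600E or p.Gly12Asp. The pattern, the names `PROTEIN_SUB` and `find_protein_substitutions`, and the sample sentence are all hypothetical; this is not EMU's, OMM's or SETH's actual pattern set, and real tools cover many more mutation formats.

```python
import re

# Assumption: a deliberately small pattern for illustration, not taken from any tool.
# Three-letter and one-letter codes for the 20 standard amino acids.
AA3 = "Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val"
AA1 = "[ACDEFGHIKLMNPQRSTVWY]"

# Wild-type residue, 1-based position, mutant residue, with an optional "p." prefix.
PROTEIN_SUB = re.compile(
    rf"\b(?:p\.)?(?P<wt>{AA3}|{AA1})(?P<pos>[1-9]\d*)(?P<mt>{AA3}|{AA1})\b"
)


def find_protein_substitutions(text):
    """Return (wild_type, position, mutant) tuples for each match in text."""
    return [
        (m.group("wt"), int(m.group("pos")), m.group("mt"))
        for m in PROTEIN_SUB.finditer(text)
    ]


if __name__ == "__main__":
    sample = "The V600E mutation in BRAF and p.Gly12Asp in KRAS were reported."
    print(find_protein_substitutions(sample))
    # prints [('V', 600, 'E'), ('Gly', 12, 'Asp')]
```

A pattern like this only recognizes the mention itself. Linking each mutation to a gene, protein or disease, as the excerpt describes for EMU, is a separate step that this sketch does not attempt.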


Recent advances in predicting gene–disease associations

PMCID: 5414807
PMID: 28529714
DOI: 10.12688/f1000research.10788.1

[…] AMT, Amazon Mechanical Turk; API, application programming interface; EMU, Extraction of Mutation; GWAS, genome-wide association studies; HIT, human intelligence task; RVS, Reference Variant Store; SNP, single nucleotide polymorphism. […]


Text Mining Genotype Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine

PLoS Comput Biol
PMCID: 5130168
PMID: 27902695
DOI: 10.1371/journal.pcbi.1005017

[…] relationships from literature for the purpose of investigating the impact of intergenic (non-coding) variants. Of all the works on mutation relationship extraction, one of the most notable is the EMU tool developed by Doughty et al. EMU provides a semi-automated approach to extract disease-related mutations from PubMed abstracts and full text. This work, which truly addresses broad genotype […]


Hybrid curation of gene–mutation relations combining automated extraction and crowdsourcing

PMCID: 4170591
PMID: 25246425
DOI: 10.1093/database/bau094

[…] n in . The software combines multiple modules to support the following functions: automated information extraction, using the NCBI’s GenNorm and the University of Maryland Baltimore County’s (UMBC’s) EMU; linkage of multiple mentions of each unique gene and each mutation; display of each candidate gene and mutation pair, highlighted in context in the abstract; crowdsourced relation judgment (via A […]


Mutation extraction tools can be combined for robust recognition of genetic variants in the literature

PMCID: 4176422
PMID: 25285203
DOI: 10.5256/f1000research.3422.r3233

[…] corpora exist annotated with protein residue information either manually annotated, or prepared using automatic methods. There are corpora available that contain both protein and DNA mutations. The EMU corpus (http://bioinf.umbc.edu/EMU/ftp) was developed for annotation of mutations related to prostate cancer. This data set was developed by querying MEDLINE for the medical subject heading (MeS […]


Literature mining of genetic variants for curation: quantifying the importance of supplementary material

PMCID: 3920087
PMID: 24520105
DOI: 10.1093/database/bau003

[…] e HT articles (79% of referenced articles) contain <94% of the mutations in the COSMIC database (Mut Recall) and account for the high average number of mutations in COSMIC. In , we see the results of EMU over the COSMIC HT and COSMIC NHT subsets, respectively. The NHT group shows a much larger recall compared with the HT group, supporting the hypothesis that the HT articles pose a particular chall […]


University of Maryland, Baltimore County, Baltimore, MD; Division of Imaging and Applied Mathematics, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD; National Library of Medicine, Bethesda, MD, USA

