Bio-entity identification and normalization tools | Information extraction

Detecting mentions of bio-entities of relevance for curation, e.g. genes, proteins or small molecules, linked to unique database identifiers, such as those in UniProt, EntrezGene or ChEBI.
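
As a rough illustration of what identification and normalization mean in practice, the sketch below tags mentions with a toy dictionary and links each one to a database identifier. It is a minimal example under stated assumptions, not the method of any tool listed here: the `LEXICON` table and the `tag_entities` function are hypothetical, and the three identifiers are examples that should be checked against UniProt, EntrezGene and ChEBI before reuse.

```python
import re

# Toy lexicon (hypothetical): lowercased surface form -> (entity type, identifier).
# Real systems use far larger dictionaries plus disambiguation.
LEXICON = {
    "tp53": ("gene", "EntrezGene:7157"),
    "p53": ("protein", "UniProt:P04637"),
    "aspirin": ("chemical", "CHEBI:15365"),
}


def tag_entities(text):
    """Return (start, end, mention, type, identifier) for each lexicon hit."""
    # Try longer names first so they win over shorter names they contain.
    names = sorted(LEXICON, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for match in pattern.finditer(text):
        entity_type, identifier = LEXICON[match.group(1).lower()]
        hits.append(
            (match.start(), match.end(), match.group(1), entity_type, identifier)
        )
    return hits


if __name__ == "__main__":
    sentence = "Aspirin modulates p53 activity; TP53 is often mutated."
    for hit in tag_entities(sentence):
        print(hit)
```

Exact dictionary matching like this misses spelling variants and cannot resolve ambiguous names, which is why many of the tools below rely on machine-learned taggers or soft matching instead.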

Bio-entity identification steps

SBT
Desktop

SBT Sequence Bloom Tree

Indexes a set of short-read sequencing experiments so that they can be queried quickly for a given sequence. The SBT data structure facilitates searching short-read expression experiments for…

iHOP
Web

iHOP

A network of concurring genes and proteins extends through the scientific literature touching on phenotypes, pathologies and gene function. The iHOP system shows that distant medical and biological…

CoPub
Web

CoPub

A text mining tool that detects co-occurring biomedical concepts in abstracts from the MEDLINE literature database. CoPub allows batch input of multiple human, mouse or rat genes and produces lists of…

GoPubMed
Web

GoPubMed

Allows users to explore PubMed search results with the Gene Ontology (GO), a hierarchically structured vocabulary for molecular biology. GoPubMed provides the following benefits: first, it gives an…

NLProt
Web
Desktop

NLProt

Extracts protein names from natural-language text. NLProt is a tagging system based on Support Vector Machines (SVMs).

KODAMA
Desktop

KODAMA Knowledge Discovery by Accuracy Maximization

A learning algorithm for unsupervised feature extraction, specifically designed for analysing noisy and high-dimensional datasets. KODAMA consists of two main parts: (i) the first step involves…

CIIPro
Web
Desktop

CIIPro Chemical In vitro-In vivo Profiling

A package to link chemical features and in vitro biological data with targeted in vivo biological activity. The CIIpro portal can automatically extract in vitro biological data from public resources…

EBIMed
Web

EBIMed

A web application that combines Information Retrieval and Extraction from Medline. EBIMed finds Medline abstracts in the same way PubMed does. Then it goes a step beyond and analyses them to offer a…

MedPost
Desktop

MedPost

Provides a part-of-speech tagger trained on the MEDLINE corpus. MedPost accepts text for tagging in either native MEDLINE format or XML, both available as save options in PubMed. It is based on a…

IDP4+
Dataset

IDP4+

Offers an unbiased representation of mutation mention forms. IDP4+ is a corpus composed of three sub-corpora: IDP4, nala_known, and nala_discoveries.

FACTA+
Web

FACTA+ Finding Associated Concepts with Text Analysis

A real-time text-mining system for finding and visualizing indirect associations between biomedical concepts from MEDLINE abstracts. The system can be used as a text search engine like PubMed with…

SSBT
Desktop

SSBT Split Sequence Bloom Tree

An indexing scheme to support sequence-based querying of terabyte-scale collections of thousands of short-read sequencing experiments. SSBT is an improvement over the Sequence Bloom Tree (SBT) data…

The Glycan…
Web

The Glycan Miner Tool

Implements the alpha-closed frequent subtree method. The Glycan Miner Tool was able to extract a significant pattern from glycan array data. It was proved using a viral infection experiment on cells…

tmTools
Desktop

tmTools

A text-mining software tool that integrates several state-of-the-art entity tagging systems (DNorm, GNormPlus, SR4GN, tmChem, and tmVar) and offers a batch-processing mode able to process arbitrary…

Cell line…
Desktop

Cell line recognition

Cell line recognition and normalization system, supporting corpora and tagged documents. The aim is to create corpora that are suitable for training and evaluating machine learning systems to…

PWTEES
Desktop

PWTEES PathWay Turku Event Extraction System

Extracts pathway interactions from the literature utilizing an existing event extraction tool and pathway named entity recognition (PathNER). PWTEES can be used to enrich the molecular context of…

BioIE
Web

BioIE

Allows different types of sentence extraction. BioIE employs predefined categories of interest relating to proteins and custom extraction around different entities and concepts, together with…

METIS
Web

METIS Multiple Extraction Techniques for Informative Sentences

Builds protein reports from related entries in Swiss-Prot. METIS employs data in the Swiss-Prot entries to find relevant literature, or to find search terms with which to seek this out. It reduces…

EXTRACT
Web

EXTRACT

Helps to identify environment descriptors, organisms, tissues and diseases mentioned in text and to annotate these using ontology/taxonomy terms. EXTRACT consists of a server that performs the Named…

LeadMine
Desktop

LeadMine

Describes systematic chemical nomenclature. LeadMine is used for the identification and annotation of chemicals, protein targets, genes, diseases, species, named reactions, company names, cell lines.…

BANNER-CHEMDNER
Desktop

BANNER-CHEMDNER

Exploits unlabeled data for incorporating domain knowledge into a named entity recognition model. BANNER-CHEMDNER includes natural language processing (NLP) tasks for text preprocessing, learning…

RLIMS-P
Web

RLIMS-P Rule-based Literature Mining System for protein Phosphorylation

A rule-based information extraction system. RLIMS-P is an online text-mining tool that provides an interface to identify articles relevant to protein phosphorylation, and presents information on…

medtextmining
Desktop

medtextmining

Identifies non-elliptical entity mentions in a coordinated noun phrase (NP) with ellipses. medtextmining proposes both intuitive graph-like and formal algebraic representation of a coordinated NP…

miRLiN
Web

miRLiN miRNA Literature Network

A semantic indexing method to extract relationships between terms and miRNAs directly from the biomedical literature. miRLiN provides access to a latent semantic indexing model, which contains the…

DrugQuest
Web

DrugQuest

A text mining tool to find new associations between drugs. DrugQuest clusters DrugBank records based on their textual information in a multidimensional vector space. We mainly apply partitional…

Anni
Web

Anni

An online tool to aid the biomedical researcher with a broad range of information needs. Anni provides an ontology-based interface to MEDLINE and retrieves documents and associations for several…

E3Miner
Web

E3Miner

A web-based text mining tool that extracts and incorporates comprehensive knowledge about E3s with their underlying mechanisms. E3Miner integrates available E3 data not only from the published…

BWS
Web

BWS BIOSMILE Web Search

A web-based NCBI-PubMed search application, which can analyze articles for selected biomedical verbs and give users relational information, such as subject, object, location, manner, time, etc. After…

biomsef
Desktop

biomsef BIOMedical Search Engine Framework

An open-source framework for the fast and lightweight development of domain-specific search engines. biomsef integrates taggers for major biomedical concepts, such as diseases, drugs, genes,…

BioTextQuest+
Web

BioTextQuest+

A web-based interactive knowledge exploration platform with significant advances to its predecessor (BioTextQuest), aiming to bridge processes such as bioentity recognition, functional annotation,…

NERsuite
Desktop

NERsuite Named Entity Recognition Suite

Simplifies research experiments. NERsuite uses various combinations of different NLP applications such as tokenizer, POS-tagger, lemmatizer and chunker to proceed. It contains three sub-functions:…

relna
Desktop

relna

Text mining tool for extracting protein-DNA and protein-RNA interaction relations. relna expands the NLPBA corpus with protein-to-RNA relations and protein-to-DNA elements. It creates method to given a…

OSCAR
Desktop

OSCAR Open-Source Chemistry Analysis Routines

Software for the recognition of named entities and data in chemistry publications. OSCAR4 can be used to identify chemical names, reaction names, ontology terms, enzymes and chemical prefixes and…

tmChem
Desktop

tmChem

An open-source software tool for identifying chemical names in biomedical literature, including chemical identifiers, drug brand and trade names and also systematic formats. tmChem uses conditional…

BANNER
Desktop

BANNER

A named entity recognition system intended primarily for biomedical text. BANNER uses conditional random fields as the primary recognition engine and includes a wide survey of the best techniques…

ChemSpot
Desktop

ChemSpot

A hybrid system for extracting chemical entities from natural language texts. ChemSpot is based on a conditional random field trained for identifying International Union of Pure and Applied Chemistry…

AIIA-GMT
Desktop

AIIA-GMT AIIA Gene Mention Tagger

An XML-RPC client for a web-service server that recognizes named entities in biomedical articles.

MyMiner
Web

MyMiner

A free and user-friendly text annotation tool aimed to assist in carrying out the main biocuration tasks and to provide labelled data for the development of text mining systems. MyMiner allows easy…

eFIP
Web

eFIP extracting Functional Impact of Phosphorylation

A tool to support article selection and information extraction of functional impact of phosphorylated proteins. The current version focuses on protein-protein interactions (PPIs) as functional…

BioInfer
Desktop

BioInfer Bio Information Extraction Resource

Provides the key types of annotation for a single set of sentences, expressing complex relationships between both physical and abstract entities. BioInfer is a public resource providing an annotated…

OpenDMAP
Desktop

OpenDMAP Open Source Direct Memory Access Parser

Advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. OpenDMAP is an ontology-driven, integrated concept…

eGIFT
Web

eGIFT extracting Gene Information From Text

Identifies terms and documents that are relevant to a gene and its products. Additional functionalities of eGIFT include finding terms in documents for a group of genes, finding genes sharing a…

PathNER
Desktop

PathNER Pathway Named Entity Recognition

A tool for the systematic detection of pathway mentions in the literature. PathNER is based on soft dictionary matching and rules, with the dictionary generated from public pathway databases. The…

MINOTAUR
Web

MINOTAUR MINing Online Text-A User-friendly Resource

Allows extraction of information from UniProtKB and published literature, or from users' own uploaded text. MINOTAUR aims to assist users who want to search specific types of information from…

Stanford NER
Desktop

Stanford NER Stanford Named Entity Recognizer

Labels sequences of words in a text which are the names of things, such as person and company names, or gene and protein names. Stanford NER is based on a Monte Carlo method used to perform…

GNSuite
Web

GNSuite

Presents pre-processed input from the underlying parsing, protein recognition and DB identifier assignment systems. Eighteen thousand full text articles are indexed by GNSuite, and more than eighteen…

GENIA Tagger
Desktop

GENIA Tagger

Analyzes English sentences and outputs the base forms, part-of-speech tags, chunk tags, and named entity tags. The tagger is specifically tuned for biomedical text such as MEDLINE abstracts. If you…

BioLabeler
Web

BioLabeler

Extracts UMLS concepts from biomedical texts such as scientific paper abstracts, experiment descriptions or medical notes and can be used to automatically curate and annotate BioMedical Literature…

Coremine…
Web

Coremine Medical

A product of the PubGene Company designed to be used by anyone seeking information on health, medicine and biology. It is ideal for those seeking an overview of a complex subject while allowing the…

tmBioC
Dataset

tmBioC text-mining BioC

Provides a BioC version of a suite. tmBioC is an annotated full-text corpus in BioC, a format detection and a conversion tool.

CoINNER
Desktop

CoINNER Co-occurrence Interaction Nexus with Named Entity Recognition

A text mining platform to distill the entity co-occurrence information from literature and to measure the relationships between entities using a networking approach. CoINNER allows users to identify…

CheNER
Desktop

CheNER

Presents a valid alternative for automated annotation of chemical entities in biomedical documents. The individual performance of CheNER could be further improved by expanding the dictionaries of…

Biblio-MetReS
Desktop

Biblio-MetReS

Analyzes scientific documents to find the interactions between genes/proteins in order to reconstruct molecular networks. Biblio-MetReS relies on a central database with the genomes and gene…

FABLE
Web

FABLE

A model for tagging gene and protein mentions from text using the probabilistic sequence tagging framework of conditional random fields (CRFs). FABLE can identify gene and protein mentions with…

ACCCA
Web

ACCCA A Combined Clinical Concept Annotator

A UIMA-based combination system for clinical record concept annotation. ACCCA can annotate clinical concepts with 3 concept types: problem, treatment, and test. Currently it combines 6 systems (ABNER…

AkaneRE
Desktop

AkaneRE

A protein-protein interaction (PPI) extraction tool for BioNLP researchers. The AkaneRE system has three parts: (i) a core engine for relation extraction (RE), (ii) a pool of modules for specific…

CALBC…
Web

CALBC Evaluation

Enables access to the different gold standard corpora (GSCs) in a standardized format (IeXML). Upon submission of the annotated corpus the user has to describe the specification of the used solution…
