Promotes the development of biomedical text mining applications. BioCreative works closely with biocurators to understand the various curation workflows, the Text Mining (TM) tools that are being used and their major needs. One of the aims of the BioCreative challenge is to determine the state of the art for a given task in biomedical text mining. This can be achieved if a considerable number of participants from a given community take part and the results of each system are evaluated by domain experts using well-defined evaluation metrics. To address the barriers in using TM in biocuration, BioCreative has been conducting user requirements analysis and user-based evaluations, and fostering standards development for TM tool re-use and integration.
An API for biomedical concept identification and a web-based tool that addresses these limitations. MEDLINE abstracts or free text can be annotated directly in the web interface, where identified concepts are enriched with links to reference databases. Using its customizable widget, it can also be used to augment external web pages with concept highlighting features. Furthermore, all text-processing and annotation features are made available through an HTTP REST API, allowing integration in any text-processing pipeline.
Permits annotation with ontology terms. OnASSIs computes semantic similarity measures between annotated samples, based on the structure of the ontology. It allows users to retrieve concepts from OBO ontologies in a given text with different options. This tool can also annotate Gene Expression Omnibus (GEO) metadata for stored experiments and samples.
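As a rough illustration of what a structure-based semantic similarity measure looks like, the sketch below scores two terms by the overlap of their ancestor sets in an is-a hierarchy (a Jaccard ratio). This is only one of many possible measures and is not claimed to be the one the tool uses; the toy ontology and the function names `ancestors` and `similarity` are hypothetical.

```python
# Toy is-a hierarchy: each term maps to its direct parents (hypothetical data).
TOY_ONTOLOGY = {
    "cell": [],
    "blood cell": ["cell"],
    "leukocyte": ["blood cell"],
    "erythrocyte": ["blood cell"],
    "neuron": ["cell"],
}


def ancestors(term, parents):
    """Return the term together with every term reachable through is-a links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def similarity(a, b, parents):
    """Jaccard overlap of the two ancestor sets: shared ancestors / all ancestors."""
    x, y = ancestors(a, parents), ancestors(b, parents)
    return len(x & y) / len(x | y)


if __name__ == "__main__":
    print(similarity("leukocyte", "erythrocyte", TOY_ONTOLOGY))
    print(similarity("leukocyte", "neuron", TOY_ONTOLOGY))
```

Terms that share a deeper common parent score higher than terms that only meet at the root, which is the intuition behind most structure-based measures.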
Identifies negation in textual medical records. NegEx uses a set of phrases that indicate negation, filters out sentences containing phrases that falsely appear to be negation phrases, and limits the scope of the negation phrases. It supports lexical representations in other languages. The tool can be used to scan the documents and charts of an outgoing patient to make sure that the doctors have not missed anything important. It was translated into Swedish, French, and German and compared on corpora from each language.
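The three steps described above (negation triggers, filtering of pseudo-negation phrases, and a limited scope) can be sketched in a few lines. This is a simplified illustration and not the NegEx implementation: the phrase lists are tiny, only single-word triggers are handled, and the forward window of five tokens (`SCOPE_WINDOW`) is an assumption chosen for the example.

```python
import re

# Illustrative lists only; a real negation lexicon is far larger.
NEGATION_TRIGGERS = {"no", "denies", "without", "not"}
PSEUDO_NEGATIONS = ["no increase", "no change", "not only"]
SCOPE_WINDOW = 5  # assumed number of tokens a trigger can reach forward


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_negated(sentence, concept):
    """Return True if the concept falls inside the scope of a negation trigger."""
    tokens = tokenize(sentence)
    concept_tokens = tokenize(concept)
    if not concept_tokens:
        return False
    n = len(concept_tokens)

    # Mark token positions covered by phrases that only look like negations.
    masked = set()
    for phrase in PSEUDO_NEGATIONS:
        words = phrase.split()
        for i in range(len(tokens) - len(words) + 1):
            if tokens[i:i + len(words)] == words:
                masked.update(range(i, i + len(words)))

    # A concept is negated if it starts within the window after a real trigger.
    for i, token in enumerate(tokens):
        if token in NEGATION_TRIGGERS and i not in masked:
            window = tokens[i + 1:i + 1 + SCOPE_WINDOW]
            for j in range(len(window) - n + 1):
                if window[j:j + n] == concept_tokens:
                    return True
    return False


if __name__ == "__main__":
    print(is_negated("The patient denies chest pain.", "chest pain"))
    print(is_negated("No increase in chest pain was noted.", "chest pain"))
```

Because the scope only extends forward from the trigger, a sentence such as "Patient has fever but no cough" leaves "fever" affirmed while "cough" is negated.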
Receives and edits batches of abstracts in standard North American Association of Central Cancer Registries (NAACCR) format into the central registry. Prep Plus is a program that can run in file-server or client-server mode and stores tracking information in a database. The software can handle abstracts created by any software system. It allows editing of abstracts and presentation of cases individually for correction, as well as generation of error reports and visual editing of cases.
Allows semantic disambiguation via approximate string matching. SimSem exploits a collection of strings such as dictionaries, with LibLinear as its machine-learning component and SimString for fast approximate string matching. It uses semantic category disambiguation (SCD) to assign the appropriate semantic category. This tool is applicable to manual annotation support tasks and can be used as a high-recall component in text processing pipelines.
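To make the idea of approximate dictionary lookup concrete, the sketch below compares strings by the cosine similarity of their character trigram sets and returns dictionary entries above a threshold, together with their semantic category. It is a pure-Python illustration of the general technique, not a use of the SimString library and not SimSem's actual feature pipeline; the dictionary contents, the threshold of 0.6, and the function names are assumptions made for the example.

```python
import math


def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased, space-padded string."""
    padded = " " * (n - 1) + text.lower() + " " * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def cosine(a, b):
    """Set-based cosine similarity: shared n-grams / sqrt(|A| * |B|)."""
    x, y = char_ngrams(a), char_ngrams(b)
    if not x or not y:
        return 0.0
    return len(x & y) / math.sqrt(len(x) * len(y))


def approximate_lookup(query, dictionary, threshold=0.6):
    """Return (score, term, category) tuples at or above the threshold, best first."""
    scored = ((cosine(query, term), term, category)
              for term, category in dictionary.items())
    return sorted((hit for hit in scored if hit[0] >= threshold), reverse=True)


if __name__ == "__main__":
    # Hypothetical dictionary mapping surface strings to semantic categories.
    toy_dictionary = {
        "interleukin-2": "Protein",
        "insulin": "Protein",
        "aspirin": "Chemical",
    }
    for score, term, category in approximate_lookup("interleukin 2", toy_dictionary):
        print(f"{term}\t{category}\t{score:.2f}")
```

In a disambiguation setting, match scores like these would typically serve as features for a classifier rather than as the final decision, which is where a component such as LibLinear would come in.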
A web-based tool for accelerating manual literature curation (e.g. annotating biological entities and their relationships) through the use of advanced text-mining techniques. As an all-in-one system, PubTator provides a one-stop service for annotating PubMed citations.