Assists users in exploring data using inductive learning. Weka includes methods for inducing interpretable piecewise linear models of non-linear processes. It provides (i) classifiers for both classification and regression, (ii) meta-classifiers that can improve the performance of the base classifiers, (iii) association rule learners, (iv) unsupervised learning methods (clustering) and (v) a number of methods for pre-processing data, called filters.
Allows users to store and manipulate experimental data for the purpose of numerical modeling. DataRail is an information processing system that aims to bridge the gap between data acquisition and modeling. The minimum information standard (MIDAS) is part of the DataRail system, and a series of additional tools maintains the provenance of the data and ensures its integrity through multiple steps of numerical manipulation.
Simplifies machine learning research. Pylearn is a library that focuses on flexibility and extensibility, so that nearly any research idea is feasible to implement in it. The tool contains a machine learning toolbox for simplifying scientific experimentation. It also supports cross-platform serialization of learned models.
Helps users perform basic object recognition, domain adaptation, fine-grained recognition, and scene recognition. DeCAF includes several features that simplify building visual recognition systems and that help achieve good classification accuracy on tasks with sparse labeled data. Researchers can use this tool in category detection, retrieval and discovery settings.
Allows users to work with space-efficient variable-order Markov models. VOMM supports a large number of context-selection criteria, scoring functions, probability smoothing methods, and interpolations. The framework aims to overcome the inability to use a large number of long contexts, and it requires an amount of memory proportional only to the redundancy of the training data.
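The idea behind a variable-order Markov model can be illustrated with a minimal Python sketch. This is not VOMM's implementation or API: it stores counts in plain dictionaries (so it is not space-efficient), uses the longest previously seen context as its only context-selection criterion, and applies add-one smoothing as a stand-in for the smoothing methods mentioned above. The class name `NaiveVOMM` and all method names are hypothetical.

```python
from collections import Counter, defaultdict


class NaiveVOMM:
    """Toy variable-order Markov model (illustrative only, not space-efficient)."""

    def __init__(self, max_order=3):
        self.max_order = max_order
        # counts[context][symbol] = how often `symbol` followed `context`
        self.counts = defaultdict(Counter)
        self.alphabet = set()

    def fit(self, sequence):
        self.alphabet.update(sequence)
        for i, symbol in enumerate(sequence):
            # Record the symbol under every preceding context of length 0..max_order
            for k in range(self.max_order + 1):
                if i - k < 0:
                    break
                self.counts[sequence[i - k:i]][symbol] += 1
        return self

    def longest_context(self, history):
        # Back off from the longest suffix of the history to the empty context
        for k in range(min(self.max_order, len(history)), -1, -1):
            context = history[len(history) - k:]
            if context in self.counts:
                return context
        return ""

    def prob(self, symbol, history):
        # Assumes fit() has been called at least once
        seen = self.counts[self.longest_context(history)]
        total = sum(seen.values())
        # Add-one (Laplace) smoothing over the symbols seen in training
        return (seen[symbol] + 1) / (total + len(self.alphabet))


model = NaiveVOMM(max_order=2).fit("abracadabra")
print(model.prob("r", "ab"))  # (2 + 1) / (2 + 5)
```

Because the model falls back to shorter contexts when a long one was never observed, the order actually used varies from one prediction to the next, which is what distinguishes it from a fixed-order Markov chain.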
Extracts structured data from web forums. Vigi4Med Scraper is part of the Vigi4Med project for detecting adverse drug reactions in social networks. It is highly configurable; using a configuration file, the user can freely choose the data to extract from any web forum. The extracted data are anonymized and represented in a semantic structure using Resource Description Framework (RDF) graphs. This representation enables efficient manipulation by data analysis algorithms and allows the collected data to be directly linked to any existing semantic resource.
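The anonymize-then-represent-as-RDF step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Vigi4Med Scraper's code: the `http://example.org/forum/` namespace, the `author` and `content` property names, and the function names are all hypothetical, and a salted hash is used here as a simple stand-in for whatever anonymization the tool applies. The output is serialized as N-Triples, one of the standard RDF syntaxes.

```python
import hashlib

# Hypothetical namespace, used only for this illustration
BASE = "http://example.org/forum/"


def pseudonym(author, salt="demo-salt"):
    # Replace the author name with a truncated, salted SHA-256 digest
    digest = hashlib.sha256((salt + author).encode("utf-8")).hexdigest()
    return digest[:12]


def literal(text):
    # Escape a Python string as an N-Triples string literal
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def post_to_ntriples(post_id, author, body):
    # post_id is assumed to be safe to embed in an IRI (e.g. a numeric id)
    subject = f"<{BASE}post/{post_id}>"
    user = f"<{BASE}user/{pseudonym(author)}>"
    return [
        f"{subject} <{BASE}author> {user} .",
        f"{subject} <{BASE}content> {literal(body)} .",
    ]


for triple in post_to_ntriples("42", "alice", 'I felt "dizzy" after the dose'):
    print(triple)
```

Since each post and each pseudonymous user is identified by an IRI, triples produced this way can be merged with other RDF data that refers to the same identifiers.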
A web-based database of associations between pesticides and their potential targets, derived by text mining. PTID integrates annotations for 1,347 pesticides classified into 22 groups, including physicochemical, toxicological, ecotoxicological and other related information. The potential targets for each pesticide in PTID were identified from the literature using the online text mining tool PolySearch.