Picard

Helps users manipulate high-throughput sequencing (HTS) data and formats. Picard is a Java toolkit that provides a set of command-line tools. It comprises Java-based utilities that manipulate SAM files and a Java API for creating new programs that read and write SAM files. Both the SAM text format and the SAM binary (BAM) format are supported. It also works with next-generation sequencing (NGS) data.


ROCR

Evaluates and visualizes the performance of scoring classifiers. ROCR features over 25 performance measures that can be freely combined to create two-dimensional performance curves. It uses standard methods for investigating trade-offs between specific performance measures, including receiver operating characteristic (ROC) graphs, precision/recall plots, lift charts and cost curves. The tool supports studying the intricacies inherent to many biological datasets and their implications for classifier performance.


SGoF

Sets the rejection region when adjusting for multiple hypothesis testing. SGoF uses a discriminant rule, based on the maximum distance between the uniform distribution of p-values and the observed one, to set the null for a binomial test. It provides the power to detect true effects together with the reasonable proportion of false discoveries one should assume. Simulations suggest that combining the SGoF+ metatest with q-value information is an interesting strategy for dealing with multiple testing issues.
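
The binomial step mentioned above can be illustrated with a minimal sketch. It does not reproduce SGoF's discriminant rule: the fixed threshold `gamma` and the function names are assumptions made for illustration. The sketch only counts how many p-values fall at or below `gamma` and asks how surprising that count would be if the p-values were uniform.

```python
import math

def binomial_upper_tail(k, n, gamma):
    """P(X >= k) for X ~ Binomial(n, gamma), summed exactly."""
    return sum(
        math.comb(n, i) * gamma**i * (1.0 - gamma) ** (n - i)
        for i in range(k, n + 1)
    )

def excess_small_p_values(p_values, gamma=0.05):
    """Compare the observed count of p-values <= gamma with the uniform null.

    gamma is a hypothetical fixed threshold; SGoF itself chooses the
    rejection region differently.
    """
    n = len(p_values)
    observed = sum(1 for p in p_values if p <= gamma)
    expected = n * gamma  # expected count if all nulls are true
    tail = binomial_upper_tail(observed, n, gamma)
    return observed, expected, tail

p_values = [0.001, 0.01, 0.02, 0.03, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9]
observed, expected, tail = excess_small_p_values(p_values)
```

A small tail probability indicates more small p-values than a uniform distribution would produce, which is the kind of evidence a binomial metatest acts on.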


pROC

Contains a set of tools for displaying, analyzing, smoothing and comparing receiver operating characteristic (ROC) curves. pROC offers multiple statistical tests to compare ROC curves, in particular partial areas under the curve, which allow proper ROC interpretation. It relies on U-statistics theory and asymptotic normality to compare areas under the curve (AUCs). The tool provides a consistent and user-friendly set of functions for building and plotting a ROC curve, smoothing the curve with several methods, computing the full or partial AUC over any range of specificity or sensitivity, and computing and visualizing various confidence intervals.
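
The link between the AUC and U-statistics can be shown with a short sketch. This is not pROC's code (pROC is an R package); it is a plain Python illustration, with hypothetical function and variable names, of the standard fact that the empirical AUC equals the Mann-Whitney U statistic divided by the number of case-control pairs.

```python
def auc_u_statistic(case_scores, control_scores):
    """Empirical AUC: fraction of case-control pairs ranked correctly.

    Ties count as half a correctly ranked pair, as in the
    Mann-Whitney U statistic.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical classifier scores, higher meaning "more likely a case".
cases = [0.9, 0.8, 0.6, 0.55]
controls = [0.7, 0.5, 0.4, 0.3]
auc = auc_u_statistic(cases, controls)
```

The pairwise loop is quadratic in the sample size; it is written this way to keep the U-statistic interpretation visible rather than for speed.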

RAICAR / Ranking and Averaging Independent Component Analysis by Reproducibility

Improves the decomposition and interpretation of functional magnetic resonance imaging (fMRI) data with independent component analysis (ICA). RAICAR is an ICA method based on reproducibility. The software uses repeated ICA realizations and relies on the reproducibility between them to rank and select components. It estimates the number of components, orders them by reproducibility, and improves data decomposition by selectively averaging across ICA realizations.

MIPReSt / Mixed ICA/PCA via Reproducibility Stability

Assesses component stability as the size of the data matrix changes, which can be used to determine the dimension of the non-Gaussian subspace in a mixture. MIPReSt is an algorithm for mixed independent component analysis (ICA)/principal component analysis (PCA). The software uses a repeated-estimation technique to rank sources by reproducibility, combined with decomposition of multiple sub-samplings of the original data matrix.


TwoPhaseInd

Implements a number of efficient statistical methods for estimating subgroup treatment effects and gene–treatment interactions that exploit the gene–treatment independence dictated by randomization. These include the case-only estimator, the maximum estimated likelihood estimator and the semiparametric maximum likelihood estimator for parameters in a logistic model. TwoPhaseInd is an R package that scales computationally to genome-wide studies, as illustrated by an example from the Women’s Health Initiative.

BICAR / Bidirectional Independent Component Averaged Representation

Obtains robust, reproducible pairs of temporal and spatial components at the individual subject level from concurrent electroencephalographic and functional magnetic resonance imaging data. BICAR is an algorithm that finds biologically relevant paired sources involved in visual processing, motor planning, execution, and attention, which are highly reproducible and present in multiple subjects. The algorithm ranks each joint source by a task-independent measure of reproducibility.

Sicegar / SIngle CEll Growth Analysis in R

Addresses dose-response optimization and sigmoidal curve fitting. Sicegar models the two phases of growth with two sigmoidal curves that describe the relationship between time and intensity. It automates the fitting of thousands of sigmoidal and double-sigmoidal curves with minimal human supervision, as well as the classification of measured time courses into either sigmoidal or double-sigmoidal patterns. This tool was designed for analyzing poliovirus infection and replication at the single-cell level.

MXM / Mens eX Machina

MXM is a flexible R package that offers feature selection algorithms for predictive or diagnostic models along with (Bayesian) network construction algorithms. Its state-of-the-art feature selection algorithms include FBED and SES; the latter returns multiple sets of statistically equivalent variables, making it one of the few algorithms in the literature to do so. The algorithms can handle many types of response variables, such as continuous, binary, multiclass, ordinal, (censored) time-to-event, repeated measurements and percentages.

OPATs / Omnibus P-value Association Tests

Combines P-values using popular analysis methods. OPATs enables a gene region to be extended upstream and downstream by a prespecified width. It can be used to identify genetic markers and marker sets associated with complex diseases and traits of interest. The tool does not require genotypic or phenotypic data for an analysis. It can be useful for analyzing P-values from different types of molecular markers in omics studies and in family- and population-based association studies.
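
As an illustration of what combining P-values involves, the sketch below implements Fisher's method, a widely known combination rule. The description above does not say which methods OPATs offers, so the choice of Fisher's method and the function name are assumptions. The sketch also assumes the P-values are independent, which often does not hold for neighboring markers.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent P-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the survival function has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    if not p_values:
        raise ValueError("need at least one P-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term = 1.0   # (X/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

# Hypothetical P-values for two markers in one gene region.
combined = fisher_combined_p([0.05, 0.05])
```

Two modest P-values of 0.05 combine into a smaller one, which is how a marker set can reach significance when no single marker does.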


chngpt

Fits thresholded logistic regression models. chngpt supports four variants of threshold regression models that are most widely used in practice. It implements both estimation and hypothesis testing functionalities and supports models with interaction terms between predictors subjected to thresholding and predictors not subjected to thresholding. This tool offers two alternative search methods: exact, which optimizes the exact criterion function, and smooth, which approximates the criterion function with a logistic function-based smooth function.

FANOVA / Functional ANalysis Of VAriance

Allows discovery of unknown gene functions. FANOVA is based on a Gaussian process. It can identify significant differential growth between the trajectories of transcription factor (TF) knockouts relative to the control strain. This tool was able to detect strong concordance of newly discovered TF functions with statistical predictions of TF gene regulatory relationships from gene regulatory network (GRN) models inferred from gene expression data alone.

ACT / Aggregation and Correlation Toolbox

Analyzes continuous signal and discrete region tracks from high-throughput genomic experiments. ACT is able to generate aggregate profiles of a given track around a set of specified anchor points, such as transcription start sites. It correlates related tracks and analyzes them for saturation. The tool takes less than a minute to generate the plot for up to 30 input files each with a few thousand lines. It provides an option to compute the coverage of a random sample of the input file combinations.
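
The idea of an aggregate profile around anchor points can be sketched as follows. This is not ACT's implementation; the function name, the list-based signal representation and the `flank` parameter are assumptions made for illustration. The sketch averages a per-position signal over windows centered on each anchor, such as a transcription start site.

```python
def aggregate_profile(signal, anchors, flank):
    """Average a signal track over windows centered on anchor positions.

    signal  : list of per-position values (hypothetical dense track)
    anchors : list of 0-based anchor positions
    flank   : number of positions to include on each side of an anchor
    Returns one mean per offset in [-flank, +flank]; an offset that no
    window covers inside the track yields None.
    """
    width = 2 * flank + 1
    sums = [0.0] * width
    counts = [0] * width
    for anchor in anchors:
        for offset in range(-flank, flank + 1):
            pos = anchor + offset
            if 0 <= pos < len(signal):  # skip positions off the track
                sums[offset + flank] += signal[pos]
                counts[offset + flank] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

# Hypothetical track with peaks at positions 2 and 7.
track = [0, 1, 5, 1, 0, 0, 2, 6, 2, 0]
profile = aggregate_profile(track, anchors=[2, 7], flank=1)
```

Windows that run past the ends of the track contribute only their in-range positions, so anchors near a boundary do not distort the averages at other offsets.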

MINERVA / Maximal Information-based Nonparametric Exploration R package for Variable Analysis

Provides the mine function, which computes Maximal Information-based Nonparametric Exploration (MINE) statistics. Minerva supports native parallelization: building on the R package parallel, the number of cores can be passed as a parameter to mine whenever multi-core hardware is available. The main function mine takes the dataset and the parameter configuration as inputs and returns the four MINE statistics.


pensim

Predicts the survival of cancer patients from microarray data, and classifies obese and lean individuals from metagenomic data. pensim can be applied to high-dimensional feature selection and prediction with genomic data. The tool contains a function for generating synthetic high-dimensional data with time-to-event or binary outcomes, and blocks of predictor variables defined by collinearity and association with outcome, with options for introducing labeling errors and for censoring of survival times.

RERT / Representative Regression Tree

Predicts the surgical/pathological stage of the disease in a large cohort of endometrial cancer (EC) patients. RERT was developed to preoperatively identify an advanced surgical FIGO stage. It uses the sHE4 and sCA125 biomarkers together with other preoperatively available clinical and pathological variables as covariates (age, body mass index (BMI), number of children, menopause status, contraception, hormone replacement therapy (HRT), hypertension, grading from biopsy, clinical stage).

RL-SKAT / Recalibrated Lightweight Sequence Kernel Association Test

Allows exact p-value calculation for the score test of heritability. RL-SKAT is a computational method that can be used in the case of a single variance component and a constant response vector. This approach speeds up the analysis by orders of magnitude. The software can also be used to address several questions, such as (i) estimating the underlying heritability of a phenotype, (ii) estimating the uncertainty of that estimate, (iii) predicting phenotypes, and many others.

LAMP / Limitless-Arity Multiple-testing Procedure

Counts the exact number of “testable” motif combinations and derives a tighter bound on the family-wise error rate (FWER), allowing the calibration of the Bonferroni factor. LAMP is a branch-and-bound algorithm. The software can be used to provide an integrated analysis of heterogeneous biological data. It was applied to human breast cancer transcriptome data and identified statistically significant combinations of up to eight motifs.