Performs feature selection in RNA-seq and genome-wide association study (GWAS) data. ReliefSeq extends Relief-F for feature selection in RNA-seq data and assesses its performance. The software adapts the number of nearest neighbors (k) for each gene to optimize the Relief-F test statistics (importance scores) for finding both main effects and interactions. It supports multi-class and continuous phenotypes.
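The gene-wise adaptive-k idea can be illustrated with a minimal sketch. It is not the ReliefSeq implementation: it assumes a binary phenotype, a range-normalized Manhattan distance, and a small user-supplied grid of k values, and the function names `relief_scores` and `adaptive_k_scores` are hypothetical. For each feature it simply keeps the largest Relief-style score seen over the grid.

```python
def relief_scores(X, y, k):
    """Relief-style importance scores using k nearest hits and k nearest misses."""
    n, p = len(X), len(X[0])
    # Per-feature range, used to put all features on a comparable scale.
    spans = [(max(col) - min(col)) or 1.0 for col in zip(*X)]

    def dist(a, b):
        return sum(abs(u - v) / s for u, v, s in zip(a, b, spans))

    scores = [0.0] * p
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist(X[i], X[j]))
        hits = [j for j in others if y[j] == y[i]][:k]
        misses = [j for j in others if y[j] != y[i]][:k]
        for f in range(p):
            hit_diff = sum(abs(X[i][f] - X[j][f]) for j in hits) / (len(hits) or 1)
            miss_diff = sum(abs(X[i][f] - X[j][f]) for j in misses) / (len(misses) or 1)
            # A useful feature differs across classes and agrees within a class.
            scores[f] += (miss_diff - hit_diff) / (spans[f] * n)
    return scores


def adaptive_k_scores(X, y, k_values):
    """For each feature, keep the best score over the candidate k values (assumed rule)."""
    per_k = [relief_scores(X, y, k) for k in k_values]
    return [max(col) for col in zip(*per_k)]
```

A call such as `adaptive_k_scores(X, y, k_values=[1, 2])` takes a list of samples (rows of feature values) and a list of class labels, and returns one score per feature.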
Improves the predictive performance of ordinary logistic ridge regression and the group lasso. GRridge allows the use of multiple sources of co-data (e.g. external p-values, gene lists, annotation) to improve prediction of binary, continuous and survival responses using (logistic, linear or Cox) group-regularized ridge regression. It also facilitates post-hoc variable selection and prediction diagnostics by cross-validation using ROC curves and AUC.
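Schematically, group-regularized ridge can be written as a penalized likelihood in which each co-data group receives its own penalty multiplier. The notation below is an assumption introduced for illustration and does not come from the entry itself: \ell is the log-likelihood of the chosen model (logistic, linear or Cox), \lambda is a global ridge penalty, \lambda'_g is the multiplier of group g, and \mathcal{G}_g is the set of variables in that group.

```latex
\hat{\beta} = \arg\max_{\beta} \Big\{ \ell(Y; \beta)
  - \lambda \sum_{g=1}^{G} \lambda'_g \sum_{k \in \mathcal{G}_g} \beta_k^2 \Big\}
```

Setting every \lambda'_g = 1 recovers ordinary ridge regression; a multiplier below 1 penalizes the variables of that group less.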
Provides functions to perform ensemble minimum redundancy maximum relevance (mRMR) feature selection by taking full advantage of parallel computing. mRMRe can be beneficial from both a predictive (lower bias and lower variance) and a biological (more thorough feature space exploration) point of view, which makes it particularly attractive for high-throughput genomic data analysis. The package also contains a set of functions to compute mutual information matrices from continuous, categorical and survival variables.
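The mRMR criterion itself can be sketched in a few lines. This is the classic single-run greedy variant (relevance minus mean redundancy) on discrete variables, not the ensemble procedure and not the mRMRe API; the function names `mutual_information` and `mrmr` are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(x, y):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def mrmr(features, target, n_select):
    """Greedy mRMR over a dict mapping feature name -> list of discrete values."""
    relevance = {name: mutual_information(col, target)
                 for name, col in features.items()}
    selected = []
    remaining = sorted(features)  # sorted so that ties break deterministically

    def score(name):
        if not selected:
            return relevance[name]
        # Mean mutual information with the features already chosen.
        redundancy = sum(mutual_information(features[name], features[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < n_select:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Under this criterion an exact copy of an already selected feature is heavily penalized, because its redundancy term equals the entropy of that feature.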
Chooses a diverse panel of genomic assays using methods from submodular optimization. SSA serves as a model for how submodular optimization can be applied to other discrete problems in biology. The method is computationally efficient, yields high-quality panels according to several quality measures, and is mathematically optimal under some assumptions. It can also be used partway through the investigation of a cell type, when several assays are already available, to determine the most informative next experiments to perform.
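Greedy maximization of a submodular objective can be sketched as follows. The sketch assumes a facility-location objective over a pairwise assay similarity matrix, which is one common choice and not necessarily the exact objective used by SSA; the names `facility_location` and `greedy_select` and the `already_selected` parameter are hypothetical.

```python
def facility_location(selected, similarity):
    """Sum, over all assays, of the best similarity to any selected assay."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in similarity)


def greedy_select(similarity, budget, already_selected=()):
    """Grow the panel to `budget` assays, adding the largest marginal gain each time."""
    selected = list(already_selected)
    candidates = [j for j in range(len(similarity)) if j not in selected]
    while candidates and len(selected) < budget:
        current = facility_location(selected, similarity)
        gains = {j: facility_location(selected + [j], similarity) - current
                 for j in candidates}
        best = max(candidates, key=lambda j: gains[j])
        selected.append(best)
        candidates.remove(best)
    return selected
```

Passing the assays that have already been performed as `already_selected` gives the partway use case: the function then only proposes the next experiments.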
Enables selection of variables from the observation matrix of an experiment with one or multiple conditions, using a hidden Markov model. It can summarize sample observations or accept a priori information to produce a customized model. Its process relies on the Control/Baseline condition to achieve accurate filtering.
Performs normalization and feature selection, and builds classification models. DaMiRseq offers a guided decision-making process that assists the user in selecting the best putative predictors for classification. The software permits users to identify transcriptional biomarkers. It provides functions to filter genomic features and samples to clean up the data, and to identify and remove unwanted sources of variation to adjust the data.
Offers an approach for approximating L0-penalized generalized linear models (GLMs) with an adaptive ridge algorithm. L0 ADRIDGE is a GLM-based method distributed as a MATLAB package that provides various models such as penalized Poisson or logistic regression. It is intended to assist in (i) determining disease status, (ii) supporting physicians in clinical decision making, and (iii) easing feature selection and prediction with big omics data.
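The adaptive ridge idea can be sketched for the simplest case. The code below is not the L0 ADRIDGE package: it assumes a Gaussian linear model instead of Poisson or logistic regression, uses the reweighting rule w_j = 1 / (beta_j^2 + delta^2) as an assumption, and the names `solve` and `l0_adaptive_ridge` are hypothetical. Each iteration solves a weighted ridge problem, and the weights are then updated so that small coefficients are penalized ever more strongly, which mimics an L0 penalty.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def l0_adaptive_ridge(X, y, lam, n_iter=50, delta=1e-5):
    """Iteratively reweighted ridge regression approximating an L0 penalty."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    weights = [1.0] * p  # the first pass is an ordinary ridge fit
    beta = [0.0] * p
    for _ in range(n_iter):
        # Weighted ridge step: (X'X + lam * diag(weights)) beta = X'y
        a = [[xtx[i][j] + (lam * weights[i] if i == j else 0.0) for j in range(p)]
             for i in range(p)]
        beta = solve(a, xty)
        # Coefficients near zero get a very large weight on the next pass.
        weights = [1.0 / (b * b + delta * delta) for b in beta]
    return beta
```

The small constant `delta` keeps the weights finite when a coefficient reaches zero. For a GLM, the ridge step would be replaced by a penalized iteratively reweighted least squares step.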