RaineriEtAl2018
Estimates the values of the first eigenvector of a Hi-C matrix. This algorithm exploits both the GC content of the sequence and a single whole genome bisulfite sequencing (WGBS) experiment to make its predictions. It is also able to determine the positions of the A and B compartments. This model relies only on methylation and sequence information to provide an efficient approximation.
S-ResNet
Classifies splice sites from raw DNA sequence. S-ResNet is built on a shallow version of ResNet and combines the advantages of a shallow architecture and of shortcut connections. This method places a shortcut connection at each convolution layer, unlike the ResNet approach, where the shortcut is placed after a block of two convolution layers.
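The difference between the two shortcut placements can be illustrated with a toy sketch. This is not the S-ResNet implementation: the 1-D convolution, the function names and the kernels are all assumptions made for illustration, batch normalization is omitted, and the activation placement is simplified.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1-D cross-correlation; assumes an odd kernel length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                total += weight * signal[j]
        out.append(total)
    return out


def relu(values):
    return [max(0.0, v) for v in values]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def per_layer_shortcut(signal, kernels):
    # Shortcut after every convolution layer (the placement described above).
    for kernel in kernels:
        signal = add(relu(conv1d_same(signal, kernel)), signal)
    return signal


def two_layer_block_shortcut(signal, kernel_a, kernel_b):
    # Shortcut spanning a block of two convolution layers (ResNet-style).
    hidden = relu(conv1d_same(signal, kernel_a))
    return add(conv1d_same(hidden, kernel_b), signal)
```

With all-zero kernels both variants reduce to the identity mapping, which is the property that shortcut connections are meant to make easy to learn.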
SupersonEtAl2018
Assesses the effect of taxon sampling on the accuracy of phylogenetic tree reconstruction. This approach eliminates a user-defined number of species within a single phylum from a concatenated alignment of orthologous genes. It is based on standard phylogenetic reconstruction methodologies such as orthology determination, alignment and tree building, and uses different sampling scenarios to vary the taxon sampling within a desired phylum.
SN-NeRF / Self-Normalizing Natural Extension Reference Frame
Enables the placement of protein atoms in Cartesian space during folding. SN-NeRF is a self-normalized method that calculates Cartesian coordinates from torsion space parameters. It generates its three orthonormal vectors prior to self-normalizing.
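The conversion from torsion space to Cartesian coordinates can be sketched with the plain Natural Extension Reference Frame step, which places one atom from the three preceding ones. This is a minimal sketch of the generic NeRF step only; it does not reproduce the self-normalizing optimization of SN-NeRF, and the function and parameter names are illustrative assumptions.

```python
import math


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _unit(u):
    norm = math.sqrt(u[0] ** 2 + u[1] ** 2 + u[2] ** 2)
    return (u[0] / norm, u[1] / norm, u[2] / norm)


def place_atom(a, b, c, bond_length, bond_angle, torsion):
    """Place atom d given atoms a, b, c (angles in radians).

    bond_length is |c-d|, bond_angle is the b-c-d angle, and torsion is
    the a-b-c-d dihedral (0 = cis, pi = trans).
    """
    # Three orthonormal vectors of the local reference frame at atom c.
    bc = _unit(_sub(c, b))
    n = _unit(_cross(_sub(b, a), bc))
    m = _cross(n, bc)
    # Coordinates of d in the local frame.
    x = -bond_length * math.cos(bond_angle)
    y = bond_length * math.sin(bond_angle) * math.cos(torsion)
    z = bond_length * math.sin(bond_angle) * math.sin(torsion)
    # Rotate into the global frame and translate to c.
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))
```

Calling `place_atom` repeatedly along a chain, each time feeding in the three most recently placed atoms, builds the full backbone from a list of torsion angles.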
WangEtAl2018
Offers an automated time structure learning model that reveals the longitudinal genotype-phenotype interactions. This approach uses learned structures to improve the predictions of associations between genetic variations and longitudinal imaging phenotypes. This algorithm can simultaneously uncover interrelation structures existing in different prediction tasks and can be applied to both synthetic and real benchmark data.
MODIFI / Model Of Differential Interactions
Allows users to analyze how pathways and genetic interactions rewire over time. MODIFI identifies and characterizes differential interactions. This two-factor linear model can estimate the predictive strength and influence of time and differential compound treatment on pi-score. This estimation can be described as the slope by which an interaction changes (strengthens or weakens) over time.
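The idea of describing a differential interaction as a slope over time can be sketched with an ordinary least-squares fit. This is a simplified illustration that covers only the time factor; the treatment factor of the two-factor model is omitted, and the function name and the example values are assumptions, not part of MODIFI.

```python
def interaction_slope(times, pi_scores):
    """Least-squares slope of pi-score against time.

    A positive slope means the interaction strengthens over time,
    a negative slope means it weakens.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_p = sum(pi_scores) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, pi_scores))
    return sxy / sxx


# Hypothetical pi-scores measured at four time points (hours).
slope = interaction_slope([0, 24, 48, 72], [0.0, 0.3, 0.6, 0.9])
```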
ShammirEtAl2018
Consists of an automated and unbiased methodology for whole-brain investigation of the complex mesoscale laminar architecture of the cortex. This sphere-based approach implements a geometric solution based on cortical volume sampling using a system of virtual spheres dispersed throughout the entire cortex. This method can enable the expansion of studies on the role of cortical thickness in brain function and behavior to the cortical layer level.
RP / Reciprocal Perspective
Estimates a localized threshold on a per-protein basis using several rank order metrics. RP is a modeling framework consisting of a data-driven approach that leverages the context provided by jointly considering facets of the pair-wise protein-protein interaction (PPI) relationships. This method was developed in such a way that it is amenable to any weighted complete graph problem. It can be used to determine a new assessment of the interaction. RP can be applied to fields other than PPI prediction.
AlleleHMM
Discovers allele-specific regions in functional genomic datasets. AlleleHMM identifies allele-specific blocks of signal in distributed functional genomic data if contiguous genomic regions share correlated allele-specific events. It then relies on the Viterbi algorithm to find the most likely sequence of hidden states through the data, yielding a series of candidate blocks of signal with allelic bias. This tool supports identification in both coding and non-coding genomic regions.
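The Viterbi decoding step can be sketched for a generic discrete hidden Markov model. This is not the AlleleHMM code: the state names, observation symbols and probabilities in the usage note are hypothetical, and only the decoding of the most likely state path is shown.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely hidden-state path, working in log space."""
    # best[t][s]: best log-probability of any path ending in state s at step t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}
```

With two hypothetical states such as "symmetric" and "maternal" and sticky transitions, runs of consecutive positions decoded to the same biased state would correspond to candidate blocks.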
DumitrascuEtAl2018
Identifies genotypic loci and covariates with effects on phenotypic variance. This method relies on a Bayesian test for heteroskedasticity (BTH) model able to integrate discrete and continuous covariates. It also incorporates uncertainty in estimates of mean and variance effects of covariates to test for variance quantitative trait loci (QTLs) and quantitative trait covariates (QTCs).
IMO / Ion Motion Optimization
Performs optimization by imitating the attraction and repulsion of anions and cations. IMO is a population-based algorithm inspired by the properties of ions in nature. This algorithm divides the population of candidate solutions into two sets, negatively charged ions and positively charged ions, and improves them according to the important characteristics of the ions: “ions with the same charges repel each other, but with opposite charges attract each other”. It also mimics the liquid and solid states to perform diversification and intensification.
IMOG
Allows users to predict protein folding reliably at high resolution. IMOG consists of a method that uses the ion motion optimization (IMO) algorithm to perform its analysis. For instance, this algorithm can be used for studying a variety of amino acid sequence data sets. It can also be applied to determining protein folding structure.
TranslucentID
Provides a method that evaluates saturated DNA mixtures to identify saturated mixture contributors. TranslucentID proceeds by determining a subset of individuals who contributed DNA to saturated mixtures by performing mixture desaturation. This approach aims to underline the utility of mixture analysis of forensic samples with mixture single nucleotide polymorphism (SNP) panels.
eTumorRisk
Identifies high-risk individuals for cancers based on their germline genomic information. eTumorRisk is a network-based algorithm that contains (1) one component to build network models for discriminating a cancer sample from non-cancer samples, and (2) one component to determine which cancer type has the highest chance for the sample. It can (i) discriminate a cancer type from non-cancer samples, and between cancer types, (ii) filter out noise, (iii) identify multiple representative networks, and (iv) control false-positives.
BEAPR / Binding Estimation of Allele-specific Protein-RNA interaction
Offers a method to estimate allele-specific protein-RNA interaction. BEAPR consists of an algorithm for allele-specific binding (ASB) detection and for the prediction of functional genetic variants (GVs) in post-transcriptional gene regulation. Moreover, it employs an empirical Gaussian distribution to model the normalized read counts. The expected variance is estimated using a regression model.
MAGE / Multiscale Adaptive Gabor Expansion
Aims to identify transient oscillatory burst amplitude and phase. MAGE is an algorithm that performs parameter reassignment to simplify discovery of a sparse decomposition by using a dictionary of parametric time-frequency-scale Gabor atoms.
LDpred-funct
Offers an approach dedicated to polygenic forecasting. LDpred-funct is an algorithm based on the leveraging of trait-specific functional enrichments and that performs an additional regularization step to account for sparsity. This method can be used to perform simulations with real genotypes. It was experimentally applied to UK Biobank and 23andMe cohorts to predict height through a meta-analysis.
SakellaropoulosEtAl2018
Provides a logic-based framework to reconstruct signaling networks by using phosphoproteomic data and prior knowledge about their connectivity. This algorithm allows cells to be interrogated in the presence or absence of drugs or small molecules that inhibit specific interactions. It aims to permit researchers to design complex experiments and dependencies across networks.
ZamanighomiEtAl2018
Enables network discovery from perturbed expression data. This approach consists of a framework for network inference that relies on temporal gene expression data coupled to genetic or chemical perturbation. It is suitable for the processing of expression measurements from high-resolution time series experiments involving precise genetic or chemical perturbation of a steady state system.
LatulippeEtAl2018
Enables the study of the effects of amyloid beta peptide (Aβ) on intracellular Ca2+. This mathematical model for intracellular Ca2+ regulation consists of a theoretical approach that allows the understanding of the driving mechanisms for various Ca2+ oscillatory patterns within an Alzheimer’s disease (AD) environment. It can be used to understand the impact of Aβ on Ca2+ fluxes through individual regulatory components (such as IP3, RyR, and plasma membrane).
CzeizlerEtAl2018
Focuses on solving the target control problem. This method is based on an extension of an algorithm that searches the minimal solution of the structural target controllability problem, modified with additional heuristics intending to improve its efficiency. It can be applied to real-life-size networks and assists users in designing several therapeutic strategies using currently known drugs.
ZhaoEtAl2018
Provides a probabilistic framework for glioma detection and segmentation. This method relies on structure learning of undirected graphical models. It can perform structure learning and achieve glioma segmentation. It first over-segments magnetic resonance imaging (MRI) images into superpixel regions to reduce computational cost, and each superpixel is then used to build undirected graph models. The main goal of this approach is to improve the accuracy of glioma segmentation.
GaoEtAl2014
Identifies an approximately minimum set of driver nodes to control a specified target set of nodes. This approach consists of a greedy algorithm (GA) based on the structural control theory: the system parameters are either fixed at zero or are independent free parameters. This algorithm can find the driver nodes for target control when the network structure is completely known.
LeTE-fusion
Performs comprehensive in silico analyses. LeTE-fusion gives an ideal estimation of peptide and variant peptide detections. It derives a realistic estimation of the percentage of detectable genome-annotated variants in shotgun mass spectrometry (MS) experiments using peptides with experimental evidence. This tool is useful for the assessment of feasibility of detecting other types of peptides or variations.
OstaszewskiEtAl2018
Serves for visual knowledge exploration in molecular interaction networks. This algorithm combines distance functions to cluster the contents of a complex visual repository on human disease, and to discover different cluster sets. Furthermore, it assists users in exploring complex biomedical repositories, and in annotating high-level areas of such maps.
RanjardEtAl2018
Maps the reads to a reference using dynamic time warping. This approach can update the used reference with insertions and deletions. It localizes, aligns and corrects this sequence with indels to simplify the subsequent read alignments. This method was created to compute unsupervised clustering of bioacoustic sequences. It can be employed with other techniques to investigate large genomic sequences.
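Dynamic time warping itself can be sketched as the classic dynamic program below. This is the textbook algorithm on numeric sequences, not the authors' read mapper: how nucleotides are encoded as numbers and how the reference is updated with indels are outside this sketch, and the function name is an assumption.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a[i-1] repeats b[j-1]
                                     cost[i][j - 1],      # b[j-1] repeats a[i-1]
                                     cost[i - 1][j - 1])  # one-to-one step
    return cost[n][m]
```

Because one element may align with several elements of the other sequence, a sequence and a stretched copy of it, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, have distance 0, which is what makes the measure tolerant of insertions and deletions.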
Field Sensor
Assigns a field to each token or sequence of tokens in a query. Field Sensor works by calculating a mapping between a query segment and a field, along with the likelihood of that mapping. This tool labels each segment of a query with a PubMed record field: text, title, author, journal, volume, issue, page and date.
RPCA / Relative Principal Components Analysis
Serves for analyzing the energetically relevant conformational changes of a biomolecule upon binding to various ligands. It can compute collective canonical variables such as linear combinations of the original features. It contains features for recognizing the conformational changes, which are relevant to the macroscopic thermodynamic change.
sNebula / similar Neighbor-edges based and unbiased leverage algorithm
Enables prediction of human leukocyte antigen (HLA)-peptide binding. sNebula is a method particularly useful for neoantigen identification and the development of immunotherapies. This tool can be used for constructing an atlas of HLA-peptide binding that facilitates better understanding of the immune system. It can be applied to HLAs or peptides with or without experimental binding data.
DeepMQ
Provides an image analysis-based myelin sheath detection application. DeepMQ is an algorithm that consists of a feature extraction step followed by a deep learning-based binary classification module. The images are acquired on a confocal microscope and contain three channels and multiple z-sections. Each channel represents either oligodendrocytes, neurons, or nuclei.
FSM / Fast Scalable Motif
Serves for network motif discovery. FSM is a scalable algorithm that consists of three different steps: subgraph enumeration, subgraph network motif classification and significance test. This algorithm aims to accelerate network motif discovery by lessening the number of times to perform subgraph isomorphism. It exploits multiple heuristic optimizations and subgraph classification to process.
XiangAndKim2018
Determines functional target genes by leveraging naturally occurring perturbation of gene expression by genetic variants. This approach performs a single statistical analysis to simultaneously construct the transcriptional regulatory network under single nucleotide polymorphism (SNP) perturbations and identify expression quantitative trait locus (eQTLs) perturbing this network, while incorporating transcription factor (TF) binding data as prior knowledge to guide the learning algorithm.
Trimitomics
Retrieves coding regions of mt genomes from RNA-seq data. Trimitomics proposes a three-step pipeline that considers mitochondrial genomes within RNA-seq information, enabling their exploitation in several biological issues. This method enables information to be leveraged from extant datasets and intends to reduce costs as well as to broaden the fields of investigation dealing with the mitochondrial transcriptional landscape.
LFFSA / fish Swarm Algorithm based on Levy Flight and Firefly behavior
Consists of a fish swarm algorithm based on Levy flight and firefly behavior. LFFSA incorporates the moving strategy of firefly algorithm into two behavior patterns of fish swarm: chasing behavior and preying behavior. This method takes into account attraction degree in the definition of artificial fish and uses Levy flight to adjust the search route of artificial preying fishes.
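A Levy flight step is commonly drawn with Mantegna's algorithm, sketched below. This illustrates only how a heavy-tailed step might perturb the position of an artificial fish; it is not the LFFSA update rule, and the function names, the `step_scale` parameter and the default `beta` are assumptions.

```python
import math
import random


def levy_sigma(beta):
    """Scale of the numerator Gaussian in Mantegna's algorithm."""
    numerator = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    denominator = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (numerator / denominator) ** (1 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step length (sign included)."""
    u = rng.gauss(0.0, levy_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def move_fish(position, step_scale=0.1, beta=1.5, rng=random):
    """Perturb each coordinate of a fish position with a Levy step."""
    return [x + step_scale * levy_step(beta, rng) for x in position]
```

Most steps are small, but occasional long jumps occur, which is why Levy flights are used to help a search escape local optima.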
ORdensity
Finds differentially expressed (DE) genes while avoiding some of the shortcomings of individual gene identification. ORdensity is composed of two phases: discovering potential differentially expressed genes and recognizing differentially expressed genes. It can serve for the correct classification or diagnosis of future samples. This tool was tested on multiple gene expression data sets and four real cancer data sets.
STAR / SegmenTation based Approximation of the point-based sampling Milano Retinex
Permits image enhancement by using the Milano Retinex approaches. STAR is an algorithm that aims to cut down the computational burden of the image sampling and to reduce the number of operations needed to compute the lightness. It works by employing coarse color and distance information, computed from clusters of pixels detected by a segmentation and independent of the target. It can also model locality by considering the mutual distances between the segments.
GABNI / Genetic Algorithm-based Boolean Network Inference
Allows users to infer generalized regulatory relations. GABNI is an algorithm designed for deducing the interaction type by examining the binary expression values of a target gene and a regulatory gene in the binary gene expression data. This method can be used for large-scale inference problems in terms of both structural and dynamics accuracies.
WangEtAl2018
Deduces target gene expression profiles using a conditional generative model. This approach stabilizes the adversarial training using an L1-norm loss on the gene regression model. It is based on generative adversarial networks (GAN) to build high-dimensional outputs with no spatial structure. This tool provides predictions that are robust to outliers and enables the capture of the low frequency structure of samples.
NadianEtAl2018
Automates spike sorting. This algorithm is based on a combination of two methods: t-distributed stochastic neighbor embedding (t-SNE) and density-based spatial clustering of applications with noise (DBSCAN). This application can be used with a large set of simultaneously recorded units. It was tested with simulated 10-minute-long extracellular recordings as well as with real multi-electrode array neural recordings.
SIFT / Spherical Deconvolution Informed Filtering of tractograms
Quantifies the density of underlying white matter fibers. SIFT performs fiber orientation distribution (FOD) segmentation, and assigns streamlines to the FOD lobes they traverse, based on both the voxels they pass through and their tangent through each voxel. It combines the quantitative properties that can be assigned to a whole-brain reconstruction to sample the structural connections emanating from a region or regions of interest.
TiwariEtAl2018
Retrieves 3D structural models in libraries of biological shapes. This approach works by comparing experimental image data to projection images from existing structural data. It can resize the 3D models to determine the volumetric size, allowing for the possibility that a small novel protein has a similar shape to a large protein complex. This tool can recognize possible shapes for novel single particles and estimate the number of conformations that can be present in experimental data.
LowEtAl2018
Allows characterization of population activity as a trajectory on a nonlinear manifold. This algorithm captures correlations between neurons and temporal relationships between states, constraints arising from underlying network architecture and inputs. Furthermore, it can find broader use in probing the organization and computational role of circuit dynamics in other brain regions.
Modular inverse reinforcement learning
Permits estimation of both rewards and discount factors from human behavioral data. 'Modular inverse reinforcement learning' consists of an algorithm that enables predictions of human navigation behaviors in virtual reality across different subjects and with different tasks. Moreover, it supplies a strategy for estimating the subjective value of actions and how they influence sensory-motor decisions in natural behavior.
SQUICH / SeQUential DepletIon and enriCHment
Consists of a method for molecular sampling. This approach uses computations performed by molecular ensembles to encode the abundance of each species in a sample before measurement. It can quantify each of a large number of species of molecules in a pool. It is useful for measuring massive single-cell RNA profiles. This algorithm enables logarithmic or even sub-logarithmic sampling for precision desired in ubiquitous sequencing applications.
JiEtAl2018
Automates the classification of cells into cell types. This method combines prior knowledge with observed cytometry data. It is based on a Bayesian solution, allowing users to integrate biologically meaningful prior information that captures the domain expertise of human experts. This approach returns hierarchically structured classifications of individual cells that model the tree-structured recursive process of manual gating.
muMAPseq / multisource Multiplexed Analysis of Projections by sequencing
Determines mesoscale connectivity networks in individual animals. muMAPseq constructs relevant mesoscale connectivity atlases for individual labs’ particular model system. It offers a systematic foundation for investigating circuits in mouse models in which connectivity deviates from that of C57BL/6J males. This method is useful in a wide range of non-standard animal model systems, including peromyscus, voles, marmosets and others.
MajumderEtAl2018
Measures the activity of daptomycin on Staphylococcus aureus strains with different membrane compositions. This method employs an artificial neural network (ANN) model to relate membrane composition to activity. This approach takes into consideration the effect of the same drug candidate on multiple membrane compositions.
ChaiEtAl2018
Provides a logistic regression model combining semi-supervised learning and active learning for disease classification. This algorithm does not require significant engineering overhead and uses unlabeled gene expression samples in disease classification. Its regression model is based on the complementarity of semi-supervised learning and active learning. It can also reduce the number of falsely pseudo-labeled samples via a mechanism, embedded in the method, that updates the pseudo-labeled samples.
HeEtAl2018
Consists of a two-stage biomedical event trigger detection approach. This method includes two subtasks: trigger recognition and classification. It is able to alleviate the problem of class imbalance, and different features are selected in each stage. This approach also integrates word embeddings for representing words semantically and syntactically. It was evaluated on the multi-level event extraction (MLEE) corpus test dataset.
Covariate Assisted Principal regression
Identifies the components predicted by linear models of the covariates. Covariate Assisted Principal regression is an algorithm for multiple covariance matrix outcomes. This method avoids the massive number of hypothesis testing suffered in the element-wise regression approach. Applied to resting-state functional magnetic resonance imaging data, this approach identifies the human brain network changes associated with age and sex.
OPLRAreg
Helps users develop quantitative structure‐activity relationship (QSAR) models. OPLRAreg is a piecewise linear regression algorithm that can determine features to separate the data into regions and detect linear equations to predict the outcome variable in each region. This algorithm is designed to permit researchers to add customized constraints to the model.
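Piecewise linear regression can be illustrated with a two-region fit on a single feature. This is a simplified sketch that finds the split by exhaustive search over candidate thresholds; it is not the optimization procedure used by OPLRAreg, and the function names and the `min_points` parameter are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def best_two_region_fit(xs, ys, min_points=2):
    """Try every split of the sorted data and keep the lowest total error.

    Returns (total_sse, threshold, (slope, intercept) left, (slope, intercept) right).
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for split in range(min_points, len(pairs) - min_points + 1):
        left, right = pairs[:split], pairs[split:]
        left_fit = fit_line(*zip(*left))
        right_fit = fit_line(*zip(*right))
        total = left_fit[2] + right_fit[2]
        if best is None or total < best[0]:
            threshold = (left[-1][0] + right[0][0]) / 2
            best = (total, threshold, left_fit[:2], right_fit[:2])
    return best
```

Each region gets its own linear equation, and a new sample is predicted with the equation of the region its feature value falls into.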
AIM-SNPtag
Chooses the most membership informative single nucleotide polymorphisms (SNPs) that can be potentially applied to forensic science. AIM-SNPtag can find ancestry-informative markers (AIMs) for ancestry or membership inference. It was assessed with the Monte Carlo cross-validation procedure. This tool can be applied to multiple-population genome-wide SNP data. It is useful for deducing an individual’s continental or biogeographic origins.
S-QFC / Secure Quaternion Feistel Cipher
Offers a method dedicated to the encryption of DICOM images. S-QFC is a quaternion encryption algorithm based on a modified Feistel network with a modular arithmetic in the quaternion field. This approach intends to improve the efficiency of the security process by the exploitation of a both-sided, modular matrix multiplication coupled to the use of quaternion Julia sets and of a fractal division process.
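The structure of a Feistel network with modular arithmetic can be sketched on plain integers. This is a toy illustration, not S-QFC: it works on integers modulo 256 rather than quaternions, the round function and keys are made up for the example, and it offers no security.

```python
MODULUS = 256  # hypothetical modulus, e.g. one 8-bit pixel value per half


def round_function(value, key):
    # Toy round function; a real cipher would use a far stronger one.
    return (value * value + key) % MODULUS


def feistel_encrypt(left, right, round_keys):
    # Each round swaps the halves and mixes one half into the other
    # with modular addition (used here instead of XOR).
    for key in round_keys:
        left, right = right, (left + round_function(right, key)) % MODULUS
    return left, right


def feistel_decrypt(left, right, round_keys):
    # Undo the rounds in reverse order with modular subtraction.
    for key in reversed(round_keys):
        left, right = (right - round_function(left, key)) % MODULUS, left
    return left, right
```

The round function never has to be inverted: decryption only recomputes it and subtracts, so any round function yields an invertible cipher.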
WangEtAl2018
Investigates the noise performance of short-pulse lasers using dynamical methods. This approach is useful for optimizing the design of short-pulse lasers. It employs a parabolic gain profile for the amplifier. This tool is useful for characterizing a laser that is locked using a fast saturable absorber and a laser that is locked using a slow saturable absorber.
SSC-LRR / self-training Subspace Clustering algorithm under Low-Rank Representation
Assists users in cancer classification on gene expression data. SSC-LRR consists of an algorithm that integrates self-training subspace clustering (SSC) and low-rank representation (LRR). It considers three characteristics of gene expression data: the high dimensionality, the small sample size, and the existence of unlabeled data. Moreover, this algorithm incorporates a self-training technique to exploit information from unlabeled gene expression data.
iSCHRUNK / in Silico approach to CHaracterization and Reduction of UNcertainty in the Kinetic models
Allows users to characterize uncertainties and uncover intricate relationships between the parameters of kinetic models and the responses of the metabolic network. iSCHRUNK combines parameter sampling and machine learning techniques. It allows users to identify a small number of parameters that determine the responses in the network regardless of the values of other parameters.
IUP / Iteratively Updated Priors
Executes successive personalisations of the cases in a population in large databases. IUP performs successive personalisations through maximum a posteriori (MAP) where the prior probability at an iteration is set from the distribution of personalized parameters in the database at the previous iteration. This leads the parameters to lie on a reduced linear subspace dimension in which for each case of the database there is a possibly unique parameter value for which the simulation fits the measurements.
DuboseEtAl2018
Consists of retina layer-specific statistical intensity models of the optical coherence tomography (OCT) images. This approach presents physically derived and empirically validated layer-specific statistical models of the intensity in retinal OCT images, which were used to calculate the unbiased and biased Cramer-Rao lower bounds (CRLB) for estimating the layer boundary locations in retinal OCT images. These statistical models can serve for improvements to OCT image denoising, reconstruction, and other applications.
CAUSAL-Imp
Computes summary statistics for unobserved single nucleotide polymorphisms (SNPs) by conditioning on the statistics of the observed SNPs and given causal status. CAUSAL-Imp combines the principle of fine mapping and summary statistics imputation. It can impute the association statistics at untyped variants while taking into account variants in the region that may affect the trait. This method considers all the possible causal statuses where any subset of SNPs can be causal.
VSRFM / Variants Stacked Random Forest Model
Predicts the effect of variants. VSRFM is a stacked meta learner for deleteriousness classification, that was built using supervised machine learning over a composed data set which contains pathogenic and benign variants, obtained selecting unique variants from five benchmark datasets HumVar, ExoVar, VariBench, predictSNP and SwissVar. This model was constructed for variants not involved in splicing. It can be useful for improving pathogenic mutation detection.
VSRFM-s / Variants Stacked Random Forest Model for splicing
Allows users to perform pathogenic prediction. VSRFM-s consists of a variants stacked random forest model for variants affected by splicing. It was built using supervised machine learning over a composed data set which contains pathogenic and benign variants, obtained selecting unique variants from five benchmark datasets HumVar, ExoVar, VariBench, predictSNP and SwissVar. This program can be used for deleteriousness classification and for improving pathogenic mutation detection.
EDES / Ensemble-Docking with Enhanced-sampling of pocket Shape
Exploits short metadynamics simulations of the apo protein of interest to generate a set of druggable (holo-like) conformations. EDES was developed with a set of collective variables to sample, in a controlled manner, maximally different shapes of the binding site, and a multi-step clustering strategy that retains a large fraction of holo-like structures within the pool of cluster representatives. This method can be employed in ensemble-docking.
POET / Population Outcome Enrichment Technique
Reveals subpopulations in which the pharmacological responses to compounds agree or diverge. POET is a population segmentation algorithm consisting of an unsupervised machine learning technique. This method discovers subpopulations of cell lines in which two or more compounds, possibly addressing the same disease state or targeting the same genetic alteration, have a common pharmacological pattern of response. It can integrate multiple measures of drug response to identify subpopulations that differentiate response to inhibitors of the same or different targets.
MCE / Markov Chain Entropy
Aims to represent the potency of a sample with a single-cell RNA sequencing (scRNA-Seq) or bulk RNA sequencing (RNA-Seq) profile. MCE is a program that requires the normalized RNA-seq profile and a connected signaling interaction network between the genes defined in the profile to work. Moreover, this method can serve for inferring a gene regulatory network from the scRNA-Seq data itself.
ChamanzarEtAl2018
Allows users to identify cortical spreading depolarizations (CSDs) using electroencephalography (EEG) signals. This algorithm intends to detect different types of CSD waves, including narrow and complex patterns of CSD, using HD-EEG under specific conditions. Its analysis aims at being noninvasive and automated. This approach was tested on simulated electroencephalography (EEG) signals.
GT-TS / Good-Toulmin like estimator via Thompson Sampling
Provides an approach to experimental design for cell type discovery. GT-TS uses information across tissues to inform subsequent experiments in order to maximize cell type diversity and discovery. This method can be immediately applied to improve the effectiveness of experimental studies with alternative goals, such as designing sampling techniques for diversifying location-dependent tumor cell type heterogeneity.
MRH-SiNeC / Multi Reference Hill-climbing SIgnaling Network Constructor
Enables signaling network construction. MRH-SiNeC is a method for inferring topology of the signaling network using multiple reference networks, RNA interference (RNAi) data and the phylogenetic distances between these networks. It is based on the conjecture that the topological distances between the signaling networks of different species depend on their evolutionary distance. This method starts by applying SiNeC on each individual reference network and removes any inconsistency imposed by the RNAi constraints.
MR-SiNeC / Multi Reference Signaling Network Constructor
Enables signaling network construction. MR-SiNeC is a method for inferring topology of the signaling network using multiple reference networks, RNA interference (RNAi) data, and the phylogenetic distances between these networks. It is based on the conjecture that the topological distances between the signaling networks of different species depend on their evolutionary distance. This method reduces the running time by combining all the individual reference networks as the starting point.
SiNeC / Signaling Network Constructor
Provides a method for reconstructing the topology of a signaling network. SiNeC proceeds in three steps: (1) it estimates the approximate ordering of the critical genes in the reference network; (2) it removes edges that are in conflict with an order from the reference network; and (3) it inserts the missing edges that are necessary to ensure the flow between consecutive critical genes and the consistency of the remaining genes in the reference network.
S-SiNeC / Scalable Signaling Network Constructor
Enables large-scale signaling network reconstruction. S-SiNeC can construct networks involving hundreds of proteins with minimum sacrifice in optimality. This method has polynomial time complexity, but may fail to return a network that satisfies all the constraints enforced by the RNA interference (RNAi) data. It can be useful for biologists to construct novel signaling networks from in vivo or in vitro screening experiments.
RMODI / Regression MODelability Index
Forecasts the results of regression models for datasets of molecules. RMODI is an index that considers the nearest neighbors of each molecule and the cardinality of its neighborhood. This algorithm permits users to avoid unnecessary tasks or to clean up the molecule composition of a dataset of interest. It was tested on forty datasets gathered from different sources.
BaoEtAl2018
Predicts potential modified sites via a machine learning method based on features of amino acid residues. This algorithm includes a step that deletes redundant candidate samples. It builds support vector machine (SVM) and multi-layer neural network models to predict modified and non-modified sites based on the selected features. The SVM model also allows identification of post-modification residues in the field of proteomics.
GeRe-ILP
Classifies oriented gene orders, taking into account three types of weighted rearrangement operations: transposition, inversion, and inverse transposition. GeRe-ILP consists of an integer linear programming (ILP) approach offering exact minimum-weight genome rearrangement scenarios for signed gene orders with arbitrary weights. This method is useful for different types of rearrangement operations.
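The three rearrangement operations can be illustrated on a signed gene order stored as a list of non-zero integers. This is a minimal sketch of the operations and of summing a scenario's weight; the index conventions and the weight values are assumptions, and the ILP formulation that GeRe-ILP uses to find a minimum-weight scenario is not reproduced here.

```python
def inversion(order, i, j):
    """Reverse the segment order[i:j] and flip the sign of each gene in it."""
    return order[:i] + [-g for g in reversed(order[i:j])] + order[j:]


def transposition(order, i, j, k):
    """Exchange the adjacent segments order[i:j] and order[j:k]."""
    return order[:i] + order[j:k] + order[i:j] + order[k:]


def inverse_transposition(order, i, j, k):
    """Move the segment order[i:j] behind order[j:k] and invert it."""
    moved = [-g for g in reversed(order[i:j])]
    return order[:i] + order[j:k] + moved + order[k:]


# Hypothetical weights; the method itself accepts arbitrary weights.
WEIGHTS = {"inversion": 1, "transposition": 2, "inverse_transposition": 2}


def scenario_weight(operation_names):
    """Total weight of a rearrangement scenario given as a list of names."""
    return sum(WEIGHTS[name] for name in operation_names)


genes = [1, 2, 3, 4, 5]
after_inversion = inversion(genes, 1, 3)
after_transposition = transposition(genes, 1, 3, 5)
after_inverse_transposition = inverse_transposition(genes, 1, 3, 5)
```

Each function returns a new list, so the original gene order is left untouched and scenarios can be composed step by step.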
DECtp / Differential Expression Caller by combining tumor purity information
Detects differentially expressed genes (DEGs) between tumor and normal samples. DECtp is a method that relies on adjusting for tumor purity in differential expression (DE) calling. This approach generates a mixed Gaussian distribution by considering the expression profiles of tumor samples. Then, the algorithm performs a generalized least squares procedure and a Wald test to call differential expression.
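For readers unfamiliar with the final testing step, the sketch below shows a generic two-sided Wald test on a single estimated coefficient. It is a textbook illustration under a standard normal approximation, not DECtp's implementation; the function name and inputs are hypothetical, and the purity adjustment and generalized least squares fit that would produce the estimate are not shown.

```python
import math


def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided p-value.

    Assumes the estimate is approximately normal with the given standard
    error, so p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    """
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical effect estimate and standard error for one gene.
z_stat, p_val = wald_test(1.96, 1.0)
```

A gene would then be called differentially expressed when its p-value, usually after multiple-testing correction, falls below a chosen threshold.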
WangEtAl2018
Consists of a deep-learning algorithm, integrated with a multi-threaded processing system, for the automatic detection of polyps during colonoscopy. This algorithm can assist the assessment of differences in polyp and adenoma detection performance among endoscopists. It was validated using two image studies and two video studies.
deepMc / deep Matrix completion
Offers an imputation technique for single cell RNAseq (scRNAseq) data. deepMc is an application that does not assume any distribution for gene expression and is based on a combination of deep matrix factorization and deep dictionary learning methods. This program can be applied to large datasets and has been tested with datasets originating from four different studies.
NetGO
Aims to improve large-scale automated function prediction (AFP) with massive network information. NetGO is an AFP method that addresses: (1) the label side of the multilabel classification problem by using learning to rank (LTR) and (2) the instance (protein) side by incorporating network-based information. This approach enables the incorporation of network information at a large-scale level. It was validated by conducting comprehensive experiments on large-scale datasets under the critical assessment of functional annotation (CAFA) settings.
D-GPM / Deep-Gene Promoter Methylation Inference
Predicts whole genome promoter methylation levels, based on the methylation profile of the landmark genes. D-GPM is a multi-layer deep neural network whose performance was benchmarked against linear regression (LR), regression tree (RT), and support vector machine (SVM) on methylation profile data based on Illumina Human Methylation 450k from The Cancer Genome Atlas (TCGA).
miPrimer
Designs primer pairs with acceptable quantitative polymerase chain reaction (qPCR) efficiency for templates other than micro-RNA (miRNA). miPrimer is an empirical-based method developed by learning from several failed cases during miRNA primer design phases. Furthermore, it is able to distinguish members of the same miRNA family by increasing primer specificity while reducing the primer dimer issue.
Gene Ranker
Assists users with key gene identification in immune diseases. Gene Ranker is an in-silico method that initially constructs a backbone network based on protein interactions. It can predict key genes even when there are few known genes. It employs semi-supervised learning for gene scoring. This method is disease-specific and consists of three steps: (1) network construction, (2) network selection and integration, and (3) key gene scoring.
AIDE / Annotation-assisted Isoform Discovery and abundance Estimation
Verifies false isoform discoveries by implementing the statistical model selection principle. AIDE employs a stepwise likelihood-based selection approach to find gene and exon boundaries from annotations and borrow information from the annotated isoform structures. This method can determine the abundance of the identified isoforms in the process of isoform reconstruction.
MiMSeg / Mixture Model based Segmentation
Allows automated detection of tumor tissue on nuclear magnetic resonance (NMR) apparent diffusion coefficient maps. MiMSeg is an algorithm that enables users to reveal tumor heterogeneity by identifying clusters of Gaussian mixture model (GMM) components related to the homogeneous areas within tumor tissue.
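To make the role of the GMM components concrete, the sketch below assigns each intensity value to the one-dimensional Gaussian component with the highest weighted density. It is a generic illustration of mixture-model labeling, not MiMSeg's code; the component parameters are invented for the example, and fitting them (for instance with expectation-maximization) is not shown.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def assign_components(values, components):
    """Label each value with the index of its most probable component.

    components: list of (weight, mean, std) tuples describing the mixture.
    """
    labels = []
    for x in values:
        scores = [w * gaussian_pdf(x, m, s) for w, m, s in components]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels


# Hypothetical two-component mixture and a few hypothetical map values.
mixture = [(0.5, 0.8, 0.1), (0.5, 1.6, 0.2)]
labels = assign_components([0.75, 1.5, 0.9, 1.7], mixture)
```

In a segmentation setting, pixels sharing a label would form one candidate homogeneous region.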
AdResS / Adaptive Resolution Simulation
Allows an on-the-fly interchange between the atomistic (AT) and coarse-grained (CG) description (and vice versa) of the molecules according to their position in space. AdResS is a method that permits users to control basic thermodynamic and structural properties in the transition region.
BurdickEtAl2018
Serves for sepsis prediction and detection. This program consists of a machine learning algorithm (MLA) that can determine risk of sepsis using data from patient electronic health records. This tool is designed to improve patient outcomes in a variety of clinical settings. Moreover, it can determine severe sepsis using six frequently collected patient measurements.
APD / Advanced Peak Determination
Performs peak detection in survey scans (MS1) to increase the number of precursors selected for unimolecular dissociation (MS2). APD is a peak picking algorithm developed to increase the number of peptides identified in label-free peptide experiments. Its benefit comes from its ability to identify overlapping isotope distributions for MS2 acquisition. This algorithm should not be used in combination with MS2-based quantitative proteomic analyses employing isobaric mass tag labeling.
ConDock
Predicts physically plausible ligand binding sites by combining information from ligand docking and surface conservation. ConDock is a hybrid strategy that combines information from surface conservation with intermolecular interactions from docking calculations. This method was used to predict viable ligand binding sites for four different G-protein coupled estrogen receptor (GPER) ligands.
PREMONition / PREdicting Molecular Networks
Predicts molecular circadian clock associations using functional relationships. PREMONition is an algorithm based on the incorporation of proteins encoded by known clock genes (when available), rhythmically expressed clock-controlled genes and non-rhythmically expressed but interacting genes into a cohesive network. The software can be used to identify candidate clock-regulated processes and thus candidate clock genes in other organisms.
FuncSFA / Functional Sparse-Factor Analysis
Furnishes a continuous characterization and a functional interpretation of the variation across tumors at the molecular level. FuncSFA is composed of three parts designed to: (1) compute the sparse-factor analysis to obtain the factors, (2) interpret the obtained factors in terms of the possible biological processes they represent, and (3) reveal the biological processes likely giving rise to the molecular profiles observed for a given sample.
SMMN / Specific Modules in Multiple Networks
Discovers condition-specific modules by considering multiple networks. SMMN is a heuristic algorithm that characterizes condition-specific modules through the integrative analysis of multiple networks, which helps guarantee their specificity and modularity. The resulting modules capture various topological and functional features, providing insights into the mechanisms of cancers.
FuzCav
Allows systematic comparison of protein-ligand binding sites. FuzCav is a generic cavity fingerprint that identifies similarities between ligand-free and ligand-bound active sites. This algorithm does not require a prior 3D structural alignment of proteins to compare and is applicable to any druggable cavity from any protein class. It was used in scenarios such as (1) screening a collection of binding sites for similarity to different queries, (2) classifying protein families by binding site diversity, and (3) discriminating adenine-binding cavities from decoys.
Qiu2018
Allows users to cluster cells based on binarized data. This solution consists of a co-occurrence clustering algorithm that works with binarized single-cell RNA-seq count data. This tool can detect cell populations, as well as cell-type specific pathways beyond variable genes. Moreover, it proceeds in two steps: gene pathway identification and cell type discovery.
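The idea of working on binarized counts can be sketched as follows: each count is reduced to detected or not detected, and the co-occurrence of two genes is measured across cells. The detection threshold and the use of the Jaccard index as co-occurrence measure are assumptions made for this illustration; the published algorithm may use a different statistic.

```python
def binarize(counts, threshold=0):
    """Turn a cells-by-genes count matrix into 0/1 detection calls."""
    return [[1 if c > threshold else 0 for c in row] for row in counts]


def cooccurrence(binary, gene_a, gene_b):
    """Jaccard index of two gene columns across all cells (assumed measure)."""
    both = sum(1 for cell in binary if cell[gene_a] and cell[gene_b])
    either = sum(1 for cell in binary if cell[gene_a] or cell[gene_b])
    return both / either if either else 0.0


# Hypothetical counts: three cells (rows) by three genes (columns).
counts = [[3, 0, 1],
          [0, 0, 2],
          [5, 1, 0]]
binary = binarize(counts)
score_0_1 = cooccurrence(binary, 0, 1)
```

Genes with high pairwise co-occurrence could then be grouped into candidate pathways before cells are clustered on their activity.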
B-COSFIRE
Assists users in performing vessel segmentation in retinal fundus images. B-COSFIRE includes features for identifying patterns in videos. This method optimizes the suppressing mechanism for the filter input and output thresholds. Specifically, three parameters are optimized: the preprocessing threshold, the post-processing threshold, and the background artifact size.
FA-VNR / Fragment-Aware Virtual Network Reconfiguration
Allows fragment-aware virtual network reconfiguration. FA-VNR is a heuristic algorithm that selects (i) the set of virtual nodes to be migrated according to the fragment degrees of the physical nodes, and (ii) the best virtual node migration scheme according to the reduction of the fragment degrees of the physical nodes as well as the reduction of the embedding cost of the embedded virtual networks.
EP-DNN / Enhancer Prediction using Deep Neural Network
Allows users to determine enhancers based on chromatin features in different cell types. EP-DNN consists of a deep neural network-based global enhancer prediction algorithm. It enables researchers to detect enhancers in two distinct cell types, namely the human embryonic stem cell type (H1) and a differentiated primary lung fibroblast cell line (IMR90).
Snooker
Allows users to generate pharmacophore hypotheses for compounds binding to the extracellular side of the structurally conserved transmembrane (TM) domain. Snooker can be used for detecting receptor-specific ligands and ligand binding residues in cross-screen. Moreover, this tool is suitable for apo-proteins and can be applied to all receptors of the G-protein coupled receptors (GPCR) protein family.
EC-PSI / EC-Pfam Statistical Inferencing
Infers high confidence associations between enzyme commission (EC) numbers and Pfam domains. EC-PSI is designed to directly find associations from existing EC-chain associations from SIFTS and EC-sequence associations from SwissProt and TrEMBL. This algorithm collects and integrates a large number of existing EC-chain/sequence annotations, allowing it to deduce over 8000 direct EC-Pfam associations with respect to the manually curated InterPro database.
LPI-NRLMF / lncRNA-Protein Interactions prediction by Neighborhood Regularized Logistic Matrix Factorization
Predicts potential long non-coding RNA (lncRNA)-protein associations. LPI-NRLMF is a matrix factorization computational approach for uncovering lncRNA-protein relationships. This method adopts a semi-supervised learning strategy, which deduces unknown data mainly from known interactions and their similarities, so negative samples are not needed. It was assessed by performing a cross validation of known experimental lncRNA-protein scores.
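In logistic matrix factorization, each lncRNA and each protein is represented by a latent vector, and the interaction probability is the logistic function of their inner product. The sketch below shows only that scoring step; the latent vectors are invented for the example, and the neighborhood regularization and the training procedure are not shown.

```python
import math


def interaction_probability(lncrna_factors, protein_factors):
    """Logistic function of the inner product of two latent vectors."""
    score = sum(a * b for a, b in zip(lncrna_factors, protein_factors))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical latent vectors of length 3.
u = [0.4, -0.2, 1.0]
v = [1.0, 0.5, 0.3]
p_interaction = interaction_probability(u, v)
```

Candidate lncRNA-protein pairs can be ranked by this probability, with higher values suggesting more likely interactions.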
AbassEtAl2018
Detects the position of the human eye limbus in three dimensions and measures the full 360° visible iris boundary. This approach presents a non-parametric method for eye limbus detection and a dynamic method for measurement of the white-to-white distance along the eye horizontal line, which is used as a predictor of the limbus, sulcus, and effective intraocular lens position (ELP) in some important clinical applications.
BARTMAP / Biclustering ARTMAP
Performs biclustering on gene expression data, particularly for cancer classification discovery. BARTMAP is a biclustering algorithm adapted to and modified from a neural-based classifier, Fuzzy ARTMAP. It consists of two Fuzzy ART modules that communicate through the inter-ART module. This method is able to detect atypical patterns during its learning. It can be used with high-dimensional data.
SB-CWT / Continuous-Wavelet-Transform-based Sub-Band rPPG
Serves for the decomposition of the RGB signals. SB-CWT enables the usage of a weighting function based on the global energy distribution which serves as an additional filter of undesired signal components. This tool can be used for combining the individual sub-band pulse signals into a single output pulse signal. Furthermore, this algorithm was tested on the publicly available MMSE-HR dataset.