Cancer evolution software tools | Phylogenomics data analysis
Cancer can result from the accumulation of different types of genetic alterations, such as copy number aberrations. Tumor data are cross-sectional and do not record the temporal order of the genetic events. Determining the order in which these events occurred, and the resulting progression pathways, is vital to understanding the disease.
A generative probabilistic model for detecting patterns of various degrees of mutual exclusivity across genetic alterations, which can indicate pathways involved in cancer progression. TiMEx explicitly accounts for the temporal interplay between the waiting times to alterations and the observation time. In simulation studies, the model outperforms previous methods for detecting mutual exclusivity. On large-scale biological datasets, TiMEx identifies gene groups with strong functional biological relevance, while also proposing new candidates for biological validation. Its advantages over previous methods include a novel generative probabilistic model of tumorigenesis, direct estimation of the probability of mutual exclusivity interaction, computational efficiency, and high sensitivity in detecting gene groups involving low-frequency alterations.
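The interplay between waiting times and observation time can be illustrated with a small simulation. This is a minimal sketch and not TiMEx's full model: it assumes a single alteration with an exponentially distributed waiting time (hypothetical rate `lam`) and an exponentially distributed observation time with rate 1, and it leaves out the mutual exclusivity interaction entirely. Under these assumptions an alteration is seen only if it occurs before the tumor is sampled, which happens with probability `lam / (lam + 1)`.

```python
import random

def observed_fraction(lam, n_tumors=50_000, seed=0):
    """Fraction of simulated tumors in which the alteration is observed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tumors):
        t_alteration = rng.expovariate(lam)   # waiting time to the alteration
        t_observed = rng.expovariate(1.0)     # time at which the tumor is sampled
        if t_alteration < t_observed:         # alteration happened before sampling
            hits += 1
    return hits / n_tumors

def expected_fraction(lam):
    """Closed form P(T_alt < T_obs) for independent exponentials with rates lam and 1."""
    return lam / (lam + 1.0)
```

Comparing `observed_fraction(0.5)` with `expected_fraction(0.5)` shows why a low observed frequency may reflect a slow alteration rather than an unimportant one.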
Allows the identification of mutational signatures within a single tumor sample. The deconstructSigs approach determines the linear combination of pre-defined signatures that most accurately reconstructs the mutational profile of a single tumor sample. It uses a multiple linear regression model with the constraint that every coefficient must be non-negative, as negative contributions make no biological sense. Application of deconstructSigs identifies samples with DNA repair deficiencies and reveals distinct and dynamic mutational processes molding the cancer genome in esophageal adenocarcinoma compared to squamous cell carcinomas. deconstructSigs makes it possible to define mutational processes driven by environmental exposures, DNA repair abnormalities, and mutagenic processes in individual tumors, with implications for precision cancer medicine.
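The idea of reconstructing a profile as a non-negative combination of known signatures can be sketched as follows. This is an illustration only and not the package's own algorithm: it solves a plain non-negative least squares problem by projected gradient descent, and the function name, the three toy signatures, and the four mutation contexts are all hypothetical.

```python
def fit_signature_weights(profile, signatures, n_iter=2000):
    """Non-negative least squares by projected gradient descent.

    profile    -- mutation fractions per context (length m)
    signatures -- list of k reference signatures, each of length m
    Returns k non-negative weights w minimising
    ||profile - sum_j w[j] * signatures[j]||^2.
    """
    k, m = len(signatures), len(profile)
    # Safe step size: the trace of the Gram matrix bounds its largest eigenvalue.
    step = 1.0 / (2.0 * sum(x * x for sig in signatures for x in sig))
    weights = [0.0] * k
    for _ in range(n_iter):
        fitted = [sum(weights[j] * signatures[j][i] for j in range(k)) for i in range(m)]
        residual = [profile[i] - fitted[i] for i in range(m)]
        for j in range(k):
            gradient = -2.0 * sum(signatures[j][i] * residual[i] for i in range(m))
            # Gradient step, then projection onto the non-negative orthant.
            weights[j] = max(0.0, weights[j] - step * gradient)
    return weights

# Toy data (hypothetical): the profile is 0.6 * signature 0 + 0.4 * signature 2.
toy_signatures = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.4, 0.4],
]
toy_profile = [0.46, 0.10, 0.22, 0.22]
toy_weights = fit_signature_weights(toy_profile, toy_signatures)
```

On this toy input the fit recovers weights close to 0.6, 0 and 0.4, so the unused signature is driven to zero rather than to a negative value.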
Allows tumor growth simulation in C++. tumopp offers many options so that simulations can be carried out under various settings. These include how the cell division rate is determined, how daughter cells are placed, and how driver mutations are treated. Furthermore, to account for the cell cycle, a gamma distribution has been introduced for the waiting time involved in cell division. tumopp also allows simulations on a hexagonal lattice. tumopp was used to investigate how model settings affect the growth curve and the pattern of intratumor heterogeneity (ITH). Even under neutrality (with no driver mutations), tumopp produced dramatically variable patterns of ITH and tumor morphology, from tumors in which cells with different genetic backgrounds are well intermixed to irregularly shaped tumors with clusters of closely related cells. This result suggests caution when analyzing ITH data with simulations run under limited settings, and tumopp will be useful for exploring ITH patterns under various conditions.
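The role of gamma-distributed waiting times can be sketched in a few lines. This is a non-spatial toy written in Python, not tumopp's C++ implementation: the function and parameter names (`grow_population`, `shape_k`, `mean_wait`) are hypothetical, and cell placement, death, and driver mutations are ignored. With `shape_k = 1` the waiting time is exponential; larger values concentrate it around the mean, so divisions become more synchronous.

```python
import heapq
import random

def grow_population(n_cells, shape_k, mean_wait=1.0, seed=0):
    """Return the division times needed to grow from 1 cell to n_cells cells."""
    rng = random.Random(seed)
    scale = mean_wait / shape_k            # keeps the mean waiting time fixed

    def next_division(now):
        # Gamma-distributed waiting time until this cell divides.
        return now + rng.gammavariate(shape_k, scale)

    queue = [next_division(0.0)]           # scheduled division time of each living cell
    division_times = []
    while len(queue) < n_cells:
        now = heapq.heappop(queue)         # earliest cell to divide
        division_times.append(now)
        # The mother is replaced by two daughters, each with a fresh waiting time.
        heapq.heappush(queue, next_division(now))
        heapq.heappush(queue, next_division(now))
    return division_times
```

Calling `grow_population(64, 1.0)` and `grow_population(64, 10.0)` and comparing the returned times gives a quick feel for how the shape parameter changes the growth curve.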
Reconstructs evolutionary paths and ancestral genotypes from sequenced tumor samples. BML first estimates the probability P(g) that a particular combination of mutations (denoted by genotype g) reaches fixation in a cell population that has evolved from a normal cell genotype and will eventually attain a tumor cell genotype. BML uses both observed tumor samples and imputed evolutionary paths to estimate P(g). The evolutionary probabilities are represented by a Bayesian network (up to an overall normalizing factor) that is optimized for the best choice of imputed paths. Once a Bayesian network is selected, a recursive algorithm is used to infer the likely Evolutionary Progression Paths (EPP). This software package is freely available for download.
A software package to estimate mixture models of mutagenetic trees from observed cross-sectional data. Mutagenetic tree mixtures are probabilistic models that have been designed to describe evolutionary processes that are characterized by the accumulation of genetic changes. Mtreemix has been applied to model the development of drug resistance-associated mutations in the HIV genome and the accumulation of chromosomal gains and losses in tumor development.
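How a mixture of mutagenetic trees assigns a probability to an observed mutational pattern can be sketched as follows. This is a minimal illustration and not Mtreemix's code or estimation procedure: the data layout, the event names, and all probabilities are hypothetical. Each tree gives every event a predecessor and a conditional probability; an event cannot appear without its predecessor.

```python
from itertools import product

def tree_probability(genotype, parent, edge_prob):
    """Probability of a genotype (dict event -> 0/1) under one mutagenetic tree.

    parent[e] is the event that must precede e (None for children of the root);
    edge_prob[e] is P(e occurs | its predecessor has occurred).
    """
    prob = 1.0
    for event, present in genotype.items():
        par = parent[event]
        parent_present = True if par is None else bool(genotype[par])
        if not parent_present:
            if present:
                return 0.0      # event without its predecessor: impossible in this tree
            continue            # absent event below an absent predecessor: factor 1
        prob *= edge_prob[event] if present else 1.0 - edge_prob[event]
    return prob

def mixture_probability(genotype, components):
    """components: list of (weight, parent, edge_prob), weights summing to 1."""
    return sum(w * tree_probability(genotype, par, ep) for w, par, ep in components)

# Hypothetical two-component mixture: a chain A -> B -> C and a star (independent events).
events = ["A", "B", "C"]
chain = (0.7, {"A": None, "B": "A", "C": "B"}, {"A": 0.8, "B": 0.5, "C": 0.3})
star = (0.3, {"A": None, "B": None, "C": None}, {"A": 0.2, "B": 0.2, "C": 0.2})
components = [chain, star]

# Sanity check: the probabilities of all 2^3 patterns add up to one.
total = sum(
    mixture_probability(dict(zip(events, bits)), components)
    for bits in product((0, 1), repeat=len(events))
)
```

In this sketch the pattern with B but without A is impossible under the chain, yet it still receives probability from the star component, which is one reason to mix several trees.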
Reconstructs tumor subclonal phylogenies using somatic mutation cellularities in a patient's tumor sample(s). SCHISM combines information about somatic mutation cellularity (also known as mutation cancer cell fraction) across all tumor samples available from a patient in a hypothesis testing framework to identify the statistical support for the lineage relationship between each pair of mutations or mutation clusters. The results of the hypothesis tests are represented as a Cluster Order Precedence Violation (CPOV) matrix, which informs the subsequent step in SCHISM and ensures that candidate tree topologies comply with the lineage precedence rule. Next, an implementation of a genetic algorithm (GA) explores the space of tree topologies and returns a prioritized list of candidate subclonal phylogenetic trees that are most compatible with the observed cellularity data.
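The lineage precedence idea, that an ancestral cluster should not have lower cellularity than its descendant in any sample, can be sketched as follows. This is a simplified illustration and not SCHISM's implementation: it replaces the hypothesis test with a fixed, hypothetical `tolerance`, checks only direct parent-child edges, and all names and cellularity values are made up.

```python
def precedence_violation_matrix(cellularity, tolerance=0.05):
    """cellularity: dict cluster -> list of cellularities, one per sample.

    Returns violates[(a, b)] = True when a cannot be an ancestor of b, i.e.
    a's cellularity falls below b's by more than `tolerance` in some sample.
    """
    violates = {}
    for a in cellularity:
        for b in cellularity:
            if a != b:
                violates[(a, b)] = any(
                    ca + tolerance < cb
                    for ca, cb in zip(cellularity[a], cellularity[b])
                )
    return violates

def count_violations(parent_of, violates):
    """Number of parent -> child edges of a candidate tree that break the rule."""
    return sum(
        1
        for child, parent in parent_of.items()
        if parent is not None and violates[(parent, child)]
    )

# Hypothetical clusters measured in two samples.
cellularity = {"I": [1.0, 1.0], "II": [0.6, 0.3], "III": [0.2, 0.5]}
violates = precedence_violation_matrix(cellularity)

linear_tree = {"I": None, "II": "I", "III": "II"}     # I -> II -> III
branched_tree = {"I": None, "II": "I", "III": "I"}    # I -> II and I -> III
```

Here the linear tree breaks the rule once (cluster III exceeds cluster II in the second sample) while the branched tree does not, so a search over topologies would rank the branched tree higher.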
A specific probabilistic graphical model for the accumulation of mutations and their interdependencies. The Bayesian network models cancer progression by an explicit unobservable accumulation process in time that is separated from the observable but error-prone detection of mutations. Model parameters are estimated by an expectation-maximization algorithm and the underlying interaction graph is obtained by a simulated annealing procedure.
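The separation between a hidden accumulation process and error-prone detection can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's code: it assumes each mutation call is flipped independently with one hypothetical `error_rate`, and the distribution over true genotypes is a made-up example in which mutation 1 must precede mutation 2. Parameter estimation by expectation-maximization and the simulated annealing search over graphs are not shown.

```python
from itertools import product

def observation_probability(observed, true, error_rate):
    """P(observed | true) when every call is flipped independently with error_rate."""
    mismatches = sum(o != t for o, t in zip(observed, true))
    matches = len(true) - mismatches
    return error_rate ** mismatches * (1.0 - error_rate) ** matches

def marginal_probability(observed, true_distribution, error_rate):
    """P(observed), summing over the unobservable true genotypes."""
    return sum(
        p * observation_probability(observed, genotype, error_rate)
        for genotype, p in true_distribution.items()
    )

# Hypothetical hidden distribution: genotype (0, 1) is impossible because
# mutation 2 cannot occur before mutation 1.
true_distribution = {(0, 0): 0.40, (1, 0): 0.35, (1, 1): 0.25}

# The "impossible" pattern still gets positive probability through detection errors.
p_incompatible = marginal_probability((0, 1), true_distribution, error_rate=0.1)

# Sanity check: the observed patterns form a proper distribution.
total = sum(
    marginal_probability(obs, true_distribution, error_rate=0.1)
    for obs in product((0, 1), repeat=2)
)
```

Modeling errors this way lets observed data that contradict the assumed order be explained as noise instead of forcing the interaction graph to change.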