Willows specifications


Unique identifier: OMICS_25290
Name: Willows
Software type: Application/Script
Interface: Graphical user interface
Restrictions to use: Academic or non-commercial use
Input data: A text file whose first line gives the type of each variable (response, nominal or ordinal); the variables may appear in any order.
Input format: TXT
Output data: A tree structure.
Operating system: Unix/Linux, Mac OS, Windows
Programming languages: C++, Java
Computer skills: Advanced
Version: 1.1
Stability: Stable
Maintained: Yes
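
The input layout listed in the specifications (a first line naming the type of each variable) can be sketched as a small reader. This is only an illustration and not Willows' own parser: the whitespace delimiter, the literal tokens "response", "nominal" and "ordinal", and the function name read_typed_table are all assumptions made for this example.

```python
# Hypothetical reader for a file whose first line declares variable types.
# Assumptions (not confirmed by the Willows documentation): fields are
# whitespace-separated and the type tokens are spelled out in full.

VALID_TYPES = {"response", "nominal", "ordinal"}


def read_typed_table(text):
    """Return (types, rows) parsed from the text of a typed table."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty input")

    # First non-blank line: one type token per column, in any order.
    types = [tok.lower() for tok in lines[0].split()]
    unknown = [tok for tok in types if tok not in VALID_TYPES]
    if unknown:
        raise ValueError(f"unknown variable type(s): {unknown}")

    # Remaining lines: one observation per line, same number of columns.
    rows = []
    for lineno, ln in enumerate(lines[1:], start=2):
        fields = ln.split()
        if len(fields) != len(types):
            raise ValueError(
                f"line {lineno}: expected {len(types)} fields, got {len(fields)}"
            )
        rows.append(fields)
    return types, rows


SAMPLE = """nominal response ordinal
0 1 2
1 0 1
"""

types, rows = read_typed_table(SAMPLE)
response_column = types.index("response")
print(types, rows, response_column)
```

Because the types may appear in any order, the sketch finds the response column by looking it up in the header instead of assuming a fixed position.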



  • Heping Zhang
  • Minghui Wang
  • Xiang Chen

Willows citations


Incorporating epistasis interaction of genetic susceptibility single nucleotide polymorphisms in a lung cancer risk prediction model

PMCID: 4902078
PMID: 27121382
DOI: 10.3892/ijo.2016.3499

[…] was selected as the one with the maximum prediction accuracy and cross-validation consistency and evaluated statistically using 1000-fold permutation test. For comparison, we used the freely available Willows software package for generating RF. RF ranks variables by a variable importance index, a measure which reflects the ‘importance’ of a variable on the basis of the classification accuracy, w […]


The phenotypic manifestations of rare genic CNVs in autism spectrum disorder

Mol Psychiatry
PMCID: 4759095
PMID: 25421404
DOI: 10.1038/mp.2014.150

[…] selected predictor variables (here, clinical phenotype variables). A collection of each of these trees is termed a forest. Here, a random forest analysis using the DOS command line version of the Willows software package was used to investigate the phenotypic differences between cases with and without CNVs impacting ASD/ID or DBE genes. Forests were created from 10 000 trees with a minimum ter […]


Data mining in the Life Sciences with Random Forest: a walk in the park or lost in the jungle?

Brief Bioinform
PMCID: 3659301
PMID: 22786785
DOI: 10.1093/bib/bbs034

[…] lled Random Jungle (RJ) was developed. It is currently the fastest implementation of RF, allows parallel computation of trees and is therefore very suited for the analysis of genome-wide data. The Willows package was also designed for tree-based analysis of genome-wide data by maximizing the use of computer memory. The WEKA workbench is a data mining environment that includes several mach […]


Bioinformatics challenges for genome wide association studies

PMCID: 2820680
PMID: 20053841
DOI: 10.1093/bioinformatics/btp713

[…] on (McKinney et al.). Advantages of this approach include its basis on decision trees and the availability of the algorithm in many different open source software packages including R. In fact, the Willows package was designed specifically for tree-based analysis of SNP data (Zhang et al.). […]


Willows institution(s)
Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT, USA
Willows funding source(s)
Supported by grants K02DA017713 and R01DA016750 from the National Institute on Drug Abuse.
