Quality assessment software tools | Genome-wide association study data analysis
GWAS result files are prone to errors due to the vast amount of data they contain and the different ways in which these data are generated by individual cohorts. Before combining data from individual studies in a meta-analysis, it is important to ensure that all included data are valid, of high quality and compatible between cohorts, in order to reduce both false-positive and false-negative findings.
Implements an end-to-end multiparty computation (MPC) protocol for secure genome-wide association studies (GWAS). Secure-GWAS is suited to matrix multiplication, exponentiation and iterative algorithms with extensive data reuse patterns. The program uses cryptographic pseudorandom generators (PRGs) to reduce the overall communication cost. The main protocol can be used to detect a small number of significantly associated single-nucleotide polymorphisms (SNPs).
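One common way a shared PRG can cut communication is sketched below, as an illustration only. It assumes two-party additive secret sharing over a prime field, and uses SHA-256 in counter mode as a stand-in for a real cryptographic PRG. The names `prg`, `share`, `reconstruct` and the modulus `PRIME` are hypothetical and are not taken from Secure-GWAS.

```python
import hashlib
from typing import List

# Hypothetical field modulus for the sketch (a Mersenne prime).
PRIME = 2**61 - 1


def prg(seed: bytes, n: int) -> List[int]:
    """Expand a short seed into n pseudorandom field elements.

    SHA-256 in counter mode is only a stand-in here; a real protocol
    would use a vetted cryptographic PRG.
    """
    out = []
    for counter in range(n):
        digest = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        out.append(int.from_bytes(digest, "big") % PRIME)
    return out


def share(secret: List[int], seed: bytes) -> List[int]:
    """Return the share that must actually be transmitted (secret minus mask)."""
    mask = prg(seed, len(secret))
    return [(x - r) % PRIME for x, r in zip(secret, mask)]


def reconstruct(share_a: List[int], share_b: List[int]) -> List[int]:
    """Add the two additive shares back together."""
    return [(a + b) % PRIME for a, b in zip(share_a, share_b)]


seed = b"seed agreed with party A in advance"
genotype_counts = [12, 0, 7, 3]

share_b = share(genotype_counts, seed)      # sent over the network to party B
share_a = prg(seed, len(genotype_counts))   # derived locally by party A, nothing sent

print(reconstruct(share_a, share_b) == genotype_counts)
```

Because party A regenerates its share from the seed, only one of the two shares crosses the network, which roughly halves the traffic for this step.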
Enables tightly integrated comparative variant analysis and visualization of thousands of next generation sequencing (NGS) data samples and millions of variants. BasePlayer is a highly efficient and user-friendly software for biological discovery in large-scale NGS data. It transforms an ordinary desktop computer into a large-scale genomic research platform, enabling even non-technical users to perform complex comparative variant analyses, population frequency filtering and genome-level annotations. Its intuitive, scalable and highly responsive user interface facilitates everyday genetic research as well as the search for novel discoveries.
Provides advanced functionality (i) to perform file-level quality control (QC) of single genome-wide association (GWA) data-sets; (ii) to conduct quality control across several GWA data-sets (meta-level QC); (iii) to simplify data-handling of large-scale GWA data-sets.
Supports quality control and analysis of genome-wide association studies (GWAS). GWASTools provides functions for interactive investigation and includes intensity data. It can be used to verify pedigrees for accuracy, as well as to infer pairwise relationships. This tool can plot kinship coefficients and offers several other plot types, including genotype cluster plots, B allele frequency (BAF)/log R ratio (LRR) plots with chromosome ideograms, quantile-quantile plots and Manhattan plots.
An R package for fast quality control and data handling of multiple data files obtained from genome-wide association studies (GWAS). Designed as a preprocessing tool for the meta-analysis of GWA data, GWAtoolbox can process multiple GWA data files in a few minutes. Output consists of an extensive list of quality statistics and graphical output, giving a comprehensive overview of the data that are going to be meta-analyzed.
An R package that automates the quality control of genome-wide association result files. Its main purpose is to facilitate the quality control of a large number of such files before meta-analysis. Alternatively, it can be used by individual cohorts to check their own result files. QCGWAS is flexible and has a wide range of options, allowing rapid generation of high-quality input files for meta-analysis of genome-wide association studies.
A pedigree-based analysis pipeline: a suite of programs geared towards SNP and sequence data. PBAP performs quality control, marker selection and file preparation. PBAP sets up files for MORGAN, which can handle analyses for small and large pedigrees, typically human, and results can be used with other programs and for downstream analyses.
Estimates Bayes and local Bayes false discovery rates (FDR) for replicability analysis. Repfdr provides a method for performing the analysis, together with its theoretical justification. The approach is a general method for assessing replicability across several studies when each study examines the same hypotheses. It can be used for applications such as genome-wide association studies (GWAS) and others, as long as the marginal and non-null densities can be reasonably well approximated for each study.
Provides approaches for efficient exploration and management of phenotype data. Proper QC of phenotypes before proceeding to the association analysis is critical to ensure control of type I and II errors, reliable effect estimates and consistent results between studies. PhenoMan is highly beneficial for the preparation of qualitative and quantitative trait data for association studies using new datasets as well as those obtained from public repositories.
Offers an assortment of tools suited for sequence analysis. Japsa is an open source package that gathers more than 20 tools, including a Java library and an API. The application provides a wide range of functionalities that allow users to split multi-sequence files, to perform real-time identification of antibiotic resistance genes with Oxford Nanopore sequencing, and to normalize the branch lengths of a phylogeny.
An appropriate statistical method for genome-wide association study analysis of biomarkers whose measurements are constrained by limits of detection. lodGWAS depends upon the “survival” R package. It appropriately treats non-detects as censored data, and performs a genome-wide parametric survival analysis by including both ‘measured’ and ‘censored’ values. In this way, it allows full use of the available data.
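The idea of treating non-detects as censored observations can be sketched as a likelihood, shown here in Python rather than R. This is a minimal illustration, not lodGWAS code: it assumes a normally distributed (for example log-transformed) biomarker, left-censoring at the detection limit, and a linear effect of genotype dosage. The function and parameter names (`censored_loglik`, `intercept`, `beta`, `sigma`) are hypothetical.

```python
import math
from typing import List, Tuple

# (value, genotype dosage, is_censored).
# For censored rows, `value` holds the limit of detection, not a measurement.
Obs = Tuple[float, float, bool]


def normal_logpdf(x: float, mu: float, sigma: float) -> float:
    """Log density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_logcdf(x: float, mu: float, sigma: float) -> float:
    """Log of P(X <= x) for X ~ N(mu, sigma^2), via the complementary error function."""
    z = (x - mu) / sigma
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def censored_loglik(data: List[Obs], intercept: float, beta: float, sigma: float) -> float:
    """Log-likelihood using both measured and left-censored observations."""
    total = 0.0
    for value, dosage, censored in data:
        mu = intercept + beta * dosage
        if censored:
            # Non-detect: all we know is that the true value lies below the limit.
            total += normal_logcdf(value, mu, sigma)
        else:
            # Measured value: contributes its ordinary density.
            total += normal_logpdf(value, mu, sigma)
    return total


data = [
    (1.8, 0.0, False),
    (2.4, 1.0, False),
    (0.5, 0.0, True),   # below a detection limit of 0.5
    (3.1, 2.0, False),
    (0.5, 1.0, True),   # below a detection limit of 0.5
]
print(censored_loglik(data, intercept=1.0, beta=0.8, sigma=1.0))
```

Maximizing this function over `intercept`, `beta` and `sigma` for each SNP would give a censored-regression estimate of the genotype effect. Discarding the censored rows, or substituting a fixed value for them, would throw that information away.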
Corrects genotype calls to improve genetic mapping in F2 and recombinant inbred line (RIL) populations. Genotype-Corrector simplifies genetic map construction in real F2 populations, which contain more heterozygous loci than RIL populations. It can be applied to whole-genome genetic mapping studies. This tool provides two major parameters: the error rate of the homozygous genotype calls and the size of the sliding window.
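To give a feel for what a sliding window does to noisy calls along a chromosome, here is a simplified sketch. It is not Genotype-Corrector's algorithm: it uses a plain majority vote and only the window-size parameter, ignoring the homozygous error rate. The call codes (`A`, `B` for the two homozygous classes, `H` for heterozygous) and the function name `smooth_calls` are assumptions made for the illustration.

```python
from collections import Counter
from typing import List


def smooth_calls(calls: List[str], window: int = 5) -> List[str]:
    """Replace each call by the strict-majority call in a centred window.

    Windows are truncated at the ends of the marker list. Votes are always
    counted on the original calls, never on already-corrected ones.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    corrected = []
    for i, call in enumerate(calls):
        lo = max(0, i - half)
        hi = min(len(calls), i + half + 1)
        best, best_n = Counter(calls[lo:hi]).most_common(1)[0]
        # Overwrite only when more than half of the window agrees on one call.
        corrected.append(best if best_n > (hi - lo) / 2 else call)
    return corrected


# Markers ordered along one chromosome of one F2 individual (made-up data).
raw = list("AAAABAAAHHHHHAHHHBBBB")
print("".join(smooth_calls(raw, window=5)))
```

Isolated calls that disagree with their neighbours are overwritten, while the true transitions between blocks are preserved. A larger window removes longer runs of errors but blurs breakpoints more.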
Inspects and reduces raw genome, metagenome, transcriptome and meta-transcriptome data. Kmernator can remove potentially undesirable artifacts, including adapters, low-quality bases, N bases and short sequences. It provides a collection of features to remove redundant or errant data, and the code scales well to hundreds of computers tackling tera-base sized datasets.
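The kind of artifact removal described above can be illustrated with a simple per-read filter. This is only a sketch of the general idea, not Kmernator's implementation. It assumes reads are held in memory as (name, sequence, quality) tuples with Phred+33 quality strings, and the thresholds and the names `keep_read` and `mean_quality` are hypothetical.

```python
from typing import List, Tuple

Read = Tuple[str, str, str]  # (name, sequence, Phred+33 quality string)


def mean_quality(qual: str) -> float:
    """Mean Phred score of a Phred+33 encoded quality string."""
    return sum(ord(c) - 33 for c in qual) / len(qual) if qual else 0.0


def keep_read(read: Read, min_length: int = 20,
              min_mean_q: float = 20.0, max_n: int = 0) -> bool:
    """Return True when a read passes the length, N-base and quality checks."""
    _, seq, qual = read
    if len(seq) < min_length:
        return False                      # too short
    if seq.upper().count("N") > max_n:
        return False                      # too many ambiguous bases
    return mean_quality(qual) >= min_mean_q


reads: List[Read] = [
    ("r1", "ACGTACGTACGTACGTACGTACGT", "I" * 24),  # good read, Q40 throughout
    ("r2", "ACGTNCGTACGTACGTACGTACGT", "I" * 24),  # contains an N base
    ("r3", "ACGTACGT", "I" * 8),                   # shorter than min_length
    ("r4", "ACGTACGTACGTACGTACGTACGT", "#" * 24),  # low quality, Q2 throughout
]

kept = [name for name, seq, qual in reads if keep_read((name, seq, qual))]
print(kept)
```

Each check is independent, so a pipeline working at scale could apply the same predicate to chunks of reads in parallel.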
Permits users to run large-scale genome-wide association studies. 'Caring without sharing' implements a genome-wide association study (GWAS) workflow: quality control (QC), population structure control, and association.
Aligns the genetics of neuropsychological traits to the molecular network of the human brain. This method standardizes the process of data integration at the genotype, phenotype and brain transcriptome levels. It is useful for conducting a variant-based, genome-wide association investigation focused on neuropsychiatric disorders. This tool can include whole-genome mRNA expression data of the developing human brain.
Automates the process of genotyping microsatellite repeats in Huntington disease (HD) data. ScaleHD is a pipeline designed for large-scale automated genotyping of HTT CAG/CCG repeat parallel sequencing data. It performs quality control, sequence alignment and genotyping on all file pairs presented by the user as input. The pipeline consists of three main stages: sequence quality control (SeqQC), sequence alignment (SeqALN) and automated genotyping (GType).
A program for transforming sets of genotype data for use with the programs SNPTEST and IMPUTE. GTOOL can be used to 1) generate subsets of genotype data, 2) convert genotype data between the PED file format and the file format used by SNPTEST and IMPUTE, 3) merge genotype datasets, and 4) orient genotype data according to a strand file.
A typical use of QCTOOL is to compute per-sample and per-SNP summary statistics for a cohort, and use these to filter out samples and SNPs (either by removing them from the files or by writing exclusion lists). QCTOOL can also be used to perform various subsetting and merging operations, and to manipulate sample information in preparation for association testing.
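The filter-by-summary-statistics workflow described above can be sketched in a few lines. This is an illustration in Python, not QCTOOL's interface or output format. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with `None` for missing calls, and the thresholds and function names (`snp_summary`, `exclusion_list`) are hypothetical.

```python
from typing import Dict, List, Optional

# Alternate-allele count per sample (0, 1 or 2); None marks a missing call.
Genotypes = List[Optional[int]]


def snp_summary(genotypes: Genotypes) -> Dict[str, float]:
    """Compute call rate and minor allele frequency (MAF) for one SNP."""
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes) if genotypes else 0.0
    if called:
        alt_freq = sum(called) / (2 * len(called))
        maf = min(alt_freq, 1.0 - alt_freq)
    else:
        maf = 0.0
    return {"call_rate": call_rate, "maf": maf}


def exclusion_list(snps: Dict[str, Genotypes],
                   min_call_rate: float = 0.95,
                   min_maf: float = 0.01) -> List[str]:
    """Return the IDs of SNPs that fail either threshold."""
    excluded = []
    for snp_id, genotypes in snps.items():
        stats = snp_summary(genotypes)
        if stats["call_rate"] < min_call_rate or stats["maf"] < min_maf:
            excluded.append(snp_id)
    return excluded


cohort = {
    "rs_a": [0, 1, 2, 1, 0, 1, 0, 2, 1, 1],            # passes both checks
    "rs_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],            # monomorphic, MAF = 0
    "rs_c": [0, 1, None, None, 2, 1, None, 0, 1, 1],   # call rate 0.7
}
print(exclusion_list(cohort))
```

Writing an exclusion list rather than deleting SNPs keeps the original files untouched, so thresholds can be revised later without regenerating the data. The same pattern applies per sample by summarizing across SNPs instead.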
Allows users to carry out genome-wide association studies (GWAS). Goldsurfer is a standalone software that can be used for managing a complete GWA project. It can analyze results from publicly available studies, assess stratification using statistical methods such as Principal Component Analysis (PCA), and evaluate genotype data for possible association. The software can analyze a dataset of more than 500 000 markers for as many as 5 000 samples.
Allows users to denoise next generation sequencing (NGS) pyrosequenced reads. MUGAN provides a platform dedicated to the removal of errors by exploiting data-level parallelism. The software couples multiple graphics processing units (GPUs) and central processing units (CPUs) with the AmpliconNoise software. It aims to denoise data faster and to provide improved visualization of error-correction and diversity-estimation results.