Privacy-preserving data exploration software tools | Genome-wide association analysis
Genome-wide association studies (GWAS) are a cornerstone of genotype–phenotype association in humans. These studies use various statistical tests to measure which polymorphisms in the genome are important for a given phenotype and which are not. With the increasing collection of genomic data in the clinic, there has been a push towards using this information to validate classical GWAS findings and generate new ones. Unfortunately, there is growing concern that the results of these studies might lead to loss of privacy for those who participate in them.
Aims to protect privacy in the transmission disequilibrium test (TDT). dpTDT is a differentially private mechanism based on test statistics, p-values, and shortest Hamming distance (SHD) scores. It protects privacy for trio families and can be extended to families with more than one child. The tool shows efficient approximation results, but the sensitivity of the SHD score is no longer guaranteed to be one.
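For readers unfamiliar with the test, the sketch below computes the standard TDT chi-square statistic and its p-value, then perturbs a released value with a generic Laplace mechanism. It is an illustration under stated assumptions, not the dpTDT implementation: the function names are hypothetical, and the sensitivity is left as a caller-supplied parameter because the correct bound depends on the quantity being released.

```python
import math
import random


def tdt_statistic(b: int, c: int) -> float:
    """TDT chi-square statistic from heterozygous-parent transmissions.

    b: times the allele of interest was transmitted to the affected child
    c: times it was not transmitted
    """
    if b < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    if b + c == 0:
        return 0.0  # no informative transmissions
    return (b - c) ** 2 / (b + c)


def chi2_1df_p_value(stat: float) -> float:
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def laplace_mechanism(value: float, sensitivity: float, epsilon: float,
                      rng: random.Random) -> float:
    """Add Laplace(0, sensitivity / epsilon) noise to a released value.

    The sensitivity is an assumption supplied by the caller; it is not
    derived here for any particular statistic.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with rate 1/scale
    # is Laplace-distributed with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return value + noise


stat = tdt_statistic(30, 10)  # (30 - 10)^2 / (30 + 10) = 10.0
p_value = chi2_1df_p_value(stat)
noisy_stat = laplace_mechanism(stat, sensitivity=1.0, epsilon=1.0,
                               rng=random.Random())
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of accuracy.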
Enables checking the quality of data in a privacy-preserving way without revealing sensitive information to a potential adversary. Secure Quality Control (SQC) employs state-of-the-art cryptographic and statistical techniques for privacy protection. The meta-analysis pipeline has been run on real data to demonstrate its efficiency and scalability on commodity machines. It offers an effective balance between researchers' need for genome-wide association studies (GWAS) meta-analysis and data owners' need to respect the genetic privacy of research participants.
A framework to facilitate secure rare variant analysis with a small sample size. The algorithm design aims to reduce the computational and storage costs of learning a homomorphic exact logistic regression model (i.e., evaluating p-values of coefficients), with a circuit depth proportional to the logarithm of the data size.
Aims to support real-time infectious disease outbreak investigations and pathogen surveillance using genomic data. IRIDA is a platform for analytics and visualizations of whole genome sequencing (WGS)-based microbial pathogen investigations that: (1) allows data management and controlled, collaborative data sharing, (2) provides analysis pipelines for public health genomics, (3) enables data integration by implementing ontologies, and (4) permits data visualization.
Calculates the neighbor distance. DiffPriv ensures the privacy of individuals in both the case and control cohorts. It follows three steps: (1) stating the problem as an optimization problem; (2) solving a relaxation of this problem in constant time; and (3) rounding the relaxed solution to a solution to the original problem. The tool was applied to real genome-wide association studies (GWAS) data. It allows researchers access to the database while minimizing privacy concerns.
Allows outsourcing genetic tests using Software Guard Extensions (SGX). PRESAGE supports genomic queries, which count genomic records by matching a set of biomarkers in the VCF files. It provides a remote attestation step that identifies a trustworthy enclave and builds a secure channel between the data owner/users and the enclave. The tool is able to defend against malicious attacks thanks to the implementation of a minimal perfect hash (MPH) scheme.
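The count query described above can be sketched in plain Python. This is a minimal illustration only: it runs outside any enclave, performs no attestation or encryption, and uses an ordinary Python set where PRESAGE uses a minimal perfect hash. The function names and the choice of (CHROM, POS, REF, ALT) as the biomarker key are assumptions made for the example.

```python
from typing import Iterable, Set, Tuple

# Hypothetical biomarker key: (CHROM, POS, REF, ALT)
Variant = Tuple[str, int, str, str]


def parse_vcf(text: str) -> Set[Variant]:
    """Collect the variants listed in one VCF text (one genomic record)."""
    variants: Set[Variant] = set()
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information, and the header
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a complete data line
        chrom, pos, _id, ref, alt = fields[:5]
        for allele in alt.split(","):  # ALT may list several alleles
            variants.add((chrom, int(pos), ref, allele))
    return variants


def count_matching(records: Iterable[Set[Variant]],
                   biomarkers: Set[Variant]) -> int:
    """Count the records that contain every queried biomarker."""
    return sum(1 for record in records if biomarkers <= record)
```

A caller would parse each VCF once, then issue queries such as `count_matching(records, {("1", 100, "A", "G")})` against the parsed records.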
Provides computational efficiency for real-world, secure, international collaboration on rare disease analysis. PRINCESS applies lightweight cryptographic technologies and uses the Software Guard Extensions (SGX) computing architecture. It is designed to be scalable and easy to extend, with support for plug-in modules for new features and tasks. The tool enhances the protection of encrypted data by using a time-varying initialization vector.