Haplotype assembly software tools | De novo genome sequencing data analysis
Haplotypes play a crucial role in genetic analysis and have many applications, such as genetic disease diagnosis, association studies, and ancestry inference. Advances in DNA sequencing technologies make it possible to obtain haplotypes from a set of aligned reads originating from both copies of a chromosome of a single individual. This approach is known as haplotype assembly.
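To make the problem concrete, here is a minimal sketch of one common way to score a candidate haplotype pair, the minimum error correction (MEC) cost. The text above does not name an objective, so MEC is an assumption here, as are the toy reads and the function names. Each read is a dict mapping a heterozygous site index to the observed allele (0 or 1).

```python
from itertools import product

def mismatches(read, hap):
    """Count sites where a read disagrees with a haplotype."""
    return sum(1 for pos, allele in read.items() if hap[pos] != allele)

def mec_cost(reads, hap):
    """MEC cost: each read is charged against whichever of the two
    complementary haplotypes it matches better."""
    comp = [1 - a for a in hap]
    return sum(min(mismatches(r, hap), mismatches(r, comp)) for r in reads)

def best_haplotype(reads, n_sites):
    """Exhaustive search, feasible only for a handful of sites."""
    # Fix the first allele to 0 so each complementary pair is scored once.
    candidates = ([0, *rest] for rest in product((0, 1), repeat=n_sites - 1))
    return min(candidates, key=lambda h: mec_cost(reads, h))

# Hypothetical aligned reads over four heterozygous sites.
reads = [
    {0: 0, 1: 0, 2: 1},
    {1: 0, 2: 1, 3: 1},
    {0: 1, 1: 1},
    {2: 0, 3: 0},
    {0: 0, 3: 0},  # conflicts with the phase implied by the other reads
]
hap = best_haplotype(reads, 4)
print(hap, [1 - a for a in hap], mec_cost(reads, hap))
```

The exhaustive search grows exponentially with the number of sites, which is why the tools listed here rely on heuristics, graph formulations, or parameterized algorithms instead.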
Assembles haplotypes from data produced by diverse sequencing technologies. HapCUT implements an approach for modeling and estimating h-trans error probabilities de novo, which reduces errors in assembled Hi-C haplotypes. It was assessed using data from fosmid-based dilution pool sequencing, 10X Genomics linked-read sequencing, single molecule real-time (SMRT) sequencing, and proximity ligation sequencing.
A software tool for phasing genomic variants using DNA sequencing reads, a task also called haplotype assembly. WhatsHap is a fixed-parameter tractable (FPT) approach with coverage as the parameter. WhatsHap is especially suitable for long reads, but also works well with short reads.
An algorithm for haplotype assembly of densely sequenced human genome data. The HapCompass algorithm operates on a graph where single nucleotide polymorphisms (SNPs) are nodes and edges are defined by sequence reads and viewed as supporting evidence of co-occurring SNP alleles in a haplotype.
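The graph described above can be sketched roughly as follows. This is a simplified illustration, not the HapCompass implementation: the edge attributes ("same" and "opposite" phase counts), the read representation, and the function name are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations

def build_snp_graph(reads):
    """Nodes are SNP indices; every pair of SNPs covered by one read gets
    an edge that tallies evidence for the two possible phasings."""
    edges = defaultdict(lambda: {"same": 0, "opposite": 0})
    for read in reads:
        # Each read is a dict {snp_index: allele}, with alleles coded 0/1.
        for (i, a), (j, b) in combinations(sorted(read.items()), 2):
            edges[(i, j)]["same" if a == b else "opposite"] += 1
    return dict(edges)

# Hypothetical reads covering three SNPs.
reads = [
    {0: 0, 1: 0, 2: 1},
    {1: 0, 2: 1},
    {0: 1, 1: 1},
]
graph = build_snp_graph(reads)
for edge, evidence in sorted(graph.items()):
    print(edge, evidence)
```

An edge whose "same" count dominates suggests that the two SNPs carry matching alleles on each haplotype, while a dominant "opposite" count suggests they alternate.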
Enables single individual haplotyping. ReFHap finds the best cut based on a heuristic algorithm for max-cut and then builds haplotypes consistent with that cut. The algorithm is able to perform whole-chromosome haplotyping. It was tested with preliminary real data from fosmid-based sequencing, and several simulation experiments were performed to test the behavior of ReFHap under a wide range of circumstances.
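The cut-then-consensus idea can be illustrated with a small sketch. The edge weighting and the local-search heuristic below are assumptions chosen for brevity and should not be read as ReFHap's actual scoring or search procedure. Fragments are dicts mapping a site index to an allele (0 or 1).

```python
from collections import Counter

def edge_weight(r1, r2):
    """Positive when two fragments mostly disagree on their shared sites."""
    shared = r1.keys() & r2.keys()
    return sum(1 if r1[s] != r2[s] else -1 for s in shared)

def greedy_max_cut(reads):
    """Local search: flip a fragment's side while that raises the cut weight."""
    n = len(reads)
    w = [[edge_weight(reads[i], reads[j]) for j in range(n)] for i in range(n)]
    side = [0] * n
    improved = True
    while improved:
        improved = False
        for i in range(n):
            # Flipping i moves same-side edges into the cut and cross edges out.
            gain = sum(w[i][j] if side[i] == side[j] else -w[i][j]
                       for j in range(n) if j != i)
            if gain > 0:
                side[i] = 1 - side[i]
                improved = True
    return side

def consensus_haplotype(reads, side, n_sites):
    """Majority vote per site after complementing fragments on side 1."""
    votes = [Counter() for _ in range(n_sites)]
    for read, s in zip(reads, side):
        for pos, allele in read.items():
            votes[pos][allele ^ s] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]

# Hypothetical fragments over four sites.
reads = [
    {0: 0, 1: 0, 2: 1},
    {1: 0, 2: 1, 3: 1},
    {0: 1, 1: 1},
    {2: 0, 3: 0},
]
side = greedy_max_cut(reads)
hap = consensus_haplotype(reads, side, 4)
print(side, hap)
```

Fragments that end up on opposite sides of the cut are treated as coming from different chromosome copies, and the second haplotype is the complement of the consensus.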
A framework that uses distributed computational resources for gene quantification in metagenomes. Tentacle is implemented using a dynamic master-worker approach in which DNA fragments are streamed via a network and processed in parallel on worker nodes. Tentacle is modular, extensible, and comes with support for six commonly used sequence aligners. It is easy to adapt Tentacle to different applications in metagenomics and easy to integrate into existing workflows.
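A dynamic master-worker pattern of this kind can be sketched in miniature. This is not Tentacle's code: threads stand in for networked worker nodes, a lookup of pre-assigned gene labels stands in for sequence alignment, and all names are hypothetical.

```python
import queue
import threading
from collections import Counter

def worker(tasks, results):
    """Pull chunks until a None sentinel arrives, then report local counts."""
    local = Counter()
    while True:
        chunk = tasks.get()
        if chunk is None:
            break
        for gene in chunk:  # stand-in for aligning one fragment to a gene
            local[gene] += 1
    results.put(local)

def run_master(chunks, n_workers=3):
    """Stream chunks to whichever worker is free, then merge the counts."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for chunk in chunks:
        tasks.put(chunk)
    for _ in threads:
        tasks.put(None)  # one stop signal per worker
    for t in threads:
        t.join()
    total = Counter()
    while not results.empty():
        total += results.get()
    return total

# Hypothetical chunks of fragments, already labeled with the gene they hit.
chunks = [["geneA", "geneB"], ["geneA"], ["geneC", "geneA"]]
total = run_master(chunks)
print(dict(total))
```

Because workers pull from a shared queue rather than receiving a fixed share up front, faster workers naturally take on more chunks.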
A probabilistic model for solving the haplotype assembly problem. The model has two mixture components representing the two haplotypes. Based on the optimized model, a quality score called the 'minimum connectivity' (MC) score is defined for each segment in the haplotype assembly.
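A generic two-component mixture likelihood for reads might look like the sketch below. The equal mixture weights, the single per-site error rate, and the function names are assumptions for illustration. The tool's actual parameterization and its MC score are not reproduced here.

```python
import math

def read_likelihood(read, hap, error_rate):
    """Probability of a read given one haplotype and independent site errors."""
    p = 1.0
    for pos, allele in read.items():
        p *= (1 - error_rate) if hap[pos] == allele else error_rate
    return p

def mixture_log_likelihood(reads, hap, error_rate=0.05):
    """Each read comes from the haplotype or its complement with weight 0.5."""
    comp = [1 - a for a in hap]
    return sum(
        math.log(0.5 * read_likelihood(r, hap, error_rate)
                 + 0.5 * read_likelihood(r, comp, error_rate))
        for r in reads
    )

# Hypothetical reads as dicts {site_index: allele}.
reads = [
    {0: 0, 1: 0, 2: 1},
    {1: 0, 2: 1, 3: 1},
    {0: 1, 1: 1},
    {2: 0, 3: 0},
]
good = mixture_log_likelihood(reads, [0, 0, 1, 1])
bad = mixture_log_likelihood(reads, [0, 1, 0, 1])
print(good > bad)
```

A haplotype pair that explains the reads with fewer implied errors receives a higher log-likelihood, which is the quantity an optimizer would try to maximize.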