Similar protocols

Pipeline publication

[…] were obtained from ENCODE. Peak signal intensities were measured at these peaks, normalized by library size, and correlated between cell types using Pearson's correlation. For Fig. , differential DNase peaks were defined as those with FDR (Benjamini–Hochberg-corrected t-test p values) < 0.05. For this analysis, data were grouped by hotspot size (since discSNP count per hotspot correlates with hotspot length), and the percentage of peaks with differential DNase signal was determined relative to peaks of the same length containing no discSNPs.

These merged peak locations were later used as consensus locations for discSNP analysis.

Input ChIP-seq fastq files were obtained from ENCODE for each cell line analyzed. We used Trim Galore! (v0.4.1, parameters: --phred33 --quality 20) to remove adapters and low-quality reads (ends < 20 bp; read length < 20 bp) and then bwa (v0.7.5a-r405, default parameters) to align these reads to the mm9 or hg19 genome builds. Picard (v1.121, default parameters) was applied to assign read groups and discard unmapped reads. We used multiple functions from GATK (v3.3, parameters: --variant_index_type LINEAR --variant_index_parameter 128000 --emitRefConfidence GVCF -T HaplotypeCaller --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30) to identify variant nucleotide positions and the number of reference and variant reads found at these positions across all fastq files. We only considered variants with a total read depth > 5, and we used both read depth and variant read fraction to define a homozygous variant nucleotide (Supplementary Table ).

Following these assignments, DNAcopy (v1.44.0, default parameters) was used to identify blocks of the genome containing at least ten variant calls where at least 90% of the calls share a zy […]
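The differential-peak criterion described above (Benjamini–Hochberg-corrected p values < 0.05) can be illustrated with a minimal sketch. This is not the authors' code: the function names `benjamini_hochberg` and `differential_peaks` are hypothetical, the input is assumed to be a plain list of per-peak t-test p values computed elsewhere, and the example values in the usage note are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    Hypothetical helper for illustration; the source does not state
    which software performed the correction.
    """
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values
    # never increase as the rank decreases (monotonicity step).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def differential_peaks(p_values, fdr_threshold=0.05):
    """Flag peaks whose adjusted p value is below the FDR threshold."""
    return [q < fdr_threshold for q in benjamini_hochberg(p_values)]
```

For example, `differential_peaks([0.001, 0.5])` flags only the first peak, since the adjusted values are 0.002 and 0.5. The strict `<` comparison mirrors the "< 0.05" wording of the text.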

Pipeline specifications

Software tools: Trim Galore!, BWA, Picard, GATK