Computational protocol: Loss of BRCA1 or BRCA2 markedly increases the rate of base substitution mutagenesis and has distinct effects on genomic deletions

Protocol publication

[…] Library preparation used the TruSeq DNA Nano Library Preparation Kit (Illumina, San Diego, CA, USA) or the NEBNext Ultra DNA Library Prep Kit for Illumina (New England Biolabs, Ipswich, MA, USA). Sequencing was done on Illumina HiSeq 2500 (2 × 150 bp paired-end (PE) format, three samples), Illumina HiSeq 2500 v4 (2 × 125 bp PE, 21 samples) and Illumina HiSeq X Ten instruments (2 × 150 bp PE, six samples). Library preparation and DNA sequencing were done at the Research Technology Support Facility of Michigan State University, USA, and at Novogene, Beijing, China. We chose to sequence three samples per treatment to be able to detect sample variance. All data sets from successfully sequenced samples were used for subsequent analysis.
The reads were aligned to the chicken (Gallus gallus) reference sequence Galgal4.73 as described. Duplicate reads were removed using samblaster. The aligned reads were realigned with GATK IndelRealigner.
Independently arising SNVs and short indels were identified using the IsoMut method developed for multiple isogenic samples. In brief, after applying a base quality filter of 30, data from all samples were compared at each genomic position and filtered using optimized parameters: minimum mutated allele frequency (0.2), minimum coverage of the mutated sample (5) and minimum reference allele frequency of all the other samples (0.93). Candidates were also filtered using a probability-based quality score calculated from the mutated sample and the other sample with the lowest reference allele frequency. The IsoMut code is available for unrestricted download. Structural variations were detected using the CREST algorithm.
Ninety-six-triplet signatures were generated after pooling samples of the same genotype and treatment. DT40 triplet signatures were adjusted with the ratio of each triplet occurrence in the human and chicken genome and compared with the 30 human cancer triplet signatures using the Pearson correlation coefficient.
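The three threshold filters quoted above for IsoMut (mutated allele frequency, coverage of the mutated sample, reference allele frequency of the other samples) can be illustrated with a short sketch. This is not the IsoMut code: the `SampleCounts` class, the function name and the handling of zero-coverage samples are assumptions made for illustration, coverage is simplified to reference plus alternate reads, and the probability-based quality score is left out. Only the three threshold values come from the text.

```python
from dataclasses import dataclass
from typing import Sequence

# Threshold values quoted in the protocol text.
MIN_MUT_FREQ = 0.2          # minimum mutated allele frequency in the mutated sample
MIN_COVERAGE = 5            # minimum coverage of the mutated sample
MIN_OTHER_REF_FREQ = 0.93   # minimum reference allele frequency in every other sample


@dataclass
class SampleCounts:
    """Hypothetical per-position read counts for one sample.

    Counts are assumed to include only bases that already passed the
    base quality filter of 30.
    """
    ref_reads: int
    alt_reads: int

    @property
    def coverage(self) -> int:
        # Simplification: other alleles at the position are ignored.
        return self.ref_reads + self.alt_reads

    @property
    def alt_freq(self) -> float:
        return self.alt_reads / self.coverage if self.coverage else 0.0

    @property
    def ref_freq(self) -> float:
        # Assumption: a sample with no coverage counts as failing the
        # reference frequency check (conservative choice).
        return self.ref_reads / self.coverage if self.coverage else 0.0


def passes_threshold_filters(samples: Sequence[SampleCounts], mutated_index: int) -> bool:
    """Return True if the candidate in samples[mutated_index] passes all three thresholds."""
    mutated = samples[mutated_index]
    if mutated.coverage < MIN_COVERAGE:
        return False
    if mutated.alt_freq < MIN_MUT_FREQ:
        return False
    # Every other sample must look like the reference at this position.
    others = [s for i, s in enumerate(samples) if i != mutated_index]
    return all(s.ref_freq >= MIN_OTHER_REF_FREQ for s in others)
```

For example, a position where the first sample has 10 reference and 10 alternate reads, and the two other samples have 30/0 and 28/1 reference/alternate reads, would pass under these assumptions; raising the alternate count in either of the other samples so that its reference frequency drops below 0.93 would reject it.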
Two-sided t-tests were used for statistical comparisons of mutation numbers, with no adjustments for multiple comparisons; Fisher's exact test was used to compare categorized mutations; and the non-parametric Kolmogorov–Smirnov test was used to compare the size distribution of deletions.
Raw sequence data have been deposited with the European Nucleotide Archive under study accession number ERP015181. […]
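To make the Kolmogorov–Smirnov comparison of deletion sizes mentioned above more concrete, the following is a minimal sketch of the two-sample KS statistic, that is, the largest gap between the two empirical cumulative distributions. The function name and the example deletion sizes are hypothetical, and the sketch computes only the statistic, not a p-value. The text does not say which software the authors used; in practice a statistics library such as SciPy (`scipy.stats.ks_2samp`) would typically be used to obtain both.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the empirical
    cumulative distribution functions of the two samples.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so it is
    # enough to evaluate the gap at each distinct observation.
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical deletion sizes (bp) for two genotypes, for illustration only.
deletions_genotype_1 = [1, 1, 2, 3, 5, 8]
deletions_genotype_2 = [4, 6, 9, 12, 20, 35]
statistic = ks_statistic(deletions_genotype_1, deletions_genotype_2)
```

A statistic near 0 indicates nearly identical size distributions, while a value near 1 indicates almost no overlap.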

Pipeline specifications

Software tools: SAMBLASTER, GATK
Databases: ENA
Application: WGS analysis
Organisms: Gallus gallus
Diseases: Neoplasms
Chemicals: Methyl Methanesulfonate