Computational protocol: Recombination Is Responsible for the Increased Recovery of Drug Resistant Mutants with Hypermutated Genomes in Resting Yeast Diploids Expressing APOBEC Deaminases

[…] Whole-genome and RNA sequencing were performed on the Illumina platform, as described (). We sequenced the whole genomes of 66 yeast clones (see Supplementary Table , column “Reference”). For the analysis of SNV loads and recombination events, we combined these results with previously obtained sequencing data (). A summary of the sequenced and analyzed clones is given in Supplementary Table . Raw read processing, alignment to the reference genome, and SNV calling were performed generally as described (), with the following modifications. First, terminal Ns were removed from the reads before processing. Second, we used an updated and improved version of the genome mask, instead of the filtration associated with clustering, to improve the quality of SNV calling. Third, HaplotypeCaller from a newer version of GATK (3.8-0) was used to call the variants. Fourth, a more rigorous filter for strand-bias artifacts was applied (only SNVs with FS ≤ 20 were retained in the final vcf files), because we found that most false-positive SNVs showed a high strand bias. This modified pipeline was used to process all samples, including the clones sequenced and reported previously (). The final SNV results (vcf files) are in Supplementary Data Sheet.

We observed higher numbers of false-positive calls in clones with lower overall sequence coverage. Some clones were considered homozygous based on manual examination of the alignments, which confirmed that the few heterozygous SNVs called in these clones were false positives. The final *.vcf files for such clones, however, have not been edited, for the sake of uniformity of the results.

RNA sequencing data were analyzed by the standard pipeline using TopHat, followed by Cufflinks, with our own reference genome and annotations ().
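Two of the SNV-calling modifications described above, removal of terminal Ns and the FS ≤ 20 strand-bias filter, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`trim_terminal_ns`, `fisher_strand`, `filter_vcf_lines`) are hypothetical, the sketch assumes that FS is present as an `FS=` key in the VCF INFO column (as GATK HaplotypeCaller normally annotates it), and it assumes that records lacking an FS value are dropped, which the text does not specify. It also does not distinguish SNVs from other variant types.

```python
def trim_terminal_ns(seq, qual):
    """Strip runs of N from both ends of a read; trim qualities to match."""
    left = len(seq) - len(seq.lstrip("Nn"))   # number of leading Ns
    right = len(seq.rstrip("Nn"))             # index just past the last non-N base
    if right <= left:                         # read is empty or consists only of Ns
        return "", ""
    return seq[left:right], qual[left:right]


def fisher_strand(info):
    """Return the FS value from a VCF INFO string, or None if absent/unparseable."""
    for field in info.split(";"):
        key, _, value = field.partition("=")
        if key == "FS":
            try:
                return float(value)
            except ValueError:
                return None
    return None


def filter_vcf_lines(lines, max_fs=20.0):
    """Yield header lines unchanged and variant records with FS <= max_fs.

    Assumption: records without an FS annotation are discarded.
    """
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        fs = fisher_strand(cols[7]) if len(cols) > 7 else None  # column 8 is INFO
        if fs is not None and fs <= max_fs:
            yield line
```

In use, `filter_vcf_lines` would be given an open VCF file handle (or any iterable of lines) and its output written to a new file; the threshold of 20 is the only value taken from the protocol.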
To analyze PmCDA1 expression, the sequence of the pESC-LEU-PmCDA1 expression plasmid was added to the reference genome file, and the annotations were updated accordingly.

NGS alignments and SNV calls were visualized, for both pipeline optimization and figure preparation, using Geneious releases R6–R10 (Biomatters) and Integrative Genomics Viewer (IGV) 2.3 (Broad Institute). Statistical analysis and graph creation were performed in GraphPad Prism v. 5 and R. […]
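The step of adding the expression plasmid to the reference can be sketched as follows. This is an illustrative assumption about how such an augmented reference might be built, not the authors' procedure: the helper names (`fasta_names`, `add_plasmid`, `gtf_exon_line`) are hypothetical, the annotation is assumed to be in GTF format with one `exon` feature carrying `gene_id` and `transcript_id` attributes, and the ORF coordinates must be supplied by the user because the protocol does not give them.

```python
def fasta_names(text):
    """Return the sequence names (first word of each '>' header) in FASTA text."""
    names = []
    for line in text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            names.append(parts[0] if parts else "")
    return names


def add_plasmid(reference_fasta, plasmid_fasta):
    """Return the reference FASTA text with the plasmid record(s) appended.

    Raises ValueError if a plasmid sequence name already exists in the reference.
    """
    existing = set(fasta_names(reference_fasta))
    for name in fasta_names(plasmid_fasta):
        if name in existing:
            raise ValueError("duplicate sequence name: " + name)
    if reference_fasta and not reference_fasta.endswith("\n"):
        reference_fasta += "\n"
    if not plasmid_fasta.endswith("\n"):
        plasmid_fasta += "\n"
    return reference_fasta + plasmid_fasta


def gtf_exon_line(seqname, start, end, strand, gene_id):
    """Build one tab-separated GTF exon record (1-based, inclusive coordinates)."""
    attrs = 'gene_id "{0}"; transcript_id "{0}";'.format(gene_id)
    fields = [seqname, "custom", "exon", str(start), str(end), ".", strand, ".", attrs]
    return "\t".join(fields) + "\n"
```

The sequence name in the appended FASTA header and the `seqname` column of the GTF record must match exactly; otherwise reads aligned to the plasmid would not be assigned to the added gene.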

Pipeline specifications

Software tools: GATK, Cufflinks, Geneious, IGV
Application: RNA-seq analysis
Organisms: Saccharomyces cerevisiae, Petromyzon marinus