Computational protocol: Impact of neonatal iron deficiency on hippocampal DNA methylation and gene transcription in a porcine biomedical model of cognitive development

Protocol publication

[…] Illumina sequencing resulted in an average of 52.3 million raw reads per sample, ranging from 37.6 to 61.1 million (Additional file : Table S1). Adapters were trimmed from reads using Trim Galore v.0.3.3 [], which also removes experimentally introduced cytosines and filters reads based on minimum quality score (20) and length (20 bp). Trimmed reads were aligned to an in silico converted reduced representation version (20 to 180 bp fragments) of the swine reference genome [] produced using BS-seeker2 v.2.0.5 []. Alignments were performed using BS-seeker2 v.2.0.5, which utilizes Bowtie2 v.2.1.0 [], by adjusting the alignment mode (local), seed length (20), maximum number of mismatches allowed in the seed alignment (1), and maximum number of mismatches/read (2). The final alignments had an average coverage of 23.84 with 1.31 % of the genome covered (Additional file : Table S2). The ratio of unmethylated to total reads for covered cytosines on the mitochondrial genome was calculated to determine the bisulfite conversion rate (99.2 %). The bs_seeker2-call_methylation.py script was used to identify the ratio of methylated/total uniquely aligned reads at each site (methylation levels). Reads aligning to each strand were combined, and the minimum coverage for a site to be utilized for analysis was 10x in all samples.
Differentially methylated (DM) sites were calculated using methylKit v.0.9.4 [] after removing sites with high read coverage (upper 99.9th percentile) in order to eliminate bias caused by PCR duplication. Sites were considered DM with a minimum methylation difference of 25 % and a q-value < 0.01, which are program defaults intended to help ensure identified DM sites have potential biological significance.
[...] Targeted control datasets were used to assess CpG and non-CpG sites for potential SNPs as previously described [].
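The per-site quantities in the RRBS steps above (methylation level, bisulfite conversion rate, and the coverage filters) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code used in the study: the `Site` class, the function names, and the nearest-rank percentile are assumptions, and the sketch handles one sample at a time, whereas the protocol requires 10x coverage in all samples and relies on methylKit for the high-coverage filter.

```python
import math
from dataclasses import dataclass

MIN_COVERAGE = 10         # minimum reads per site, as stated in the protocol
HIGH_COVERAGE_PCT = 99.9  # sites above this coverage percentile are dropped


@dataclass
class Site:
    """Hypothetical container for one cytosine in one sample."""
    chrom: str
    pos: int
    methylated: int  # reads supporting a methylated cytosine
    total: int       # all uniquely aligned reads covering the site

    @property
    def level(self) -> float:
        """Methylation level = methylated reads / total reads."""
        return self.methylated / self.total


def conversion_rate(unmethylated: int, total: int) -> float:
    """Unmethylated / total reads over mitochondrial cytosines."""
    return unmethylated / total


def percentile(values, pct):
    """Nearest-rank percentile (an assumption; methylKit may interpolate)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def filter_sites(sites):
    """Keep sites with >= MIN_COVERAGE reads and coverage at or below the cutoff."""
    covered = [s for s in sites if s.total >= MIN_COVERAGE]
    if not covered:
        return []
    cutoff = percentile([s.total for s in covered], HIGH_COVERAGE_PCT)
    return [s for s in covered if s.total <= cutoff]
```

With this sketch, a site covered by 5 reads is dropped by the minimum-coverage rule, and a single site with far more reads than the rest (a likely PCR-duplication artifact) is dropped by the percentile rule.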
Illumina sequencing resulted in an average of 15.5 million raw reads per sample, ranging from 12.6 to 18.5 million (Additional file : Table S1). Following trimming as described above, Bowtie2 v.2.2.3 was used to perform alignments to the swine reference genome by adjusting the alignment mode (--end-to-end), the -N option (1), and the -L option (20). GATK v.2.3-9-ge5ebf34 [] was used to realign the uniquely aligned reads. The final alignments had an average coverage of 7.17 with 1.28 % of the genome covered (Additional file : Table S2). GATK v.3.3-0 was used to perform variant calling by adjusting the -stand_call_conf option (50), the -stand_emit_conf option (20), and the -dcov option (200). Read depths used to call SNPs ranged from 1/3 to 2 times the average coverage (minimum depth of 4 reads), and the mapping quality (minimum of 20) was taken into account. These sites were removed from the RRBS dataset before analysis.
[...] Illumina sequencing resulted in an average of 35.3 million raw paired-end reads per sample, ranging from 29.7 to 42.8 million (Additional file : Table S1). Sequential trimming for adapter contamination and A-tails, as well as for minimum quality score (20) and length (20 bp), was performed as described above. While a minimum length of 20 bp was used for paired reads, a minimum length of 35 bp was used for unpaired reads. TopHat v.2.2.10 [] was utilized to perform alignment to the swine reference genome as previously described []. Reads with more alignments than the maximum allowable number (20) were filtered out using the -M option. The remaining reads were aligned to the Ensembl swine reference transcriptome (-G) followed by alignment to the genome by adjusting the --read-realign-edit-dist option (0), the --mate-inner-dist option (120), the --mate-std-dev option (260), and by indicating the method used to perform the stranded library preparation (fr-firststrand).
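The read-depth window described above for SNP calling in the targeted control datasets can be worked through as a small sketch. The function names are hypothetical, and the sketch assumes that the "minimum depth of 4 reads" acts as a floor on the lower bound of 1/3 of the average coverage; the protocol text does not spell out how the two limits are combined.

```python
def depth_window(avg_coverage, low_factor=1 / 3, high_factor=2.0, floor=4):
    """Return the (lower, upper) read-depth bounds for accepting a SNP call.

    Assumption: the lower bound is 1/3 of the average coverage, but never
    below the stated minimum depth of 4 reads.
    """
    lower = max(floor, avg_coverage * low_factor)
    upper = avg_coverage * high_factor
    return lower, upper


def keep_snp(depth, mapq, avg_coverage, min_mapq=20):
    """Accept a SNP call only inside the depth window and at MAPQ >= 20."""
    lower, upper = depth_window(avg_coverage)
    return lower <= depth <= upper and mapq >= min_mapq


# Using the reported average coverage of 7.17 as an example input:
# 1/3 of 7.17 is below 4, so the floor of 4 reads applies; the upper bound is 2 x 7.17.
lower, upper = depth_window(7.17)
```

Under these assumptions, a call at depth 10 with mapping quality 30 passes, while calls at depth 3, at depth 15, or with mapping quality 10 are rejected.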
Cufflinks v.2.2.1 [] was used to perform differential gene expression analysis as previously described []. Transcripts were assembled with Cufflinks using the fr-firststrand option, after which transcripts from all samples were merged with the reference transcripts using Cuffmerge. Gene expression was pre-computed using Cuffquant by including the -u and fr-firststrand options. Finally, differential expression analysis was performed using Cuffdiff by including the -u and fr-firststrand options, and genes with q < 0.05 were considered differentially expressed genes (DEGs). […]
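As an illustration of the final thresholding step, the sketch below selects genes with q < 0.05 from a Cuffdiff gene-level results table (tab-separated, with a `q_value` column). The function name and the two example rows are hypothetical and are not data from the study.

```python
import csv
import io

Q_CUTOFF = 0.05  # DEG threshold stated in the protocol


def differentially_expressed(handle, q_cutoff=Q_CUTOFF):
    """Return gene names whose q_value is below the cutoff."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row["gene"] for row in reader if float(row["q_value"]) < q_cutoff]


# Hypothetical two-gene table in the Cuffdiff gene_exp.diff column layout.
HEADER = [
    "test_id", "gene_id", "gene", "locus", "sample_1", "sample_2", "status",
    "value_1", "value_2", "log2(fold_change)", "test_stat", "p_value",
    "q_value", "significant",
]
ROWS = [
    ["XLOC_1", "XLOC_1", "GENE_A", "1:100-900", "control", "deficient", "OK",
     "10.0", "40.0", "2.0", "4.1", "0.0001", "0.003", "yes"],
    ["XLOC_2", "XLOC_2", "GENE_B", "2:500-1500", "control", "deficient", "OK",
     "12.0", "13.0", "0.115", "0.4", "0.60", "0.85", "no"],
]
EXAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS) + "\n"

degs = differentially_expressed(io.StringIO(EXAMPLE))
```

In practice the handle would be an open file rather than an in-memory string; the filter itself is unchanged.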

Pipeline specifications

Software tools: Trim Galore!, BS Seeker, Bowtie2, methylKit, GATK
Application: BS-seq analysis
Diseases: Hypertension; Disorders of Sex Development; Corneal Neovascularization; Anemia, Iron-Deficiency
Chemicals: Iron