Computational protocol: In Search of Epigenetic Marks in Testes and Sperm Cells of Differentially Fed Boars

Protocol publication

[…] The microarray analysis was performed as described by Braunschweig et al. RNA was extracted from testes tissue samples using the Trizol reagent according to the manufacturer's protocol. RNA integrity was confirmed with a Bioanalyzer 2100 (Agilent, Waldbronn, Germany). The RNA was labeled and hybridized to the porcine gene expression microarray from Agilent Technologies according to the standard protocol used at the Functional Genomic Center Zürich. We used the Porcine (V2) Gene Expression Microarray, 4x44K (G2519F). Spot intensities obtained from hybridization of the samples to the probes were extracted from the TIFF images using Agilent Feature Extraction Software 9.5. From the generated TXT files, the "gMedianSignal" of the spots was used as the raw expression value and further analyzed using R/Bioconductor. A probe signal was declared present in a condition if it had a linear signal value above 25 and if the flag "gIsWellAboveBG" generated by the Feature Extraction software was true in at least 50% of the replicates of that condition. False discovery rates were computed using the Benjamini-Hochberg method. We used 4 testes tissue samples from F0 boars that received the diet enriched in methylation micronutrients and 4 samples from the control boar group.
All probes from the microarray experiment that had a P-value less than 0.05 for the difference between the signal averages of the two groups were manually annotated and analyzed using the GeneGO MetaCore pathway analysis software (db version 6.2, build 24095, http://www.genego.com/metacore.php). The software interconnected all candidate genes according to published literature-based annotations. Only direct connections between the identified genes were considered. In the MetaCore analysis, the statistical significance of networks is indicated by a P-value from Fisher's exact test. The false discovery rate (FDR) is used for multiple testing correction. [...]
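The presence call and the Benjamini-Hochberg adjustment described above can be illustrated with a minimal Python sketch. It is not the authors' R/Bioconductor code. The function names are hypothetical, and the sketch assumes that the "linear signal value" of a probe in a condition is the mean of its replicate signals, which the text does not state.

```python
from typing import Sequence

SIGNAL_THRESHOLD = 25.0   # linear gMedianSignal cutoff stated in the text
MIN_FLAG_FRACTION = 0.5   # gIsWellAboveBG must be true in >= 50% of replicates


def probe_present(signals: Sequence[float], well_above_bg: Sequence[bool]) -> bool:
    """Presence call for one probe in one condition.

    Assumption: the condition-level signal is the mean of the replicates.
    """
    if not signals or len(signals) != len(well_above_bg):
        raise ValueError("need one flag per replicate signal")
    mean_signal = sum(signals) / len(signals)
    flag_fraction = sum(well_above_bg) / len(well_above_bg)
    return mean_signal > SIGNAL_THRESHOLD and flag_fraction >= MIN_FLAG_FRACTION


def benjamini_hochberg(p_values: Sequence[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

With four replicates per group, as in this experiment, a probe passes the flag criterion when "gIsWellAboveBG" is true in at least two of the four replicates.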
The quality of the sequence reads was assessed using FastQC version 0.9 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). We observed a median Phred score for each base between 30 and 40 across all samples. Ribosomal RNA contamination was removed by mapping the 100 bp single-end reads to a collection of ribosomal RNA sequences using Bowtie2 with standard parameters. The fraction of rRNA reads was between 53% and 63%. The remaining reads were mapped to the Sus scrofa genome assembly Sscrofa10.2/susScr3 (http://hgdownload.cse.ucsc.edu/goldenPath/susScr3/bigZips/susScr3.fa.gz) using TopHat2 with default parameters. Because the duplication rate was very high, we removed the duplicates using Picard tools (http://picard.sourceforge.net), although this clearly underestimates the number of sequenced molecules. The high number of duplicates is most probably due to the limited amount of input RNA available for sequencing library preparation, which required more PCR cycles than usual.
We combined the mapping files (BAM files) of all samples to search for expressed regions. An expressed region is defined by a minimum length of 50 bp and a minimum average coverage of 5. These regions were combined with the preliminary annotation of the Sus scrofa genome and used to count reads with htseq-count (http://www-huber.embl.de/users/anders/HTSeq). Differential expression values were calculated using the DESeq Bioconductor library. We compared the RNA-seq results from two sperm RNA samples from each of the supplemented and control diet groups for differential gene expression. As with the microarray expression data, we selected all genes that were differentially expressed in sperm cells between the two pairs of samples from the feeding experiment.
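The expressed-region definition given above (minimum length of 50 bp, minimum average coverage of 5) can be sketched in Python on a per-base coverage vector. This is a hedged illustration, not the authors' implementation. The text does not say how candidate regions were delimited, so the sketch assumes a candidate is a maximal run of consecutive positions with non-zero coverage. The function name `expressed_regions` is hypothetical.

```python
from typing import Iterator, Sequence, Tuple

MIN_LENGTH = 50        # minimum region length in bp (from the text)
MIN_MEAN_COVERAGE = 5  # minimum average coverage (from the text)


def expressed_regions(coverage: Sequence[int]) -> Iterator[Tuple[int, int, float]]:
    """Yield (start, end, mean_coverage) for each expressed region.

    Coordinates are 0-based and end-exclusive. Assumption: a candidate
    region is a maximal run of consecutive positions with coverage > 0.
    """
    start = None
    total = 0
    # A trailing 0 sentinel closes a run that reaches the end of the sequence.
    for pos, depth in enumerate(list(coverage) + [0]):
        if depth > 0:
            if start is None:
                start, total = pos, 0
            total += depth
        elif start is not None:
            length = pos - start
            mean = total / length
            if length >= MIN_LENGTH and mean >= MIN_MEAN_COVERAGE:
                yield start, pos, mean
            start = None
```

In practice the coverage vector would come from the combined BAM files, one vector per chromosome. Regions that pass both thresholds would then be merged with the existing annotation before read counting.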
Transcripts that were differentially expressed between the groups at the nominal P < 0.05 level were annotated manually and used for the enrichment analysis as described above.
The detailed results of the expression analysis in testes and sperm cells discussed in this publication have been deposited in NCBI's Gene Expression Omnibus and are accessible through GEO Series accession number GSE48778 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?&acc=GSE48778). […]
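The enrichment analysis relies on a P-value from Fisher's exact test, as stated for MetaCore in the microarray section. How MetaCore builds its contingency tables is not described here, so the following Python sketch only illustrates the test itself. It assumes a one-sided test for over-representation, computed as the upper tail of the hypergeometric distribution. The function name and its parameters are hypothetical.

```python
from math import comb


def enrichment_p_value(hits: int, selected: int, annotated: int, background: int) -> float:
    """One-sided Fisher's exact test (hypergeometric upper tail).

    hits:       selected genes that carry the annotation
    selected:   number of selected (e.g. nominally significant) genes
    annotated:  genes in the background that carry the annotation
    background: total number of genes considered
    """
    if not (0 <= hits <= min(selected, annotated)
            and max(selected, annotated) <= background):
        raise ValueError("inconsistent counts")
    total = comb(background, selected)
    upper = min(selected, annotated)
    # Probability of observing `hits` or more annotated genes by chance.
    tail = sum(
        comb(annotated, k) * comb(background - annotated, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

The resulting P-values across all tested networks or pathways would then be corrected for multiple testing, for example with the Benjamini-Hochberg procedure.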

Pipeline specifications

Software tools: FastQC, Bowtie2, TopHat, Picard, HTSeq, DESeq
Databases: GEO
Application: RNA-seq analysis
Organisms: Sus scrofa
Diseases: Bacterial Infections, Cystic Fibrosis