Computational protocol: Integrated analysis of microRNA and mRNA expression and association with HIF binding reveals the complexity of microRNA expression regulation under hypoxia

Protocol publication

[…] Small RNA sequencing data were processed from raw FASTQ files and analysed using the Kraken pipeline (http://www.ebi.ac.uk/research/enright/software/kraken). The 3’ adaptors were stripped using the following criteria: a 12-nt alignment stretch with no more than 2 mismatches and no gaps. The reads were then filtered for low-complexity regions, and the remaining reads were size-selected for 18-26 nt, which retained more than 75% of reads on average for each sample (Additional file: Table S1). A file with unique reads and their corresponding counts was generated for each sample. The final processed unique reads were mapped to the human genome (hg19, based on Ensembl v64) using Bowtie 0.12.7, allowing for two mismatches and using the best alignment stratum option. Reads mapped to more than 20 loci were discarded. Mapped reads were classified and counted based on genomic annotations (RNAs, coding genes, pseudogenes and repeats) obtained from Ensembl v64 (Additional file: Table S1). For reads mapping to multiple loci, counts were divided by the total number of reads. The sense and antisense distribution of mapped reads across all chromosomes was also analysed.

For differential expression of microRNAs, the overlap between mapped reads and human mature microRNAs (based on miRBase v17) was found using the function findOverlaps available in the Bioconductor GenomicRanges package (http://bioconductor.org/packages/2.6/bioc/html/GenomicRanges.html). Normalization and differential expression analysis were performed with the Bioconductor edgeR package v2.3.57 using both common and tagwise dispersion. Significantly differentially expressed microRNAs were identified by an adjusted p value lower than 0.05 based on the Benjamini and Hochberg multiple testing correction. Data have been submitted to GEO (reference series: GSE47534; miRNA-seq: GSE47602).

[...] Average signal was background subtracted with local background subtraction (BeadStudio) and consolidated per gene.
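As a rough illustration of the Benjamini and Hochberg correction applied to the microRNA differential expression results above, the following Python sketch computes adjusted p values from a list of raw p values. It is not the edgeR implementation used in the protocol. The function name `benjamini_hochberg` and the example p values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    Illustrative sketch only; the protocol itself relied on edgeR and limma.
    """
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values for four microRNAs.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
# Threshold from the protocol: adjusted p value lower than 0.05.
significant = [p < 0.05 for p in adj]
```

The adjusted values stay in the same order as the input, so they can be matched back to the microRNA identifiers directly.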
Data were quantile normalized in Bioconductor (http://www.bioconductor.org) and filtered based on gene detection level using the pvalDet parameter. Only genes with pvalDet lower than 0.05 in at least 2 replicates of any of the experimental conditions were considered for further analysis. Limma analysis was then performed to assess independently the effects of 16 h, 32 h and 48 h of hypoxia exposure on gene expression levels in MCF-7 cells, each compared with the normoxic control. The Benjamini and Hochberg method was used to correct for multiple testing. Genes with adjusted p values lower than 0.05 were considered significantly regulated. Data have been submitted to GEO (reference series: GSE47534; mRNA: GSE47533).

Datasets from published studies (presented in Additional file: Table S10) were analysed as described above.

[...] MicroRNA expression was assessed by qPCR with the TaqMan microRNA assay protocol (Applied Biosystems, Life Technologies, Carlsbad, California, USA) using 5 ng of total RNA per microRNA as indicated by the manufacturer. QPCR was done in a CFX96 real-time PCR detection system (BioRad, Hercules, California, USA). Cycling conditions included a pre-incubation step (95°C for 3 minutes) and 40 cycles of amplification (95°C for 30 seconds, 60°C for 1 minute). For gene expression, total RNA (2 μg) was reverse transcribed using the SuperScript II Reverse Transcription kit, 50 ng of random primers (both from Invitrogen, Life Technologies, Carlsbad, California, USA) and 0.5 mM dNTP mixture (Bioline, London, UK). QPCR was done in a CFX96 real-time PCR detection system using 30 ng of cDNA, 1X iQ SYBR-green supermix (BioRad, Hercules, California, USA) and primers at a final concentration of 2 mM.
Cycling conditions included a pre-incubation step (95°C for 3 minutes), 40 cycles of amplification (95°C for 30 seconds, annealing temperature for 30 seconds and 72°C for 30 seconds) and a melting curve (95°C for 1 minute, 55°C for 1 minute, denaturation from 55°C to 95°C in 0.5°C/10 seconds increments). Each reaction was done in triplicate. All primers used are listed in Additional file: Table S8.

Expression values were normalised to the geometric mean of the housekeeping genes RPL11, RPL30 and RPS6, and fold-changes between treatments and controls were determined by the 2^(-ΔΔCt) method as implemented in the SLqPCR Bioconductor package (http://www.bioconductor.org/packages/2.12/bioc/html/SLqPCR.html). The variance between sample groups was assessed with the Bartlett test for homogeneity of variances. If variances could be assumed to be equal between groups, significant differences were established using ANOVA followed by pairwise t-tests. If variances could not be assumed to be equal, significant differences were assessed using a one-way test for equal means followed by pairwise t-tests. After multiple testing correction by the Benjamini and Hochberg method, results were considered significant at two levels: adjusted p value lower than or equal to 0.05, and adjusted p value lower than or equal to 0.08. All analyses were done using R v9 (http://www.r-project.org). […]
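The fold-change calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the SLqPCR code. It assumes 100% amplification efficiency (base 2) and assumes that Ct values have already been averaged over the triplicate reactions. Averaging the housekeeping Ct values arithmetically is equivalent to taking the geometric mean of the corresponding 2^(-Ct) quantities. The function names, the gene label `TARGET` and all Ct values are hypothetical.

```python
from statistics import mean

# Housekeeping genes named in the protocol.
HOUSEKEEPING = ("RPL11", "RPL30", "RPS6")


def delta_ct(ct, target, refs=HOUSEKEEPING):
    """Ct of the target minus the mean Ct of the reference genes.

    The arithmetic mean on the Ct (log2) scale corresponds to the
    geometric mean of the reference quantities on the linear scale.
    """
    return ct[target] - mean(ct[g] for g in refs)


def fold_change(treated, control, target, refs=HOUSEKEEPING):
    """Fold-change of treated versus control by the 2^(-ddCt) method."""
    ddct = delta_ct(treated, target, refs) - delta_ct(control, target, refs)
    return 2.0 ** (-ddct)


# Hypothetical mean Ct values per sample (not data from the study).
control = {"TARGET": 30.0, "RPL11": 18.0, "RPL30": 19.0, "RPS6": 20.0}
treated = {"TARGET": 26.0, "RPL11": 18.0, "RPL30": 19.0, "RPS6": 20.0}

fc = fold_change(treated, control, "TARGET")
```

With these made-up values the target Ct drops by 4 cycles relative to the references, which corresponds to a 16-fold increase.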

Pipeline specifications

Software tools: limma, ddCt, SLqPCR
Applications: Gene expression microarray analysis, qPCR
Organisms: Homo sapiens
Diseases: Breast Neoplasms
Chemicals: Oxygen