[…] ether, these data allowed us to identify primary let-7 transcripts, based on their expression in our chromatin-associated RNA-seq samples and in DGCR8-/- RNA-seq samples, even when they disagreed with RefSeq-annotated MIRLET7 genes. Hypothesized regulatory regions were assembled by searching 20 kilobases upstream and downstream of each transcript for colocalization of H3K27Ac, H3K4me3, and DNase sensitivity in samples known to express let-7 primary transcripts, and for H3K27me3 or H3K9me3 in samples without appreciable primary let-7 transcripts. These regions were frequently annotated as Active TSS, Flanking active TSS, Bivalent/Poised TSS, Enhancer, Genic Enhancer, and Bivalent Enhancer by the ChromHMM chromatin state prediction algorithm run on Roadmap datasets.

These hypothesized regulatory regions were then queried for transcription factor (TF) binding sites, based on known and predicted TF binding sites and motifs. We used both the ORCA Toolkit web server and the MEME suite of motif analysis applications to isolate highly conserved (>80% phastCons score) sub-regions within these hypothesized regulatory regions, and then queried those conserved sub-regions for TF motifs. These motifs were assembled from the JASPAR motif database as well as a small list of manually curated motif sequences. Where possible, experimentally validated ChIP-Seq TF binding sites from the ENCODE and Roadmap datasets were used as secondary validation of predicted TF binding sites. Accession numbers for these datasets are also found in Tables and .

We greatly appreciate the outstanding technical support given to this work by Jessica Cinkornpumin. This work was funded by the Nationa […]

Software tools: ChromHMM, ORCAtk, MEME Suite
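
The regulatory-region search described in the methods (a 20 kb window on either side of each primary transcript, scanned for colocalized H3K27Ac, H3K4me3, and DNase-sensitive peaks) can be illustrated with a minimal interval-intersection sketch. This is not the pipeline used in the study, which relied on the tools listed above. The function names (`search_window`, `intersect`, `colocalized`), the half-open coordinate convention, and the assumption that each peak track is a list of non-overlapping intervals on a single chromosome are all hypothetical choices made for illustration.

```python
from functools import reduce

FLANK = 20_000  # 20 kb upstream and downstream, as stated in the text


def search_window(tx_start, tx_end, flank=FLANK):
    """Return the half-open window extending `flank` bp on both sides
    of a transcript, clipped at the chromosome start (coordinate 0)."""
    return (max(0, tx_start - flank), tx_end + flank)


def intersect(a, b):
    """Intersect two sorted lists of non-overlapping, half-open
    (start, end) intervals and return the shared segments."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def colocalized(window, tracks):
    """Return the segments inside `window` that are covered by every
    peak track in `tracks` (for example H3K27Ac, H3K4me3, DNase)."""
    return reduce(intersect, (sorted(t) for t in tracks), [window])


if __name__ == "__main__":
    # Toy coordinates, invented purely for illustration.
    window = search_window(100_000, 102_000)
    h3k27ac = [(79_000, 85_000), (110_000, 115_000)]
    h3k4me3 = [(81_000, 84_000), (130_000, 131_000)]
    dnase = [(82_000, 83_000), (111_000, 112_000)]
    print(colocalized(window, [h3k27ac, h3k4me3, dnase]))
```

In practice the same operation is usually done on BED files with a dedicated interval tool. The sketch only shows the logic: a candidate region is kept when all active marks overlap within the flanked window.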