Computational protocol: A CRISPR screen identifies genes controlling Etv2 threshold expression in murine hemangiogenic fate commitment

[…] Total RNA was extracted using the RNeasy Mini kit (Qiagen). Library preparation was performed with 10 ng of total RNA at the GTAC core facility (Washington University in St. Louis). Total RNA integrity was validated using an Agilent Bioanalyzer. All samples were prepared using the Ribo-Zero kit (Epicentre) as per the manufacturer’s protocol. cDNA was then blunt-ended, an A base was added to the 3′ ends, and Illumina sequencing adapters were ligated to the ends. Ligated fragments were amplified for 12 cycles using primers incorporating unique index tags. Fragments were sequenced on an Illumina HiSeq 2500 using single reads extending 50 bases. RNA-seq reads were aligned to the mouse mm9 assembly from the UCSC Genome Browser with TopHat2. Gene counts were derived from the number of uniquely aligned, unambiguous reads by Subread featureCounts version 1.4.5. All gene-level counts were then imported into the R/Bioconductor package edgeR and normalized to adjust for differences in library size. Genes with RPKM < 5 in all samples were excluded from further analysis. Genes showing less than a fivefold change between any two samples were also excluded. After filtering, 3815 genes remained and were used for further presentation. Published data used in RNA-seq analysis: GSE57409 (samples “ES” and “EpiSC”), GSE36114 (sample “endo_d6”, representing ES cells differentiated toward endoderm, day 6), GSE69080 (samples “mes”, “hb”, “hp”), GSE55310 (sample “he”), 7R2 (sample “cp”, representing ES cells differentiated to cardiac progenitors). The RNA-seq data have been deposited in the NCBI Gene Expression Omnibus (GEO) database under accession code GSE85641. […]
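The two gene filters described above (RPKM threshold, then fold-change threshold) can be illustrated with a minimal Python sketch. This is not the authors' code: the protocol used edgeR in R, whose normalization may differ from the plain library-size scaling shown here. The function names `rpkm` and `filter_genes`, the use of raw column sums as library sizes, the reading of "any two samples" as the ratio of the highest to the lowest sample, and the pseudocount added before taking that ratio are all assumptions made for illustration.

```python
import numpy as np


def rpkm(counts, gene_lengths_bp):
    """Reads per kilobase per million mapped reads.

    counts: genes x samples array of read counts.
    gene_lengths_bp: per-gene length in base pairs.
    Assumption: library size is the raw column sum of counts.
    """
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(gene_lengths_bp, dtype=float)[:, None] / 1e3
    lib_millions = counts.sum(axis=0, keepdims=True) / 1e6
    return counts / lengths_kb / lib_millions


def filter_genes(rpkm_matrix, min_rpkm=5.0, min_fold=5.0, pseudo=1.0):
    """Boolean mask of genes passing both filters.

    1. Drop genes with RPKM < min_rpkm in all samples.
    2. Drop genes whose largest between-sample fold change is < min_fold.
    Assumption: `pseudo` is added to avoid division by zero; the source
    does not state how zero-expression samples were handled.
    """
    r = np.asarray(rpkm_matrix, dtype=float)
    expressed = (r >= min_rpkm).any(axis=1)
    max_fold = (r.max(axis=1) + pseudo) / (r.min(axis=1) + pseudo)
    return expressed & (max_fold >= min_fold)
```

Applying `filter_genes(rpkm(counts, lengths))` to a count matrix yields a mask that can be used to subset the genes carried forward. The thresholds default to the values stated in the text (RPKM 5, fivefold).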

Pipeline specifications

Software tools: TopHat, Subread, edgeR, GNomEx
Databases: GEO, UCSC Genome Browser
Application: RNA-seq analysis
Organisms: Mus musculus