Computational protocol: A molecular atlas of the developing ectoderm defines neural, neural crest, placode, and nonneural progenitor identity in vertebrates

Protocol publication

[…] Samples were selected for RNA library preparation if they met the following 3 requirements: (1) a ratio of the ribosomal 18S over 28S greater than 1.7-fold on Bioanalyzer total RNA traces, (2) a confirmed tissue identity, and (3) a total RNA amount greater than 40 ng. RNA-seq libraries were prepared using the TruSeq Stranded mRNA library preparation kit (Illumina), starting with 40–300 ng of total RNA, and sequenced on a HiSeq 2000 (Illumina) with a target of 15–20 million 100 bp paired-end reads per sample. Reads were mapped to the X. laevis genome (9.1) using TopHat 2.0.14 with the following parameters (--rg-library Illumina --rg-platform Illumina --keep-fasta-order -N 6 --read-gap-length 6 --read-edit-dist 6 --segment-mismatches 3 -i 5 -I 500000 -r 155 --mate-std-dev 80 --no-coverage-search -p 6 --library-type fr-firststrand -g 2). 92% of the reads mapped to the X. laevis genome. Paired reads mapping multiple times to the genome (approximately 4% of them) were excluded from the downstream analysis, leaving 88% of the reads for analysis. Putative transcript models were predicted using Cufflinks 2.2.1 with the following parameters (--library-type fr-firststrand --3-overhang-tolerance 0 --intron-overhang-tolerance 0), then merged with the annotation of the X. laevis genome (annotation version 1.8 of the 9.1 version of the genome, Xenbase ftp://ftp.xenbase.org/pub/Genomics/JGI/Xenla9.1/) using cuffmerge v2.2.1. Transcripts were annotated with a custom script that BLASTed transcript nucleotide sequences against the X. tropicalis transcript nucleotide sequence database (JGI 4.2), retaining for each transcript only the first hit with an E-value lower than 1E-100. Gene models and annotations were computed by merging overlapping transcript models and their annotations using BEDTools and a custom Perl script. Read counts were computed for each gene model using HTSeq. Only genes with at least 10 counts in at least 2 samples were considered further.
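The expression filter stated at the end of this paragraph (at least 10 counts in at least 2 samples) can be illustrated with a minimal Python sketch. This is not the authors' script: the function name, the dictionary layout, and the toy gene names and counts are all hypothetical, and the thresholds are simply the ones quoted in the text.

```python
def filter_genes(counts, min_count=10, min_samples=2):
    """Keep genes with >= min_count reads in >= min_samples samples.

    counts: dict mapping a gene identifier to a list of per-sample read counts
            (hypothetical layout; HTSeq output would need to be parsed into it).
    """
    kept = {}
    for gene, per_sample in counts.items():
        # Number of samples in which this gene reaches the count threshold
        n_passing = sum(1 for c in per_sample if c >= min_count)
        if n_passing >= min_samples:
            kept[gene] = per_sample
    return kept


# Toy data with made-up gene names and counts (three samples)
toy_counts = {
    "geneA": [12, 15, 0],   # passes: two samples with >= 10 counts
    "geneB": [50, 3, 2],    # fails: only one sample with >= 10 counts
    "geneC": [0, 0, 0],     # fails: never detected
    "geneD": [10, 10, 9],   # passes: the threshold is inclusive
}
kept_genes = filter_genes(toy_counts)
```

With the toy data, only `geneA` and `geneD` are retained, since the filter counts samples rather than summing reads across samples.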
Read counts were then normalized for library size using trimmed mean of M-values (TMM) normalization [105], as implemented in edgeR, then transformed to log2-CPM (counts per million) and processed with the limma/voom package to estimate the mean-variance relationship of the log-counts for subsequent statistical analysis. The data discussed in this publication have been deposited in NCBI's Gene Expression Omnibus and are accessible through GEO Series accession number GSE103240 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE103240). […]
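As a rough illustration of the log2-CPM transformation mentioned above, the sketch below applies the formula documented for limma's voom, log2((count + 0.5) / (effective library size + 1) × 10^6), in plain Python. Several things here are assumptions: the TMM normalization factors are taken as an input rather than computed (in practice edgeR's calcNormFactors would supply them), library sizes are taken as column sums of the count matrix, and the function name and toy matrix are hypothetical. The mean-variance modelling that voom performs afterwards is not reproduced.

```python
import math


def voom_style_log2_cpm(counts, norm_factors=None):
    """Return log2-CPM values for a genes x samples count matrix.

    counts: list of rows, one row per gene, one column per sample.
    norm_factors: per-sample scaling factors (e.g. TMM); defaults to 1.0,
                  i.e. no normalization beyond raw library size.
    """
    n_samples = len(counts[0])
    if norm_factors is None:
        norm_factors = [1.0] * n_samples
    # Raw library size = total counts per sample (column sums)
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    # Effective library size = raw library size scaled by the normalization factor
    eff_sizes = [lib_sizes[j] * norm_factors[j] for j in range(n_samples)]
    return [
        [math.log2((row[j] + 0.5) / (eff_sizes[j] + 1.0) * 1e6)
         for j in range(n_samples)]
        for row in counts
    ]


# Toy matrix (made-up numbers): 2 genes x 2 samples, library size 999999 each
toy_matrix = [
    [999999, 0],
    [0, 999999],
]
toy_log_cpm = voom_style_log2_cpm(toy_matrix)
```

The offsets of 0.5 and 1 keep the logarithm finite for zero counts; in the toy matrix a zero count in a library of 999999 reads maps to log2(0.5) = -1.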

Pipeline specifications

Software tools: TopHat, Cufflinks, BEDTools, HTSeq, edgeR, limma
Databases: Xenbase
Application: RNA-seq analysis
Diseases: Neoplasms, Maxillofacial Abnormalities