Computational protocol: Identification of novel vascular targets in lung cancer

Protocol publication

[…] Seventy-nine million and 84 million paired-end reads (50 bp + 35 bp) were sequenced on the SOLiD4 second-generation sequencer (Applied Biosystems, Foster City, CA, USA) for endothelium from fresh tumour and healthy tissue, respectively. Reads were mapped to the human genome (University of California, Santa Cruz, version hg19) with TopHat 1.3.3. Default parameters for colour-space mapping were used, with the following exceptions: (1) -g/--max-multihits was set to 1 to identify the single best mapped read; (2) --library-type was set to fr-secondstrand to reflect the sequencing library preparation; (3) -G provided TopHat with a model set of gene annotation genome positions from the RefSeq hg19 transcriptome. The TopHat output BAM files were sorted using SAMtools (version 0.1.8), and 'htseq-count' version 0.4.7p4 was used, in conjunction with the human transcriptome GTF file (RefSeq version 19), to assign gene counts and produce a tab-delimited file of transcript/gene counts. Differential gene expression analysis and P-value generation on the count data were carried out using the R Bioconductor package DESeq v1.5. […]
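The mapping and counting steps described above can be sketched as command lines. The Python snippet below only assembles the argument lists and does not run anything. It is a minimal sketch under stated assumptions, not the authors' actual invocation: all file names and the index name are hypothetical, the `--color`/`--quals` flags and the read/quality file layout are assumed from the SOLiD colour-space input, name sorting (`-n`) and piping SAM text into `htseq-count` via `samtools view` are assumptions, and any settings the text does not report are omitted.

```python
import shlex


def build_pipeline_commands(index, reads1, reads2, quals1, quals2, gtf, out_dir):
    """Return argv lists for each pipeline step; nothing is executed here."""
    tophat = [
        "tophat",
        "--color", "--quals",                  # assumed: colour-space reads with quality files
        "-g", "1",                             # --max-multihits 1 (stated in the text)
        "--library-type", "fr-secondstrand",   # stated in the text
        "-G", gtf,                             # RefSeq hg19 gene models (stated in the text)
        "-o", out_dir,
        index, reads1, reads2, quals1, quals2,
    ]
    bam = f"{out_dir}/accepted_hits.bam"       # TopHat's alignment output
    sorted_prefix = f"{out_dir}/accepted_hits.sorted"
    # samtools 0.1.x takes an output prefix; -n (name sort) is an assumption
    samtools_sort = ["samtools", "sort", "-n", bam, sorted_prefix]
    # assumed: SAM text is streamed to htseq-count on stdin ("-")
    samtools_view = ["samtools", "view", f"{sorted_prefix}.bam"]
    htseq_count = ["htseq-count", "-", gtf]
    return {
        "tophat": tophat,
        "samtools_sort": samtools_sort,
        "samtools_view": samtools_view,
        "htseq_count": htseq_count,
    }


if __name__ == "__main__":
    # Hypothetical file names, for illustration only
    cmds = build_pipeline_commands(
        index="hg19_color",
        reads1="tumour_F3.csfasta", reads2="tumour_F5.csfasta",
        quals1="tumour_F3.qual", quals2="tumour_F5.qual",
        gtf="refseq_hg19.gtf", out_dir="tumour_tophat",
    )
    for name, argv in cmds.items():
        print(f"{name}: {shlex.join(argv)}")
```

The resulting count files (one per sample) would then be loaded into R for the DESeq step, which is outside the scope of this sketch.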

Pipeline specifications

Software tools: TopHat, SAMtools, HTSeq, DESeq
Application: RNA-seq analysis
Organisms: Homo sapiens
Diseases: Lung Neoplasms, Neoplasms