Similar protocols

Pipeline publication

[…] not penalize harshly when Lg deviates two to three times from .

We first performed isoform-level clustering using the RS_IsoSeq protocol within the SMRTPortal software. This protocol performs isoform-level clustering (ICE) and polishes the results with Quiver. The output from the ICE algorithm contains consensus sequences from full-length reads. The Quiver-polished output is classified as either ‘low QV’ or ‘high QV’. Our analysis focused on the high-QV consensus isoform clusters, where ‘Quiver high QV’ is currently set with an expected consensus accuracy of 99%. Once we obtained the high-quality consensus clusters, we aligned them to the GRCh38 reference genome using the GMAP algorithm. To improve Iso-Seq read alignment, we also performed error correction of all original Iso-Seq reads using LSC, following steps similar to those in its original publication. LSC is an algorithm designed to improve PacBio long-read accuracy using short-read alignments from Illumina RNA-Seq. Alignment and analysis of short-read RNA sequencing data were performed with TopHat and Cufflinks, respectively. The fragments per kilobase of transcript per million mapped reads (FPKM) measure was used to quantify gene expression in the short-read sequencing data. Comparison of transcript models was performed with the CuffCompare software within the Cufflinks package. We validated several novel transcripts with more than two predicted exons by designing pairs of PCR primers located in two adjacent exons and performing PCR reactions on the cDNA samples. The gel bands were cut, and DNA was recovered with the Qiagen QIAquick kit (Valencia, CA, USA) and used for Sanger sequencing.

Detection of structural variation on short-read sequencing data, long-read sequ […]
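As an illustration of the FPKM measure mentioned in the excerpt, the following is a minimal sketch of the textbook definition: fragments mapped to a transcript, scaled by transcript length in kilobases and by millions of mapped fragments. It is not the estimator Cufflinks actually runs, which fits a statistical model over isoforms; the function name `fpkm` and the example counts are hypothetical.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million mapped fragments.

    Illustrative only; Cufflinks estimates abundances with a more
    elaborate model than this direct ratio.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical numbers: 500 fragments on a 2,000 bp transcript,
# in a library with 20 million mapped fragments.
example_value = fpkm(500, 2000, 20_000_000)
print(example_value)
```

Normalizing by both length and library size is what makes values comparable across transcripts of different lengths and across samples sequenced to different depths.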

Pipeline specifications

Software tools: GMAP, LSC, TopHat, Cufflinks