Computational protocol: An innovative strategy for the molecular diagnosis of Usher syndrome identifies causal biallelic mutations in 93% of European patients

Protocol publication

[…] A multiplex amplicon panel (Fluidigm Access Array) was created to analyze all coding and non-coding exons of the 10 USH genes and the USH2 modifier gene PDZD7. Exons recently identified by retina-specific transcript analysis were included. The amplicons also covered at least 25 bp of intronic sequence flanking each exon to facilitate the detection of sequence variants that affect splice sites. The USH2A intronic region harboring the mutation c.7595-2144A>G was also included. The primers were designed with Primer3. A total of 1268 primer pairs (sequences available on request) were chosen to produce amplicons with an average length of 165 bp. Forty-eight pools of primer pairs were created such that each primer pair was represented twice per assay in independent pools, and each pool of primers contained a unique combination of 47–48 different primer pairs.
Following the preparation of the multiplexed amplicon libraries, samples with a minimum of 1 μg of double-stranded DNA, as determined using the SYBR Green I fluorescent double-strand method (Life Technologies, Foster City, CA, USA), were purified using the Agencourt AMPure XP kit (Beckman Coulter Inc., Fullerton, CA, USA). We used an Access Array microfluidic support (Fluidigm, San Francisco, CA, USA) to perform 48 independent PCR reactions in parallel, on 48 different samples at once (ie, a total of 2304 distinct amplicons). We increased the capacity of the device to 110 592 simultaneous PCR reactions per run by optimizing the PCR mix and primer pools to allow multiplexed amplification in each PCR slot. This made it possible to simultaneously produce 2304 amplicons for each of the 48 samples.
During the first PCR on the Access Array, a universal tag present at the 5′ end of each primer (Rd1 Tag on the forward primer and Rd2 Tag on the reverse primer) was added to the extremities of each amplicon.
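The capacity figures quoted above follow from simple multiplication, and the short sketch below reproduces them. It is illustrative arithmetic only: the constant and function names are hypothetical, and the pool size is taken as the upper bound of the 47–48 primer pairs per pool stated in the text.

```python
# Illustrative arithmetic only; names are hypothetical, values come from the text.
SAMPLES_PER_ARRAY = 48   # samples processed on one Access Array
PRIMER_POOLS = 48        # independent primer pools, one per PCR inlet
PAIRS_PER_POOL = 48      # upper bound; the text gives 47-48 pairs per pool


def reaction_slots(samples: int = SAMPLES_PER_ARRAY, pools: int = PRIMER_POOLS) -> int:
    """Number of sample x pool combinations on a single array."""
    return samples * pools


def amplicons_per_sample(pools: int = PRIMER_POOLS, pairs: int = PAIRS_PER_POOL) -> int:
    """Amplicons produced for one sample when every pool is multiplexed."""
    return pools * pairs


def simultaneous_pcrs(samples: int = SAMPLES_PER_ARRAY,
                      pools: int = PRIMER_POOLS,
                      pairs: int = PAIRS_PER_POOL) -> int:
    """Total PCRs per run once each slot holds a multiplexed primer pool."""
    return reaction_slots(samples, pools) * pairs


if __name__ == "__main__":
    print(reaction_slots())        # 48 x 48
    print(amplicons_per_sample())  # 48 pools x 48 pairs
    print(simultaneous_pcrs())     # 48 x 48 x 48
```

With the stated values, the three functions return 2304, 2304 and 110592, matching the numbers given in the protocol.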
Following thermocycling of the Access Array on the BioMark, the Access Array was transferred to the Post-PCR IFC Controller AX (Fluidigm) to recover the 48 pooled PCRs for each sample. The pooled amplicons were then purified with Agencourt AMPure XP beads and subjected to a second round of PCR, using the universal tags added during the first PCR round as templates. Samples from two distinct Access Arrays were processed at once and subjected to six cycles of amplification in a standard microplate format. This second amplification round was used to add a specific identification barcode to each sample, as well as P5 and P7 adapters for sequencing purposes. Each PCR was then checked on a Fragment Analyzer (AATI, Ankeny, IA, USA), and quantified to create an equimolar pool of the 96 samples. This pool was again purified with AMPure, and loaded onto a Fragment Analyzer or Bioanalyzer (Agilent, Santa Clara, CA, USA) to verify that its profile matched the expected profile. This pool was sequenced on a HiSeq 2000 sequencer (Illumina, San Diego, CA, USA). For each sample, a total of 313 Mb of reads was sequenced for the 108 kb of analyzed genome, which represents approximately 2900× coverage.
Raw sequencing data were processed for bioinformatics analysis through the Illumina pipeline (CASAVA 1.8.2), using the ELANDv2 algorithm for sequence alignment (multiseed and gapped) and the sequence of each amplicon as reference. Variants were called if they met the following criteria: (1) a read depth greater than five with no ambiguous reading, and (2) an allele frequency below 0.3% in all the following public variant databases: dbSNP132, HapMap, 1000 Genomes, Exome Variant Server, Exome Aggregation Consortium (http://exac.broadinstitute.org/), Usher-specific Leiden Open Variation Database (https://grenada.lumc.nl/LOVD2/Usher_montpellier/home.php), and Deafness Variation Database (http://deafnessvariationdatabase.org/).
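The coverage figure and the two calling criteria can be expressed as a small filter. The sketch below is not the authors' pipeline (which used CASAVA and ELANDv2); the `VariantCall` record, its field names, and the choice to treat a variant absent from a database as passing that database are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

MIN_DEPTH_EXCLUSIVE = 5   # read depth must be strictly greater than five
MAX_ALLELE_FREQ = 0.003   # 0.3%, applied to every database that lists the variant


@dataclass
class VariantCall:
    """Hypothetical record for one called variant."""
    name: str
    depth: int
    ambiguous: bool = False
    # Database name -> reported allele frequency. Databases that do not
    # list the variant are simply omitted (assumed to pass).
    db_freqs: Dict[str, float] = field(default_factory=dict)


def mean_coverage(sequenced_bases: float, target_bases: float) -> float:
    """Average fold coverage: sequenced bases divided by target size."""
    return sequenced_bases / target_bases


def passes_filters(v: VariantCall) -> bool:
    """Apply the depth, ambiguity and population-frequency criteria."""
    if v.depth <= MIN_DEPTH_EXCLUSIVE or v.ambiguous:
        return False
    return all(freq < MAX_ALLELE_FREQ for freq in v.db_freqs.values())


if __name__ == "__main__":
    # 313 Mb of reads over a 108 kb target
    print(round(mean_coverage(313e6, 108e3)))
    rare = VariantCall("example_rare", depth=120, db_freqs={"dbSNP132": 0.0005})
    common = VariantCall("example_common", depth=120, db_freqs={"ExAC": 0.02})
    print(passes_filters(rare), passes_filters(common))
```

Dividing 313 Mb by 108 kb gives roughly 2898, which the text rounds to 2900×. The frequency test uses `all(...)` because the criterion must hold in every database, not just one.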
They were then ranked according to their expected negative impact on the resulting gene product. Nonsense variants and small deletions or insertions inducing a frameshift of the coding sequence were considered the most damaging, as they necessarily alter the amino-acid sequence of the protein. The pathogenicity of missense and splice-site variants was estimated using the following prediction algorithms: PolyPhen2, SIFT, and MutationTaster for missense variants, and NNSplice, ESEfinder, MaxEntScan, GeneSplicer, and Human Splicing Finder for splice-site variants. From those sequence variants predicted to be highly damaging, pathogenic, and/or disease-causing, candidate variants were chosen if they were biallelic and/or lay within genes matching the clinical diagnosis. Their presence was confirmed in the patient's and, whenever possible, the parents' DNAs, by Sanger sequencing using standard protocols. The entire process, from library preparation to variant identification, took 3–4 weeks for 48 patients. […]
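The ranking and candidate-selection logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the text does not say how the outputs of the prediction algorithms were combined, so the `damaging_calls` field (a count of predictors that flag the variant) is hypothetical, as are the `Candidate` record, the three-tier ranking, and the placeholder variant labels.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List

TRUNCATING = {"nonsense", "frameshift"}
PREDICTED = {"missense", "splice_site"}


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one filtered variant."""
    gene: str
    label: str               # placeholder identifier, not a real HGVS name
    consequence: str         # "nonsense", "frameshift", "missense", "splice_site", ...
    zygosity: str            # "hom" or "het"
    damaging_calls: int = 0  # assumed summary: number of predictors calling it damaging


def severity_rank(v: Candidate) -> int:
    """Lower is more damaging: 0 truncating, 1 predicted damaging, 2 other."""
    if v.consequence in TRUNCATING:
        return 0
    if v.consequence in PREDICTED and v.damaging_calls > 0:
        return 1
    return 2


def biallelic_genes(variants: Iterable[Candidate], max_rank: int = 1) -> List[str]:
    """Genes carrying at least two damaging alleles (one hom or two het)."""
    alleles = defaultdict(int)
    for v in variants:
        if severity_rank(v) <= max_rank:
            alleles[v.gene] += 2 if v.zygosity == "hom" else 1
    return sorted(gene for gene, n in alleles.items() if n >= 2)


if __name__ == "__main__":
    calls = [
        Candidate("USH2A", "var1", "nonsense", "het"),
        Candidate("USH2A", "var2", "missense", "het", damaging_calls=2),
        Candidate("PDZD7", "var3", "missense", "het", damaging_calls=0),
    ]
    for v in sorted(calls, key=severity_rank):
        print(severity_rank(v), v.gene, v.label)
    print(biallelic_genes(calls))
```

Two heterozygous variants in the same gene may sit on the same allele, so a gene returned by `biallelic_genes` is only a candidate; phase still has to be established, for example by testing parental DNA.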

Pipeline specifications

Software tools: Primer3, BaseSpace, PolyPhen, NNSplice, ESEfinder
Databases: dbSNP, LOVD, Exome Variant Server
Applications: WGS analysis, WES analysis, qPCR
Organisms: Homo sapiens
Diseases: Retinitis Pigmentosa; Vestibular Diseases; Genetic Diseases, Inborn