Computational protocol: Identifying Causal Genes at the Multiple Sclerosis Associated Region 6q23 Using Capture Hi-C

Protocol publication

[…] Capture Hi-C data were produced as part of a larger study targeting all regions associated with four autoimmune diseases (RA, JIA, PsA and T1D) and, separately, all promoters within these regions []. Briefly, all promoters within 1 Mb of associated SNPs were selected, and RNA baits were designed to the ends of all restriction fragments within 500 bp of the transcription start sites. Separately, associated regions were defined by SNPs in LD (r² ≥ 0.8), and all restriction fragments not selected for the promoter capture experiment were targeted. Experiments were performed using human T-cell (Jurkat) and B-cell (GM12878) lines. Capture Hi-C libraries were sequenced using 75 bp paired-end reads on an Illumina HiSeq 2500. The resulting reads were mapped to restriction fragments and filtered using the Hi-C User Pipeline (HiCUP, http://www.bioinformatics.babraham.ac.uk/projects/hicup). Chromatin interactions were analysed using CHiCAGO (Capture Hi-C Analysis Of Genomic Organisation [], http://regulatorygenomicsgroup.org/chicago), a publicly available, open-source statistical model for detecting significant interactions in Capture Hi-C data at single restriction fragment resolution. Further filtering was carried out using the BEDTools v2.21.0 pairtobed command to identify significant interactions involving the MS associated regions.

Chromatin interactions identified in the Capture Hi-C data were further validated against dense Hi-C data generated by Rao et al. [] in GM12878 cells. No equivalent data were available for the Jurkat T-cell line. Raw contact matrices and normalisation matrices for GM12878 cells at 5 kb resolution were obtained from GEO accession GSE63525. Observed and expected contact matrices were normalised using the Knight and Ruiz normalisation matrices as described in the accompanying documentation. Observed/expected (O/E) values were calculated, and contacts were filtered to retain those with O/E ≥ 5 and a normalised read count ≥ 5.
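The normalisation and O/E filtering described above can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the layout described in the documentation accompanying GSE63525: sparse raw contacts given as (bin start, bin start, count), one Knight and Ruiz normalisation factor per bin, and an expected vector indexed by distance in bins. The function name, parameter names and toy values are hypothetical.

```python
import math

RESOLUTION = 5000  # 5 kb bins, as stated in the text


def observed_over_expected(raw_contacts, kr_norm, kr_expected,
                           resolution=RESOLUTION,
                           min_oe=5.0, min_norm_count=5.0):
    """Return (start_i, start_j, normalised_count, O/E) for contacts that pass both thresholds.

    raw_contacts : iterable of (start_i, start_j, raw_count) on one chromosome
    kr_norm      : list of per-bin normalisation factors (NaN for unusable bins)
    kr_expected  : list of expected normalised counts, indexed by bin distance
    """
    kept = []
    for start_i, start_j, raw_count in raw_contacts:
        bin_i = start_i // resolution
        bin_j = start_j // resolution
        norm_i, norm_j = kr_norm[bin_i], kr_norm[bin_j]
        # Bins without a valid normalisation factor cannot be normalised.
        if math.isnan(norm_i) or math.isnan(norm_j):
            continue
        # Normalised observed count: raw count divided by both bin factors.
        normalised = raw_count / (norm_i * norm_j)
        # Expected count depends only on the genomic distance between the bins.
        expected = kr_expected[abs(bin_j - bin_i)]
        oe = normalised / expected
        if oe >= min_oe and normalised >= min_norm_count:
            kept.append((start_i, start_j, normalised, oe))
    return kept


# Toy values for illustration only (not real Hi-C data).
toy_norm = [1.0, 2.0, 0.5, float("nan")]
toy_expected = [10.0, 4.0, 1.0, 0.5]
toy_contacts = [
    (0, 5000, 80.0),     # normalised 40.0, O/E 10.0 -> kept
    (0, 10000, 3.0),     # normalised 6.0,  O/E 6.0  -> kept
    (5000, 10000, 6.0),  # normalised 6.0,  O/E 1.5  -> dropped
    (0, 15000, 9.0),     # NaN normalisation factor  -> skipped
]
kept = observed_over_expected(toy_contacts, toy_norm, toy_expected)
print(kept)
```

Both thresholds are applied together here, so a contact with a high O/E ratio but a low normalised count is discarded. Whether the original analysis applied the two filters jointly or in sequence is not stated in the text.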
BEDTools was used to obtain the overlap between the interactions observed in our data and those in the Rao et al. [] data. […]
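The overlap step can be illustrated with a small pure-Python stand-in for the BEDTools operation. The text does not say which BEDTools subcommand or options were used, so the rule below (both ends of a Capture Hi-C interaction must overlap the two bins of a Hi-C contact, in either orientation) is an assumption. The function names and coordinates are made up for illustration.

```python
def intervals_overlap(a, b):
    """True if two (chrom, start, end) half-open intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def supported_interactions(chic, hic):
    """Return the Capture Hi-C interactions whose two ends both overlap a Hi-C bin pair.

    chic : list of (bait, other_end), each a (chrom, start, end) interval
    hic  : list of (bin_a, bin_b), each a (chrom, start, end) interval
    """
    supported = []
    for bait, other_end in chic:
        for bin_a, bin_b in hic:
            same = intervals_overlap(bait, bin_a) and intervals_overlap(other_end, bin_b)
            swapped = intervals_overlap(bait, bin_b) and intervals_overlap(other_end, bin_a)
            if same or swapped:
                supported.append((bait, other_end))
                break  # one supporting Hi-C contact is enough
    return supported


# Hypothetical coordinates, chosen only to exercise the function.
chic_interactions = [
    (("chr6", 137950000, 137953000), ("chr6", 138190000, 138194000)),
    (("chr6", 137950000, 137953000), ("chr6", 137500000, 137502000)),
]
hic_contacts = [
    (("chr6", 138190000, 138195000), ("chr6", 137950000, 137955000)),
]
validated = supported_interactions(chic_interactions, hic_contacts)
print(validated)
```

The nested loop is quadratic and is meant only to make the overlap rule explicit. For genome-scale data, an interval-aware tool such as BEDTools is the practical choice.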

Pipeline specifications

Software tools: HiCUP, BEDTools
Application: Hi-C analysis
Organisms: Homo sapiens
Diseases: Multiple Sclerosis