Computational protocol: NLRC5 Exclusively Transactivates MHC Class I and Related Genes through a Distinctive SXY Module

Protocol publication

[…] Chromatin was purified from MACS-sorted WT (C57BL/6), Nlrc5 F/F, Nlrc5 −/−, and Rfx5 −/− T cells as described []. Five mice were pooled for each genotype. Chromatin immunoprecipitation was performed using an anti-NLRC5 antibody as described [].
Immunoprecipitated DNA was sequenced on the Illumina HiSeq 2000 platform. More than 300 million reads were obtained for the WT samples and more than 20 million reads for each of the other samples. ChIP samples from WT and Nlrc5 F/F mice were used as biological repeats. Five pseudo-replicates of 30 million reads each were used for the WT data set, as proposed by the ENCODE consortium []. Reads were mapped to the mouse genome (release GRCm38.70) using Bowtie 0.12.7 []. Only reads mapping to unique genomic positions were considered for further analysis.
Fragment length was estimated using cross-correlation []. The Phantompeakqualtools R package (https://www.encodeproject.org/search/?type=software&used_by=ENCODE&software_type=quality%20metric) [] was used to measure the quality of the ChIP-seq data, as assessed by three metrics: the normalized strand coefficient (NSC), which is the normalized ratio between the fragment-length cross-correlation and the background cross-correlation; the relative strand correlation (RSC), which is the ratio between the fragment-length peak and the read-length peak; and the Qtag code. The low NSC scores obtained (< 1.05) are a consequence of the low number of peaks []. The RSC (> 1.51–1.85) and Qtag (2, high quality) scores obtained attest to the quality of the ChIP-seq peaks [].
Peak calling for the WT and Nlrc5 F/F data sets was first done with MACS2 using the default settings (q-value threshold of 0.05 and without the “--to-large” parameter). This led to the identification of a surprisingly low number of reproducible peaks: 6 for the WT data set and 11 for the Nlrc5 F/F data set.
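The NSC and RSC definitions given above can be illustrated with a short Python sketch. This is a simplified illustration and not the Phantompeakqualtools implementation: the function names, the per-base count arrays, and the use of the minimum cross-correlation value as the background are assumptions made for the example.

```python
import numpy as np

def strand_cross_correlation(fwd, rev, max_shift):
    """Pearson correlation between forward-strand counts and reverse-strand
    counts shifted by 0..max_shift bases (one value per shift).

    fwd, rev: per-base counts of read 5' ends on each strand (hypothetical
    inputs; a real pipeline would derive them from the aligned reads).
    """
    fwd = np.asarray(fwd, dtype=float)
    rev = np.asarray(rev, dtype=float)
    cc = np.empty(max_shift + 1)
    for shift in range(max_shift + 1):
        a = fwd[: len(fwd) - shift]   # forward strand, trimmed to the overlap
        b = rev[shift:]               # reverse strand, moved left by `shift`
        cc[shift] = np.corrcoef(a, b)[0, 1]
    return cc

def nsc_rsc(cc, fragment_len, read_len):
    """NSC and RSC from a cross-correlation profile indexed by shift.

    Assumption: the background is taken as the minimum of the profile.
    The ratios are undefined if the background is zero or if the
    read-length value equals the background.
    """
    background = cc.min()
    nsc = cc[fragment_len] / background
    rsc = (cc[fragment_len] - background) / (cc[read_len] - background)
    return nsc, rsc
```

In this sketch the estimated fragment length would be the shift at which the profile peaks, for example `int(np.argmax(cc))`, ignoring the read-length peak.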
The low number of peaks called using the initial strategy prompted us to use a second strategy, based on a lower peak-calling stringency followed by Irreproducible Discovery Rate (IDR) analysis. This was done to ascertain that the low number of peaks identified by our initial procedure was not in fact an artifact resulting from overly stringent peak selection. Peaks were called using MACS2 2.0.10.20130520 [] with the no-model setting and the shift-size parameter set to half of the estimated fragment length. Peak-calling stringency was decreased by using p = 0.001 as the threshold and applying the “--to-large” setting. Reads obtained from Nlrc5 −/− samples were used as the negative control for peak calling. Reproducible peaks were obtained by assessing the IDR for all pairs of pseudo-replicates using a threshold of 0.01. Only 11 reproducible peaks were obtained, all of which were confirmed in the biological repeat (Nlrc5 F/F) but found to be absent in the Rfx5 −/− and Nlrc5 −/− samples. These 11 peaks were the same as those identified in the Nlrc5 F/F data set with the first peak identification strategy.
The Fraction of Reads in Peaks (FRiP) [] was also calculated. The low FRiP values obtained (<1%) are consistent with the low number of peaks identified []. [...]
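As a minimal sketch of the FRiP metric mentioned above, the fraction can be computed by counting the reads that fall inside called peaks and dividing by the total number of reads. The representation used here is an assumption for illustration: each read is reduced to a single (chromosome, position) pair, and peaks are non-overlapping half-open intervals. The function names are hypothetical and do not come from the original pipeline.

```python
from bisect import bisect_right
from collections import defaultdict

def build_peak_index(peaks):
    """Group (chrom, start, end) peaks by chromosome, sorted by start."""
    index = defaultdict(list)
    for chrom, start, end in peaks:
        index[chrom].append((start, end))
    for intervals in index.values():
        intervals.sort()
    return index

def frip(read_positions, peaks):
    """Fraction of reads whose position lies inside any peak.

    read_positions: iterable of (chrom, pos) pairs, one per mapped read.
    peaks: iterable of (chrom, start, end), assumed non-overlapping and
    half-open, i.e. a read at `end` is outside the peak.
    """
    index = build_peak_index(peaks)
    starts = {chrom: [s for s, _ in ivs] for chrom, ivs in index.items()}
    total = 0
    in_peaks = 0
    for chrom, pos in read_positions:
        total += 1
        intervals = index.get(chrom)
        if not intervals:
            continue
        # Last peak whose start is <= pos, if there is one.
        i = bisect_right(starts[chrom], pos) - 1
        if i >= 0 and pos < intervals[i][1]:
            in_peaks += 1
    return in_peaks / total if total else 0.0
```

With very few peaks, as in this data set, only a small share of reads can fall inside them, which is why a low FRiP is expected here.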
The following amino-acid sequences were downloaded from MGI (http://www.informatics.jax.org/): H2-Ke2 (MGI:95908), H2-K1 (MGI:95904), H2-Ke6 (MGI:95911), H2-Oa (MGI:95924), H2-DMa (MGI:95921), H2-DMb2 (MGI:95923), H2-DMb1 (MGI:95922), H2-Ob (MGI:95925), H2-Ab1 (MGI:103070), H2-Aa (MGI:95895), H2-Eb1 (MGI:95901), H2-Eb2 (MGI:95902), H2-D1 (MGI:95896), H2-Q1 (MGI:95928), H2-Q2 (MGI:95931), H2-Q4 (MGI:95933), H2-Q6 (MGI:95935), H2-Q7 (MGI:95936), H2-Q10 (MGI:95929), H2-T24 (MGI:95958), H2-T23 (MGI:95957), H2-T22 (MGI:95956), H2-T17 (MGI:95949), H2-M10.1 (MGI:1276522), H2-T10 (MGI:95942), H2-T3 (MGI:95959), H2-M10.2 (MGI:1276525), H2-M10.4 (MGI:1276527), H2-M1 (MGI:95913), H2-M9 (MGI:1276570), H2-M10.3 (MGI:1276524), H2-M11 (MGI:2676637), H2-M10.5 (MGI:1276526), H2-M5 (MGI:95917), H2-M3 (MGI:95915), H2-M2 (MGI:95914), Mill1 (MGI:2179988), Cd1d1 (MGI:107674), B2m (MGI:88127), Mr1 (MGI:1195463), Azgp1 (MGI:103163), Mill2 (MGI:2179989), Fcgrt (MGI:103017), Cd1d2 (MGI:107675), H2-Q8 (MGI:95937), H2-Q9 (MGI:95938) and H2-T9 (MGI:95965). Alignment was performed using MUSCLE []. The best model for constructing the phylogenetic tree was assessed using ProtTest [], and the tree was constructed in PhyML [] using the JTT substitution model. […]
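Model-selection tools such as ProtTest rank candidate substitution models with information criteria such as AIC and BIC. The sketch below shows that ranking step in isolation, assuming the log-likelihood and number of free parameters of each fitted model are already known. The function names and the input layout are hypothetical, and the text does not state which criterion was used in this study.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 lnL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: k ln(n) - 2 lnL (lower is better).

    n_sites is the sample size, taken here to be the alignment length.
    """
    return n_params * math.log(n_sites) - 2 * log_likelihood

def best_model(fits, criterion=aic):
    """Return (model_name, score) for the lowest-scoring model.

    fits: dict mapping a model name to (log_likelihood, n_params).
    criterion: function of (log_likelihood, n_params); to use BIC, pass
    e.g. `lambda lnl, k: bic(lnl, k, n_sites)` with a known n_sites.
    """
    scores = {name: criterion(lnl, k) for name, (lnl, k) in fits.items()}
    name = min(scores, key=scores.get)
    return name, scores[name]
```

A model with a slightly better likelihood can still lose the ranking if it needs many more parameters, which is the trade-off these criteria are designed to capture.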

Pipeline specifications

Software tools: MUSCLE, ProtTest, PhyML
Applications: Phylogenetics, Nucleotide sequence alignment
Organisms: Mus musculus