Computational protocol: Combining ATAC-seq with nuclei sorting for discovery of cis-regulatory regions in plant genomes

[…] Sequencing reads and Col-0 wild-type control DNase-seq data were mapped to Release 10 of the Arabidopsis genome (TAIR10) using bowtie1 with the parameters ‘-v 2 -m 3’. Duplicate reads were removed using Picard with default parameters. Accessible regions and peaks were identified using HOTSPOT with default parameters. The center of each identified peak was used to define peak overlaps with genomic features according to the following criteria. If the peak center was located in (i) the promoter of a gene (2000 bp upstream of the transcriptional start site (TSS)) or (ii) the gene body, the peak was assigned to that gene. Distal intergenic regions refer to regions >3 kb from the TSS and >1 kb from the transcriptional end site (TES). The plot and heatmap of the regulatory region distribution were generated using ChIPseeker. Footprints were identified with pyDNase using the parameters ‘-fp 4,30,1 -dm -A’. The identified footprints were extracted and compared with the PBM database using Find Individual Motif Occurrences (FIMO). A footprint was assigned to a motif if the entire motif sequence was contained within the footprint. […]
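
The center-based assignment rules and the motif containment criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the `Gene` record, the function names, the 1-based inclusive coordinate convention, and the `"other"` label for centers that are neither assigned to a gene nor distal are all assumptions made for the example. The sketch also assumes that a center inside a gene body is never counted as distal intergenic.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    # Hypothetical gene record; coordinates are assumed 1-based and inclusive.
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"


PROMOTER_BP = 2000    # promoter: 2000 bp upstream of the TSS
DISTAL_TSS_BP = 3000  # distal intergenic: >3 kb from the TSS
DISTAL_TES_BP = 1000  # distal intergenic: >1 kb from the TES


def tss(gene: Gene) -> int:
    """Transcriptional start site, taking strand into account."""
    return gene.start if gene.strand == "+" else gene.end


def tes(gene: Gene) -> int:
    """Transcriptional end site, taking strand into account."""
    return gene.end if gene.strand == "+" else gene.start


def in_promoter(center: int, gene: Gene) -> bool:
    """True if the center lies within 2000 bp upstream of the TSS."""
    site = tss(gene)
    if gene.strand == "+":
        return site - PROMOTER_BP <= center < site
    return site < center <= site + PROMOTER_BP


def in_gene_body(center: int, gene: Gene) -> bool:
    """True if the center lies between the gene's start and end."""
    return gene.start <= center <= gene.end


def annotate_peak(peak_start: int, peak_end: int,
                  genes: List[Gene]) -> Tuple[str, List[str]]:
    """Classify a peak by its center; return (label, assigned gene names)."""
    center = (peak_start + peak_end) // 2
    assigned = [g.name for g in genes
                if in_promoter(center, g) or in_gene_body(center, g)]
    if assigned:
        return "genic", assigned
    # Distal only if the center is far from the TSS and TES of every gene.
    is_distal = all(abs(center - tss(g)) > DISTAL_TSS_BP
                    and abs(center - tes(g)) > DISTAL_TES_BP
                    for g in genes)
    return ("distal_intergenic" if is_distal else "other"), []


def motif_in_footprint(motif_start: int, motif_end: int,
                       fp_start: int, fp_end: int) -> bool:
    """True only if the whole motif match lies inside the footprint."""
    return fp_start <= motif_start and motif_end <= fp_end
```

In practice the gene list would come from a TAIR10 annotation file and the scan over all genes would be replaced by an interval index, but the decision rules stay the same.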

