GPMiner is a web application for identifying promoter regions and annotating regulatory features in user-input sequences. The system provides a gene group analysis function that uses statistical measures to analyze the co-occurrence of TFBSs in a set of co-expressed genes. This function offers a practical platform for examining co-expressed genes from microarray data in the context of transcriptional regulatory networks. GPMiner also has a user-friendly input/output interface and offers several advantages for mammalian promoter analysis. The proposed system incorporates an SVM with nucleotide composition, over-represented hexamer nucleotides, and DNA stability to identify mammalian proximal promoters. It also mines regulatory elements and features, including TSSs, TFBSs, CpG islands, tandem repeats, the TATA box, CCAAT box, GC box, statistically over-represented sequence patterns, GC content (GC%), and DNA stability. Evaluated by benchmark cross-validation, the predictive sensitivity and specificity of GPMiner are both roughly 80%.
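To make a few of the sequence features named above concrete, the following is a minimal sketch of how GC%, a CpG observed/expected ratio, and hexamer counts could be computed for a single sequence. It is not GPMiner's implementation: the function names `gc_percent`, `cpg_obs_exp`, and `hexamer_counts` are hypothetical, and the CpG ratio uses the commonly cited definition (number of CpG dinucleotides times sequence length, divided by the product of the C and G counts), which is assumed here rather than taken from the system itself.

```python
from collections import Counter


def gc_percent(seq: str) -> float:
    """Return GC content as a percentage of sequence length."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Return the CpG observed/expected ratio (assumed definition).

    ratio = (#CpG * N) / (#C * #G); returns 0.0 if C or G is absent.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def hexamer_counts(seq: str) -> Counter:
    """Count every overlapping 6-mer in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + 6] for i in range(len(seq) - 5))


if __name__ == "__main__":
    example = "ACGTACGT"  # illustrative sequence, not real promoter data
    print(gc_percent(example))
    print(cpg_obs_exp(example))
    print(hexamer_counts(example).most_common(3))
```

Counts like these would typically be compared between promoter and non-promoter sequences to find over-represented hexamers, and the resulting values could then serve as input features for a classifier such as an SVM.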
Department of Computer Science and Engineering, Yuan Ze University, Taoyuan, Taiwan; Institute of Tropical Plant Sciences, National Cheng Kung University, Tainan, Taiwan; Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin-Chu, Taiwan; Department of Multimedia and Game Science, Asia-Pacific Institute of Creativity, Miao-Li, Taiwan