A predictor for identifying the location of DNase I hypersensitive sites (DHSs) in human genome. iDHS-EL was formed by fusing three individual RF (Random Forest) classifiers into an ensemble predictor. The three RF operators were respectively based on the three special modes of the general pseudo nucleotide composition (PseKNC): (1) kmer, (2) reverse complement kmer, and (3) pseudo dinucleotide composition. It has been demonstrated that the new predictor remarkably outperforms the relevant state-of-the-art methods in both accuracy and stability.
School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China; Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China; Gordon Life Science Institute, Belmont, MA, USA; Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, Saudi Arabia
iDHS-EL funding source(s)
This work was supported by the National Natural Science Foundation of China (No. 61300112, 61573118 and 61272383), the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry, and the Natural Science Foundation of Guangdong Province (2014A030313695), Shenzhen Foundational Research Funding (Grant No. JCYJ20150626110425228), and Development Program of China (863 Program) [2015AA015405].