A manually curated repository of cancer genes derived from the scientific literature. NCG release 5.0 (August 2015) collects 1571 cancer genes from 175 published studies that describe 188 mutational screenings of 13,315 cancer samples from 49 cancer types and 24 primary sites. In addition to collecting cancer genes, NCG also provides information on the experimental validation that supports the role of these genes in cancer and annotates their properties (duplicability, evolutionary origin, expression profile, function and interactions with proteins and miRNAs).
An improved functional gene network for the laboratory mouse, Mus musculus, which is the model organism of choice for much biomedical research. To improve on the previous version of MouseNet, a large volume of new microarray data derived from diverse biological contexts has been incorporated. We have also continued to improve the machine learning algorithms used to infer co-functional links from genomics data. MouseNet v2 now covers 88% of the coding genome with higher accuracy. All of the functional gene networks are released for free and can be searched using the MouseNet v2 web server, which offers a useful resource for mouse, human and other vertebrate genetics.
Reports altered genes in oral cancer. OrCGDB is designed to make this information readily usable for diagnosis, prognosis and treatment. It has been used to predict the possible role of differentially expressed markers in cell transformation. The database provides scientists with information and external links for the genes involved in oral cancer, the interactions between them, and their role in the biology of oral cancer, along with their clinical relevance.
A probabilistic functional gene network for baker's yeast, Saccharomyces cerevisiae, which has been a major model organism for eukaryotic genetics and cell biology. YeastNet v3 provides a new web interface for running tools for network-guided hypothesis generation. YeastNet v3 also provides edge information for all data-specific networks (approximately 2 million functional links) as well as for the integrated networks. Users can therefore construct alternative versions of the integrated network by applying their own data integration algorithm to the same data-specific links.
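As an illustration of applying a custom integration algorithm to data-specific links, the sketch below combines per-dataset edge scores with a simple rank-decayed sum. This is only one possible scheme and is not claimed to be the method YeastNet uses. The function name `integrate_links`, the `decay` parameter, the dictionary layout and the gene labels are all assumptions made for this example.

```python
from collections import defaultdict


def integrate_links(data_specific_networks, decay=2.0):
    """Combine edge scores from several data-specific networks.

    data_specific_networks: dict mapping a dataset name to a dict of
        {(gene_a, gene_b): score}. This layout is hypothetical.
    decay: the k-th best score for an edge (k = 0, 1, ...) is divided
        by decay**k before summing, so weaker evidence counts less.
    """
    per_edge = defaultdict(list)
    for network in data_specific_networks.values():
        for (gene_a, gene_b), score in network.items():
            # Treat links as undirected: (a, b) and (b, a) are one edge.
            edge = tuple(sorted((gene_a, gene_b)))
            per_edge[edge].append(score)

    integrated = {}
    for edge, scores in per_edge.items():
        scores.sort(reverse=True)
        integrated[edge] = sum(s / (decay ** k) for k, s in enumerate(scores))
    return integrated


# Placeholder gene labels and scores, for illustration only.
networks = {
    "coexpression": {("geneA", "geneB"): 2.0},
    "protein_interaction": {("geneB", "geneA"): 1.0, ("geneA", "geneC"): 3.0},
}
integrated = integrate_links(networks)
```

Swapping the body of the final loop for a different rule (for example a plain maximum) would yield another alternative version of the integrated network from the same links.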
An evolving and interactive database of flowering time genes. The hand-curated database contains information on 306 genes and links to 1595 publications gathering the work of >4500 authors. Gene/protein functions and interactions within the flowering pathways were inferred from the analysis of related publications, included in the database and translated into interactive, manually drawn snapshots. The content of the FLOR-ID database could easily be extended with modules expanding beyond flowering time, for example to gametophyte development.
An accurate and high-resolution atlas of gene expression and gene co-regulation in the human retina. We collected 50 high-quality post-mortem human retinas from donors and performed high-coverage RNA-sequencing analysis to yield a comprehensive reference transcriptome (RefT) of the human retina. Moreover, we exploited inter-individual variability in gene expression to infer a gene co-expression network and to predict, via a guilt-by-association approach, photoreceptor-specific expression of 253 genes. This atlas represents a valuable resource for the research community at large and helps to better elucidate pathophysiological processes in the human retina.
An automatically collected database of gene lists, which were reported mostly by experimental studies in various biological and clinical contexts. At the moment, the database covers 3369 gene lists extracted from 2644 papers published in approximately 80 peer-reviewed journals. As input, CCancer accepts a gene list. An enrichment analysis is implemented to generate, as output, a highly informative survey of recently published studies that report gene lists which significantly intersect with the query gene list. A report on gene pairs from the input list that were frequently reported together by other biological studies is also provided.
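To make the idea of a "significant intersection" concrete, the sketch below ranks stored gene lists by a one-sided hypergeometric test on their overlap with a query list. The description above does not state which test CCancer applies, so the choice of the hypergeometric test is an assumption, as are the function names, the universe size and the example lists.

```python
from math import comb


def hypergeom_pvalue(universe_size, list_size, query_size, overlap):
    """P(X >= overlap) when query_size genes are drawn without replacement
    from a universe in which list_size genes belong to the published list."""
    max_overlap = min(list_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(list_size, k) * comb(universe_size - list_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def rank_gene_lists(query, published_lists, universe_size):
    """Return (list name, shared genes, p-value) sorted by ascending p-value."""
    query = set(query)
    results = []
    for name, genes in published_lists.items():
        genes = set(genes)
        shared = query & genes
        p = hypergeom_pvalue(universe_size, len(genes), len(query), len(shared))
        results.append((name, sorted(shared), p))
    return sorted(results, key=lambda r: r[2])


# Illustrative input only; list names and universe size are made up.
published = {
    "study_A": ["TP53", "KRAS", "EGFR", "MYC"],
    "study_B": ["BRCA1", "ATM"],
}
ranked = rank_gene_lists(["TP53", "KRAS", "EGFR"], published, universe_size=20000)
```

In practice the p-values would also need correcting for the number of gene lists tested, which this sketch leaves out.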
A database of soybean co-functional networks and a companion web tool for network-based functional predictions. SoyNet maps 1 940 284 co-functional links between 40 812 soybean genes (covering 72.8% of the coding genome), which were inferred from 21 distinct types of genomics data including 734 microarrays and 290 RNA-seq samples from soybean. SoyNet provides a route to functional investigation of the soybean genome, elucidating genes and pathways of agricultural importance.
Allows various functional genomics analyses. ADEPTUS offers four different types of analysis: (1) analysis of a gene list, (2) analysis of a disease or a tissue, (3) analysis of a profile to predict cancer site from mutated genes and (4) analysis to predict phenotype from expression. The database contains more than 38,000 gene expression profiles and more than 100 diseases.
Offers numerous functional interactions and extensive poplar gene functional annotations. PoplarGene is an online database that integrates two network-assisted gene prioritization algorithms, neighborhood-based prioritization and context-based prioritization. This resource can be used to perform gene prioritization and to identify genes underlying traits in a complementary manner. Moreover, it can be utilized for other woody plant proteomes via orthology transfer using two optional orthology mapping algorithms.
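The general idea behind neighborhood-based prioritization can be sketched as follows: each candidate gene is scored by the summed weight of its network links to a set of known trait-associated (seed) genes. This is a generic illustration, not PoplarGene's actual implementation. The function name, the edge-dictionary layout and the gene labels are assumptions.

```python
def prioritize_by_neighborhood(network, seed_genes):
    """Rank non-seed genes by the total weight of their links to seed genes.

    network: dict of {(gene_a, gene_b): weight}, treated as undirected.
    seed_genes: iterable of genes already associated with the trait.
    """
    seeds = set(seed_genes)
    scores = {}
    for (gene_a, gene_b), weight in network.items():
        # Check both orientations of the undirected edge.
        for candidate, neighbor in ((gene_a, gene_b), (gene_b, gene_a)):
            if neighbor in seeds and candidate not in seeds:
                scores[candidate] = scores.get(candidate, 0.0) + weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder network; gene labels and weights are for illustration only.
network = {
    ("g1", "g2"): 1.5,
    ("g2", "g3"): 0.5,
    ("g1", "g4"): 1.0,
    ("g3", "g4"): 2.0,
}
ranking = prioritize_by_neighborhood(network, seed_genes=["g1", "g3"])
```

Here `g4` ranks above `g2` because its links to the two seed genes carry more total weight.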
A database of cancer gene networks estimated from publicly available cancer gene expression data. TCNG makes it possible to estimate genome-wide gene networks consisting of more than 20,000 genes from gene expression data using nonparametric Bayesian networks. The gene networks are estimated using the Japanese national flagship supercomputer "K computer". This is a result of the ISLiM project, which aims at developing biological software that utilizes the "K computer".
A platform for genome functional annotations and multi-dimensional network analyses in Sorghum (Sorghum bicolor [L.] Moench). SorghumFDB encompasses a broad range of information, including various annotations of whole genome assemblies, miRNA sequences and target genes, common gene families, networks constructed from transcriptome data, PPI data and miRNA-target pairs, as well as multiple gene function annotation elements. Visualization tools (Gbrowse, Cytoscape and open-flash-chart) and four analysis tools (BLAST, GSEA, motif significance analysis and pattern set) are provided to support functional prediction.
A probabilistic functional gene network for Escherichia coli, which is an intensively studied species of bacteria, due to its utility in both exploring the molecular mechanisms underlying fundamental biological processes and manufacturing useful metabolites for the biomedical industry. All integrated cofunctional associations can be downloaded, enabling orthology-based reconstruction of gene networks for other bacterial species as well.
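Orthology-based reconstruction can be illustrated with a minimal sketch: each link between two source genes is copied to every pair of their orthologs in the target species. The ortholog table, the function name `transfer_network`, the rule of keeping the highest score per target edge and the gene labels are all assumptions made for this example, not a description of how the resource itself does it.

```python
def transfer_network(source_links, orthologs):
    """Project functional links onto another species through orthologs.

    source_links: dict of {(gene_a, gene_b): score} in the source species.
    orthologs: dict mapping a source gene to a list of target-species genes.
    Source genes without orthologs contribute no target links.
    """
    transferred = {}
    for (gene_a, gene_b), score in source_links.items():
        for target_a in orthologs.get(gene_a, ()):
            for target_b in orthologs.get(gene_b, ()):
                if target_a == target_b:
                    continue  # skip self-links created by shared orthologs
                edge = tuple(sorted((target_a, target_b)))
                # Keep the best-supported score when several links map here.
                transferred[edge] = max(score, transferred.get(edge, float("-inf")))
    return transferred


# Placeholder identifiers, for illustration only.
source = {("ecoA", "ecoB"): 2.0, ("ecoB", "ecoC"): 1.0}
ortholog_table = {"ecoA": ["tA"], "ecoB": ["tB1", "tB2"]}
target_network = transfer_network(source, ortholog_table)
```

The link between `ecoB` and `ecoC` is dropped because `ecoC` has no ortholog in the hypothetical target species.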
Permits users to access the integrated database of the Genome Network Project. GNP_Y2H is a system that integrates experimental data generated from the project in association with the public databases.
A comprehensive knowledgebase for pathway analysis in mouse. Interpretation of high-throughput genomics data based on biological pathways constitutes a constant challenge, partly because of the lack of a supporting pathway database. GSKB is a functional genomics knowledgebase in mouse, which includes 33261 pathways and gene sets compiled from 40 sources such as Gene Ontology, KEGG, GeneSetDB, PANTHER, and microRNA and transcription factor target genes. In addition, 8747 lists of differentially expressed genes from 2526 published gene expression studies were manually collected and curated to enable the detection of similarity to previously reported gene expression signatures. These two types of data constitute the comprehensive Gene Set Knowledgebase (GSKB), which can be readily used by various pathway analysis software such as gene set enrichment analysis (GSEA).
Provides a set of constructed insect pathways. iPathDB is searchable by keywords for species, pathway ID and pathway name. A search returns gene sequences, annotations and a pathway map. The database contains insect pathways generated by iPathCons, a pipeline that takes official gene sets (OGSs) or transcriptomes as input. The aim of the database is to employ insect pathways to model human diseases.
Gene fusion detection in Plants
Fusion transcripts (i.e., chimeric RNAs) resulting from gene fusions are well known in humans, but in plants this phenomenon has not yet been explored. We are planning to discover fusion transcripts/gene fusions in different types of plants using RNA-Seq datasets. Further, we are planning to understand the mechanism of gene fusion formation and the significance of fusions in plants.
Whole genome and transcriptome sequencing data analysis of Plants
In this era of Next Generation Sequencing (NGS), a huge amount of sequencing data is available in the public domain. Extracting novel findings from these datasets is a major challenge for a computational biologist. We are interested in analysing whole genome and transcriptome sequencing data from different plants to extract useful information from those datasets with the help of bioinformatics tools. Currently, we are planning to study the gene clusters of secondary metabolite pathways in different plants.
Development of webservers, databases and computational pipelines for plant research
Developing databases is necessary to compile information and share it with the scientific community. We are dedicated to developing useful databases and webservers for plant research.
Another area of interest is the development of automated pipelines and tools for the analysis of high-throughput genomics data generated by NGS technologies.
Professional & Academic Background
Staff Scientist II (May 2017-present): National Institute of Plant Genome Research (NIPGR), New Delhi, India
Postdoctoral Research Associate (2015-2017): University of Virginia, Charlottesville, VA, USA
Research Scientist (2014-2015): Sir Ganga Ram Hospital, New Delhi, India
PhD Bioinformatics (2009-2014): Bioinformatics Centre, Institute of Microbial Technology (IMTECH), Chandigarh under Jawaharlal Nehru University (JNU), New Delhi, India
M.Sc. Life Sciences (2007-2009): Jawaharlal Nehru University (JNU), New Delhi, India
B.Sc. Biotechnology (2004-2007): Jamia Millia Islamia (JMI), New Delhi, India
Awards and Fellowships
Junior and Senior Research Fellowship (2009-2014): Council of Scientific and Industrial Research (CSIR), New Delhi, India
GATE (Graduate Aptitude Test in Engineering): Qualified in years 2008 and 2009
Scientific Contributions/ Recognitions
Associate editor: Journal of Translational Medicine.
Editorial board member: Theoretical Biology and Medical Modelling.
Reviewer: PLoS ONE, BMC Genomics, BMC Bioinformatics, BMC Biology, BMC Biotechnology, Frontiers in Physiology and several other journals.
Web Resources/ Databases (Developed/ Contributed)
A Platform for Designing Genome-Based Personalized Immunotherapy or Vaccine against Cancer (http://www.imtech.res.in/raghava/cancertope/)
GenomeABC: A webserver for benchmarking of genome assemblers. (http://crdd.osdd.net/raghava/genomeabc/).
Genomics web portal page. (http://crdd.osdd.net/raghava/genomesrs/).
Map/Alignment module of CancerDr: Cancer Drug Resistance Database. (http://crdd.osdd.net/raghava/cancerdr/).
Short reads and contigs alignment module of PCMDB: Pancreatic cancer methylation database. (http://crdd.osdd.net/raghava/pcmdb/).
Burkholderia sp. SJ98 database. (http://crdd.osdd.net/raghava/genomesrs/burkholderia/).
Rhodococcus imtechensis RKJ300 database. (http://crdd.osdd.net/raghava/genomesrs/rkj300/).
Genotrick: A pipeline for whole genome assembly and annotation. (http://crdd.osdd.net/raghava/genomesrs/genotrick/).
Development of Debian packages in OSDDlinux: A Customized Operating System for Drug Discovery. (http://osddlinux.osdd.net/).
A Web-Based Platform for Designing Vaccines against Existing and Emerging Strains of Mycobacterium tuberculosis. (http://crdd.osdd.net/raghava/mtbveb/).