Allows quantification and visualization of CRISPR-Cas9 outcomes, as well as evaluation of effects on coding sequences, noncoding elements and selected off-target sites. CRISPResso is a suite of computational tools that offers several features, including batch sample analysis via command line interface, integration with other pipelines, tunable parameters of sequence quality and alignment fidelity, discrete measurement of insertions, deletions, and nucleotide substitutions, and distinction between non-homologous end joining (NHEJ), homology-directed repair (HDR), and mixed mutation events.
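As an illustration of what "discrete measurement of insertions, deletions, and nucleotide substitutions" can mean in practice, the sketch below tallies the three edit types from a pre-aligned reference/read pair. It is not CRISPResso's code: the function name `count_edits`, the use of `-` as the gap character, and the per-base (rather than per-event) counting are all assumptions made for this example.

```python
def count_edits(ref_aln, read_aln):
    """Count inserted, deleted and substituted bases in a pairwise alignment.

    Both strings must already be aligned to the same length, with '-'
    marking a gap (hypothetical convention for this sketch).
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned sequences must have equal length")

    counts = {"insertion": 0, "deletion": 0, "substitution": 0}
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base == "-" and read_base != "-":
            counts["insertion"] += 1      # base present only in the read
        elif read_base == "-" and ref_base != "-":
            counts["deletion"] += 1       # base present only in the reference
        elif ref_base != read_base:
            counts["substitution"] += 1   # both present but different
    return counts


# Toy alignment: one inserted base, two deleted bases, one substitution.
example = count_edits("ACGT-ACGTAC", "ACGTTAC--AG")
```

A real pipeline would also merge adjacent gap columns into single events and restrict counting to a window around the expected cut site; both are left out here for brevity.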
Assists users in obtaining mechanistic insights into genetic dependencies. CRISPRO is a computational pipeline that was developed to elucidate functional residues and predict phenotypic outcome of genome editing. It uses CRISPR tiling screens, protein and nucleotide sequence level annotations, and 3D visualization of protein structure. It can also be used for calculation of functional scores per guide RNA by using next generation sequencing (NGS) data as input.
Identifies and reconstructs CRISPR loci from raw metagenomic data without the need for assembly or prior knowledge of CRISPR in the data set. CRISPR in assembled data are often fragmented across many contigs/scaffolds and do not fully represent the population heterogeneity of CRISPR loci. Crass identified substantially more CRISPR in metagenomes previously analysed using assembly-based approaches. The increased sensitivity, specificity and speed of Crass will facilitate comprehensive analysis of CRISPRs in metagenomic data sets, increasing our understanding of phage-host interactions and co-evolution within microbial communities.
Allows identification and visualization of CRISPR loci. CRISPRviz detects and extracts repeats and spacers and serves the data via a local web server for additional manipulation. This software contains two main components: an extraction pipeline/conversion engine and a web-based front-end. It facilitates swift implementation and can serve as an epidemiological tool by enhancing tracking of micro-evolution in diverging pathogenic strains and as a genomic tool for phylogenetic reconstruction.
Examines and evaluates sequencing reads from clustered regularly interspaced short palindromic repeats (CRISPR) experiments by measuring exact-matches and pattern-searching. CRISPRpic counts every possible mutation in a set of sequencing reads without requiring alignment. This software can be used to analyze genomic alterations generated by different enzymes covering a variety of double strand break (DSB) positions. It supports micro-homology and presents all possibilities for each deletion.
Identifies CRISPRs in large DNA strings, such as genomes and metagenomes. CRT was shown to be a significant improvement over the current technique for CRISPR identification using Patscan. CRT's approach detects repeats directly from a DNA sequence. This leads to a program that is easy to describe and understand, yet it is very fast and memory efficient.
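The idea of detecting repeats directly from a DNA sequence can be sketched with a simple k-mer index. This is only a toy illustration, not CRT's published algorithm: the function name `find_spaced_repeats` and all parameter defaults (k-mer length, minimum copy number, spacing bounds) are assumptions chosen for the example.

```python
from collections import defaultdict


def find_spaced_repeats(seq, k=8, min_copies=3, min_gap=20, max_gap=60):
    """Return {kmer: [start positions]} for k-mers recurring at CRISPR-like spacing."""
    # Index every k-mer by its start positions (positions end up sorted).
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    hits = {}
    for kmer, starts in positions.items():
        if len(starts) < min_copies:
            continue
        # Distance between the starts of consecutive copies.
        gaps = [b - a for a, b in zip(starts, starts[1:])]
        # Keep only k-mers whose copies are all separated by a spacer-sized gap.
        if all(min_gap <= gap <= max_gap for gap in gaps):
            hits[kmer] = starts
    return hits


# Toy array: four copies of an 8-bp repeat separated by three 24-bp spacers.
repeat = "GTTTTAGA"
spacers = ["ACGT" * 5 + "AAAA", "CCCCGGGGTTTTAAAACCCCGGGG", "TGCA" * 6]
toy = repeat + spacers[0] + repeat + spacers[1] + repeat + spacers[2] + repeat
hits = find_spaced_repeats(toy)
```

On the toy sequence the repeat is reported at starts 0, 32, 64 and 96, while the short tandem repeats inside the spacers are rejected because their copies are only 4 bp apart.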
Determines residue positions most suitable for proline mutations designed to stabilize proteins in a target conformation. CRISPro predicts the secondary structure for each residue position with the dictionary of secondary structure of proteins (DSSP). It is useful for creating serologic probes capable of isolating antibodies that recognize a target shape. This tool creates a list of residue positions that are not compatible with proline mutation and that destabilize the alternative conformation.
Performs batch analysis of CRISPR edits using Sanger data. ICE can quantify the identity and prevalence of edits, and its results correlate with the current gold standard of amplicon sequencing. This software can handle single-guide, multiplex-guide, base editing, and homology-directed repair experiments and is able to process the results of hundreds of CRISPR editing experiments in a reproducible manner.
Constructs optimal gRNAs for the CRISPR-Cpf1 system. CRISPR-DT can take into account target efficiency and specificity scores. It employs a support vector machine (SVM) to determine the target efficiency score for mammals. This tool assists researchers in genome editing. It permits users to set specifications according to experimental goals and to receive target candidates.
Supports the design of custom single guide RNA (sgRNA) libraries for all organisms with annotated genomes. CLD is suitable for the design of libraries using modified CRISPR enzymes and targeting non-coding regions. This software automates all tasks for the generation of sgRNA libraries. It can design libraries of variable size ranging from a few hundred genes to genome-scale for all annotated genomes available from ENSEMBL. CLD implements the following steps: (i) it downloads and reformats ENSEMBL databases, (ii) predicts and filters sgRNA target sites for a provided list of genes, and (iii) reports the results in a ‘ready-to-order’ library file containing nucleotide sequences for on-chip synthesis and subsequent cloning into target vectors.
A platform to assess the quality of a genome editing experiment with only three mouse clicks. The method evaluates next-generation sequencing data to quantify and characterize insertions, deletions and homologous recombination. CRISPR Genome Analyzer provides a report for the locus selected, which includes a quantification of the edited site and the analysis of the different alterations detected. The platform maps the reads, estimates and locates insertions and deletions, computes the allele replacement efficiency and provides a report integrating all the information.
Helps biologists to design the crRNA with improved target specificity for the CRISPR-C2c2 system. CRISPR-RT is a web service that allows a user to upload an RNA sequence, set specifications according to experimental goals, and to receive target candidates for the CRISPR System. Optimal candidates are suggested through consideration of predicted off-target effects. CRISPR-RT allows users to set up a wide range of parameters, making it highly flexible for current and future research in CRISPR-based RNA editing. CRISPR-RT covers major model organisms and can be easily extended to cover other species.
Allows prediction of the cleavage propensity of a genomic site by a given single guide RNA (sgRNA). The CRISTA method incorporates a wide range of features specific to the genomic content, features that define the thermodynamics of the sgRNA, and features concerning the pairwise similarity between the sgRNA and the genomic target. This predictive model represents general patterns of the cleavage machinery across different detection techniques.
A web tool for automated genome-wide single guide RNA (sgRNA) design. CRISPR-ERA provides different sgRNA search approaches for genome editing, such as with the Cas9 nuclease. In addition, CRISPR-ERA generates sgRNAs for gene activation or repression using a large-scale database of CRISPRi in different genomes. Nine model organisms are currently supported: two bacterial species (E. coli, B. subtilis), one yeast (S. cerevisiae), C. elegans, fruit fly, zebrafish, mouse, rat, and human.
Identifies potential target sites for CRISPR gene editing in DNA sequences. Then, CGAT uses the identified target sequences to search a genome of interest for potential off-target matches. CGAT offers access online, enables search by gene name and predicts off-targets. Furthermore, it enables ranking of the identified targets, and contains all of these functionalities within a single pipeline.
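The second step described above, searching a genome for potential off-target matches, can be illustrated with a brute-force mismatch scan. This is a minimal sketch and not CGAT's implementation: the names `hamming` and `find_off_targets`, the mismatch threshold, and the restriction to the forward strand are assumptions made for the example.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_off_targets(guide, genome, max_mismatches=3):
    """Return (position, site, mismatches) for every forward-strand window
    that differs from the guide at no more than max_mismatches positions.
    A perfect match (0 mismatches) is the on-target site itself."""
    n = len(guide)
    hits = []
    for i in range(len(genome) - n + 1):
        site = genome[i:i + n]
        mismatches = hamming(guide, site)
        if mismatches <= max_mismatches:
            hits.append((i, site, mismatches))
    return hits


# Toy genome with the on-target site and one site carrying two mismatches.
on_target = "GACCTGAAGCTTAGGCATCC"
off_target = "GACCTGTAGCTTAGGCATCG"
genome = "TT" + on_target + "GTAA" + off_target + "GTTT"
hits = find_off_targets(on_target, genome, max_mismatches=2)
```

A scan like this is linear in genome length per guide, so practical tools typically rely on an index or a short-read aligner instead, and also search the reverse complement and check for a PAM next to each candidate.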
Links each guide RNA to homologous repair cassettes that both edit loci and function as barcodes to track genotype-phenotype relationships. CREATE combines automated design of CREATE cassettes (modular guide RNA-editing oligos), array-based CREATE cassette synthesis, and sequencing in a streamlined workflow for genome engineering. CREATE was applied to site saturation mutagenesis for protein engineering, reconstruction of adaptive laboratory evolution experiments, and identification of stress tolerance and antibiotic resistance genes in bacteria.
A web application for predicting sgRNA efficiency from spacer sequences. SSC supports the applications of optimizing sgRNA libraries in CRISPR/Cas9 knockout or CRISPR/dCas9 inhibition/activation screens.
A web tool for synthetic single-guide RNA design for the CRISPR system in plants. CRISPR-P allows users to search for high-specificity Cas9 target sites within DNA sequences of interest. It also provides off-target loci prediction for specificity analyses and marks restriction enzyme cutting sites for every sgRNA, for convenience in downstream experiments.
A tool for CRISPR (clustered regularly interspaced short palindromic repeats) detection in archaeal and bacterial genomes that utilizes machine learning. CRF identifies real CRISPR arrays based on repeat sequence and structural features. A classifier is used to exclude invalid CRISPR arrays from all candidates. In this tool, tandem repeat checking is applied to filter invalid CRISPR candidates.
A bioinformatics tool for genome-wide design of sgRNAs with improved efficiency. A custom design interface was established for gRNA selection based on user-provided sequences. The availability of this program may help to improve the efficiency of CRISPR assay design, leading to significant savings in experimental resources at the subsequent screening stage.
A database of CRISPR/Cas9 target sequences that have been experimentally validated in zebrafish. CRISPRz can be searched using multiple inputs such as ZFIN IDs, accession number, UniGene ID, or gene symbols from zebrafish, human and mouse. CRISPRz was developed in an effort to provide a comprehensive list of validated CRISPR targets from published sources as well as from an ongoing genome-wide knockout project in the zebrafish genome. Data will be added as more validated CRISPR targets are published or contributed from unpublished, in-house projects. The database is also open for data submission from the research community.
Utilizes the CRISPRFinder program to identify putative CRISPRs and additional tests to further screen for the smallest CRISPRs in a polyphasic approach. The CRISPRFinder program is designed to admit the largest possible number of candidate CRISPRs, especially the shortest ones containing one or two spacers. The main idea of the program is to first find possible CRISPR locations in a genomic sequence and then check whether these regions contain a cluster that possesses the characteristics of an "obvious" CRISPR, i.e. one containing at least three repeats.
Provides a single platform to integrate the growing information being generated by genome editing approaches. CrisprGE is an online database that contains over 4680 genes edited by the CRISPR/Cas approach. It also includes more than 220 unique genes targeted in about 30 model and other organisms, along with the different modifications induced by repair mechanisms.
Investigates clustered regularly interspaced short palindromic repeats (CRISPR)-Cas (CRISPR-associated proteins) systems. CRISPRminer includes five categories of information: (i) annotation and visualization of CRISPR-Cas systems; (ii) classification of CRISPR-Cas systems; (iii) gathering and detection of self-targeting events; (iv) inference of putative microbe-phage interactions and relations; and (v) annotation of anti-CRISPR proteins experimentally identified in published papers.
Provides a manually curated database of validated single guide RNAs (sgRNAs) for long non-coding RNAs (lncRNAs). CRISPRlnc is an online resource that includes the ID, position in the genome, sequence and functional description of lncRNAs, as well as the sequence, protospacer adjacent motif (PAM), CRISPR type and validity of their corresponding sgRNAs. It also provides tools for browsing, searching and downloading all the data covered, as well as an online BLAST service and a genome browser.
Offers a collection of information and links for scientists interested in using targetable clustered regularly interspaced short palindromic repeats (CRISPR)/Cas systems for genome engineering and other applications.
Facilitates the use of the CRISPR/Cas9 system as a genome editing tool for functional studies and molecular breeding of grapes. Among other functions, the Grape-CRISPR database allows users to identify and select multi-protospacers for editing similar sequences in grape genomes simultaneously. The database contains two main sections: Search and Design. In the Search section, users can identify appropriate protospacer and protospacer-adjacent motif (PAM) sites of a gene by providing certain inquiry information such as locus location, gene ID or Pfam ID. The Design section is for protospacer design. Users can detect and design protospacers and PAMs in the sequences of interest by using the Perl scripts provided.
Offers Pooled In vitro CRISPR Knockout Library Essentiality Screens. PICKLES allows exploration of gene essentiality profiles of users’ favourite genes across a large set of CRISPR knockout and shRNA knockdown fitness screens, mostly in cancer cell lines. It can display how gene-specific essentiality varies across tissue types and, in many cases, the relationship with gene expression levels in the same cells.
Allows users to search over 100,000 genomes and 9,000 species in order to make a Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) guide for gene knockout. CRISPR Knockout Guide Designer lets users visualize recommendations for knockouts with fewer off-target effects. Users can also validate guides created with other tools or look up the locations of a sequence within the gene.
Provides access to data about anti-CRISPR proteins. Anti-CRISPRdb is an online resource that contains more than 400 anti-CRISPR proteins tested by experimental and bioinformatics methods. The database allows users to browse, search, BLAST, screen, and download data on their anti-CRISPR proteins of interest, as well as to share data on validated/potential anti-CRISPR proteins with other related scientific communities.
Allows rapid identification of sgRNA target sequences in the Chinese hamster ovary (CHO-K1) genome. The CRISPy tool identified 1,970,449 CRISPR targets distributed across 27,553 genes and lists the number of off-target sites in the genome. The proven functionality of Cas9 to edit CHO genomes, combined with the CRISPy database, has the potential to accelerate genome editing and synthetic biology efforts in CHO cells.
Aims to facilitate users' research using CRISPR technology. ‘Genome-wide gRNA databases for CRISPR genome editing and transcription activation’ provides genome-wide databases of pre-validated gRNA sequences targeting genes in the human and mouse genomes. It includes two resources: a gRNA database in which SpCas9 gRNA sequences are targeted to constitutive exons and designed for minimal off-target effects, and a SAM database that targets the first 200 bp upstream of each transcription start site.
Uses methods to compute, visualize and select optimal CRISPR sites in a genome browser environment. The WGE database currently stores single and paired CRISPR sites and pre-calculated off-target information for CRISPRs located in the mouse and human exomes. Scoring and display of off-target sites is simple and intuitive, and filters can be applied to identify high-quality CRISPR sites rapidly. WGE also provides a tool for the design and display of gene targeting vectors in the same genome browser, along with gene models, protein translation and variation tracks.
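To make "the first 200 bp upstream of each transcription start site" concrete, the sketch below computes that window in a strand-aware way. It is an illustration only: the function name `upstream_window`, the 1-based inclusive coordinate convention, and the choice to exclude the TSS base itself are assumptions, not details taken from the database.

```python
def upstream_window(tss, strand, length=200):
    """Return the 1-based inclusive (start, end) of the window upstream of a TSS.

    "Upstream" means lower coordinates on the '+' strand and higher
    coordinates on the '-' strand. The TSS base itself is excluded
    (an assumption of this sketch).
    """
    if strand == "+":
        # Clamp at position 1 so the window never runs off the chromosome start.
        return max(tss - length, 1), tss - 1
    if strand == "-":
        return tss + 1, tss + length
    raise ValueError("strand must be '+' or '-'")


plus_window = upstream_window(1000, "+")   # (800, 999)
minus_window = upstream_window(1000, "-")  # (1001, 1200)
```

The minus-strand window is not clamped at the chromosome end because that would require the chromosome length, which this toy function does not take as input.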
Provides a universal CRISPR annotation system. grID is an extensive compilation of gRNA properties including sequence and variations, thermodynamic parameters, off-target analyses, and alternative PAM sites, among others. The database is designed to keep up with the rapidly evolving CRISPR technology. Users can search in the database by NCBI reference sequence ID, Gene Symbol or any valid 23-bp targeting sequence in the form N20NGG.
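A query in the form N20NGG is a 20-nucleotide protospacer followed by a 3-nucleotide PAM whose last two bases are GG. The sketch below checks whether a string has that shape. It is not part of grID: the function name `is_n20ngg` and the decision to accept only unambiguous A/C/G/T bases (case-insensitively) are assumptions made for this example.

```python
import re

# 20 protospacer bases + any base (the N of the PAM) + GG = 23 bp in total.
N20NGG = re.compile(r"[ACGT]{21}GG")


def is_n20ngg(seq):
    """Return True if seq is a 23-bp targeting sequence of the form N20NGG."""
    return N20NGG.fullmatch(seq.upper()) is not None


valid = is_n20ngg("GACCTGAAGCTTAGGCATCC" + "TGG")      # 20-mer + NGG PAM
wrong_pam = is_n20ngg("GACCTGAAGCTTAGGCATCC" + "TGA")  # PAM does not end in GG
too_short = is_n20ngg("GACCTGAAGCTTAGGCATC" + "TGG")   # only 22 bp
```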
A database for high-throughput CRISPR/Cas9 screening experiments. GenomeCRISPR contains data on the performance of more than 550 000 single guide RNAs (sgRNAs) which were used in >80 different experiments performed in 48 different human cell lines. It provides several data mining options and tools allowing users to easily investigate and compare the results of different screens. An API can be used for automated data access.
Assists users in the construction of a sequence-optimized gRNA library. TKO is based on an expanded set of reference core essential genes (CEG2), plus empirical data from six clustered regularly interspaced short palindromic repeats (CRISPR) knockout screens. It contains four sequence-optimized guides targeting each of more than 18 000 protein-coding genes.
Gathers a large number of available resources on human super-enhancers. SEdb enables the annotation of potential cell-specific functions in gene regulation. It contains genetic and epigenetic information about super-enhancers including common single nucleotide polymorphisms (SNPs), motif changes, expression quantitative trait loci (eQTLs), risk SNPs, transcription factor binding sites (TFBSs), CRISPR/Cas9 target sites, DNase I hypersensitivity sites (DHSs) and enhancers.
Provides a platform to create guide RNAs (gRNAs). Cpf1-Database is a web application that allows users to select target genes and choose optimized gRNAs through a graphical interface with flexible filtering parameters. Users can plan edits from a repository of predetermined Cpf1 endonuclease targets adjacent to 5’-TTTN-3’ PAM sequences in all coding sequence (CDS) regions in the complete genomes of 12 organisms.
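Finding candidate targets next to a 5'-TTTN-3' PAM can be sketched as a simple sequence scan. This is not the Cpf1-Database pipeline: the function name `find_cpf1_sites`, the 23-nt protospacer length, and the restriction to the forward strand are assumptions made for this example. The protospacer is taken immediately 3' of the PAM, as is usual for Cpf1.

```python
import re


def find_cpf1_sites(seq, spacer_len=23):
    """Return (pam_start, pam, protospacer) for each TTTN PAM on the forward
    strand that is followed by a full-length protospacer."""
    seq = seq.upper()
    sites = []
    # The lookahead reports overlapping PAMs too (e.g. both TTTT and TTTA in TTTTA).
    for match in re.finditer(r"(?=(TTT[ACGT]))", seq):
        pam_start = match.start()
        protospacer = seq[pam_start + 4:pam_start + 4 + spacer_len]
        if len(protospacer) == spacer_len:  # skip PAMs too close to the 3' end
            sites.append((pam_start, match.group(1), protospacer))
    return sites


# Toy sequence: 2-nt flank, one TTTA PAM, a 23-nt protospacer, 2-nt flank.
protospacer = "ACGT" * 5 + "ACG"
toy = "GG" + "TTTA" + protospacer + "CC"
sites = find_cpf1_sites(toy)
```

Restricting the output to coding regions, as the database does, would additionally require gene annotations, and a complete search would also scan the reverse complement.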
Gene fusion detection in Plants
Fusion transcripts (i.e., chimeric RNAs) resulting from gene fusions are well known in humans, but in plants this phenomenon has not yet been explored. We plan to discover fusion transcripts and gene fusions in different types of plants using RNA-Seq datasets. Further, we plan to study the mechanism of gene fusion formation and the significance of fusions in plants.
Whole genome and transcriptome sequencing data analysis of Plants
In this era of Next Generation Sequencing (NGS), a huge amount of sequencing data is available in the public domain. Extracting novel findings from these datasets is a major challenge for a computational biologist. We are interested in analysing whole genome and transcriptome sequencing data of different plants to extract useful information from those datasets with the help of bioinformatics tools. Currently, we plan to study the gene clusters of secondary metabolite pathways in different plants.
Development of webservers, databases and computational pipelines for plant research
Developing databases is necessary to compile and share information with the scientific community. We are dedicated to developing useful databases and webservers for plant research.
Another area of interest is the development of automated pipelines and tools for the analysis of high-throughput genomics data generated by NGS technologies.
Professional & Academic Background
Staff Scientist II (May 2017-present): National Institute of Plant Genome Research (NIPGR), New Delhi, India
Postdoctoral Research Associate (2015-2017): University of Virginia, Charlottesville, VA, USA
Research Scientist (2014-2015): Sir Ganga Ram Hospital, New Delhi, India
PhD Bioinformatics (2009-2014): Bioinformatics Centre, Institute of Microbial Technology (IMTECH), Chandigarh under Jawaharlal Nehru University (JNU), New Delhi, India
M.Sc. Life Sciences (2007-2009): Jawaharlal Nehru University (JNU), New Delhi, India
B.Sc. Biotechnology (2004-2007): Jamia Millia Islamia (JMI), New Delhi, India
Awards and Fellowships
Junior and Senior Research Fellowship (2009-2014): Council of Scientific and Industrial Research (CSIR), New Delhi, India
GATE (Graduate Aptitude Test in Engineering): Qualified in years 2008 and 2009
Scientific Contributions/ Recognitions
Associate editor: Journal of Translational Medicine.
Editorial Board Member of Journal: Theoretical Biology and Medical Modelling.
Reviewer: PLOS ONE, BMC Genomics, BMC Bioinformatics, BMC Biology, BMC Biotechnology, Frontiers in Physiology and several other journals.
Web Resources/ Databases (Developed/ Contributed)
A Platform for Designing Genome-Based Personalized Immunotherapy or Vaccine against Cancer (http://www.imtech.res.in/raghava/cancertope/)
GenomeABC: A webserver for benchmarking of genome assemblers. (http://crdd.osdd.net/raghava/genomeabc/).
Genomics web portal page. (http://crdd.osdd.net/raghava/genomesrs/).
Map/Alignment module of CancerDr: Cancer Drug Resistance Database. (http://crdd.osdd.net/raghava/cancerdr/).
Short reads and contigs alignment module of PCMDB: Pancreatic cancer methylation database. (http://crdd.osdd.net/raghava/pcmdb/).
Burkholderia sp. SJ98 database. (http://crdd.osdd.net/raghava/genomesrs/burkholderia/).
Rhodococcus imtechensis RKJ300 database. (http://crdd.osdd.net/raghava/genomesrs/rkj300/).
Genotrick: A pipeline for whole genome assembly and annotation of Genomes (http://crdd.osdd.net/raghava/genomesrs/genotrick/)
Development of Debian packages in OSDDlinux: A Customized Operating System for Drug Discovery. (http://osddlinux.osdd.net/).
A Web-Based Platform for Designing Vaccines against Existing and Emerging Strains of Mycobacterium tuberculosis. (http://crdd.osdd.net/raghava/mtbveb/).
Topics (8): Genome annotation, De novo sequencing analysis, Gardnerella vaginalis, Homo sapiens, Escherichia coli, Vaginosis, Bacterial, Genital Diseases, Female, Drug-Related Side Effects and Adverse Reactions