Compares fluorescence-based sequences across traces obtained from different individuals to identify heterozygous sites for single nucleotide substitutions. PolyPhred is not a standalone application. PolyPhred's functions are integrated with the use of three other programs: Phred, Phrap, and Consed. PolyPhred identifies potential heterozygotes using the base calls and peak information provided by Phred and the sequence alignments provided by Phrap. Potential heterozygotes identified by PolyPhred are marked for rapid inspection using the Consed tool.
A pairwise sequence alignment program designed for the efficient mapping of sequencing reads onto genomic reference sequences. SSAHA supports reads from most sequencing platforms (ABI-Sanger, Roche 454, Illumina-Solexa) and a range of output formats (SAM, CIGAR, PSL, etc.). A pile-up pipeline for analysis and genotype calling is available as a separate package.
A program for viewing and editing assemblies prepared by the Phrap assembly program. To allow full-feature editing of large datasets while keeping memory requirements low, the developers added a viewer, bamScape, that reads billion-read BAM files, identifies and displays problem areas for user review, and launches the consed graphical editor on user-selected regions. In addition to longstanding consed capabilities such as assembly editing, this provides a variety of new features, including direct editing of the reference sequence, variant and error detection, display of annotation tracks, and the ability to process a group of reads simultaneously.
A software package that conscientiously discovers single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs) in sequence trace files in a fast, reliable, and user-friendly way. novoSNP allows you to easily filter, sort and check the variations found visually and keep track of your verifications.
Allows high-throughput quantification of the proportions of DNA sequences containing SNVs. In reconstruction experiments, QSVanalyzer accurately estimated the known relative proportions of SNVs. By analyzing a large panel of genomic DNA samples, we demonstrate the ability of the software to analyze not only common biallelic SNVs, but also SNVs within a locus at which gene conversion between four genomic paralogs operates, and within another that is subject to CNV.
Provides a high-throughput solution to single nucleotide polymorphism (SNP) prediction using multiple sequence alignments from re-sequencing data. This pipeline integrates a hybrid of customized scripting, existing utilities and machine learning in order to increase the speed and accuracy of SNP calls. The implementation of this pipeline results in significantly improved multiple sequence alignments and SNP identifications when compared with existing solutions. The use of machine learning in the SNP identifications extends the pipeline's application to any eukaryotic species where full genome sequence information is unavailable.
A mutation detection program designed to detect small mutations (1-50 bases) in sequence traces. AutoCSA is capable of detecting both homozygous and heterozygous base substitutions, as well as small insertions and deletions, with high sensitivity. AutoCSA is split into three main components: pre-processing of the trace file, variant detection, and a post-processing stage to remove false positives. It has been written specifically with high-throughput environments in mind, so it is easy to automate the analysis of large amounts of data with little manual intervention.
Reports sequence variants from Sanger sequence trace data in a standardized way as recommended by the Human Genome Variation Society (HGVS). GLASS is an intuitive way for novice and experienced users to discover and assess gene variations. GLASS is a bioinformatic implementation of best practices of labs with published know-how in the analysis of many clinically relevant genes.
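As a rough illustration of what HGVS-style reporting looks like for the simplest case, the sketch below formats a single-base substitution. The function name `hgvs_substitution` is hypothetical and is not part of GLASS; the sketch assumes the position is already expressed in coding DNA coordinates and does not handle indels or intronic positions.

```python
def hgvs_substitution(position, ref, alt, prefix="c"):
    """Format a single-base substitution in HGVS style, e.g. c.76A>T.

    Hypothetical helper for illustration only; `position` is assumed to be
    in the coordinate system named by `prefix` ("c" for coding DNA).
    """
    valid = set("ACGT")
    if ref not in valid or alt not in valid:
        raise ValueError("ref and alt must each be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt must differ for a substitution")
    return f"{prefix}.{position}{ref}>{alt}"


description = hgvs_substitution(76, "A", "T")
```

Here `description` holds the string `c.76A>T`, which is the form a reader would expect to see in a variant report.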
A desktop program that analyses capillary electropherograms and compares their sequences with a known reference for identification of mutations. The detected sequence variants are then made available for rapid assessment and annotation via a graphical user interface, allowing chosen variants to be exported for reporting and archiving. The program was validated using more than 16,000 diagnostic laboratory sequence traces. Using GeneScreen, a single user requires only a few minutes to identify rare mutations in hundreds of sequence traces, with comparable sensitivity to expensive commercial products.
Analyses Sanger sequencing data and provides an easy-to-use, one-step solution for genetic testing data analysis. ODS seamlessly integrates base calling, single nucleotide variation (SNV) identification, and SNV annotation into a single platform. It also allows laboratorians to manually inspect the quality of the identified SNVs in the final report. ODS can significantly reduce data analysis time and therefore allows Sanger sequencing-based genetic testing to be finished in a timely manner.
A software tool used for aligning multiple overlapping sequence reads by in-silico doubly heterozygous sequence (DHS) formation. The indel size of the in-silico formed DHS indicates the positions in the paired sequences for correct alignment. PrimeIndel is a useful tool for mutation reporting in clinical laboratories.
Interprets fluorescence-based chromatogram data and automatically detects the corresponding single nucleotide polymorphisms (SNPs). In this framework, three main heuristic procedures are employed: i) a partitioning and re-sampling (PnR) algorithm that may be used to call bases with ambiguous signal, ii) calculation of the observed signal intensity ratio and vicinity intensity ratio, and iii) conversion of the chromatogram inputs to numeric code. VarDetect supersedes existing automatic SNP detection tools through the use of rules that account for common sequence reading artifacts, combined with pre-calculated peak content base ratios.
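To make the idea of a signal intensity ratio concrete, here is a minimal sketch that compares the two strongest channels at each trace position and flags positions where the second peak is relatively strong. This is not VarDetect's actual rule set: the function names, the dictionary-based trace layout, and the 0.3 threshold are all assumptions chosen for illustration.

```python
def intensity_ratio(peaks):
    """Ratio of the second-highest to the highest channel intensity."""
    first, second = sorted(peaks.values(), reverse=True)[:2]
    return second / first if first else 0.0


def flag_candidates(trace, threshold=0.3):
    """Return indices whose secondary peak is at least `threshold` of the primary.

    `threshold` is an illustrative value, not one taken from VarDetect.
    """
    return [i for i, peaks in enumerate(trace)
            if intensity_ratio(peaks) >= threshold]


# Toy trace: one dict of per-base peak intensities per position.
trace = [
    {"A": 950, "C": 20, "G": 15, "T": 10},   # clean A
    {"A": 30, "C": 480, "G": 25, "T": 450},  # mixed C/T signal
    {"A": 12, "C": 18, "G": 900, "T": 40},   # clean G
]
candidates = flag_candidates(trace)
```

In this toy trace only the middle position is flagged, because its C and T peaks are of similar height. A real tool would also need to account for neighbouring peaks and common reading artifacts, as the description above notes.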
A computational tool to improve the detection of intra-individual SNPs and InDels in direct amplicon sequencing of a diploid. Neither a reference sequence nor an additional sample is required. Using two real datasets, the authors demonstrated that DiSNPindel substantially improves the true SNP and InDel discovery rates and substantially reduces the missed and false positive rates compared with existing detection methods. DiSNPindel provides an efficient tool for intra-individual SNP and InDel detection in diploid amplicon sequencing. It will also be useful for identifying DNA variations in expressed sequence tag (EST) re-sequencing.
Displays loci of interest and patterns of residues for any sequence data. Mutation Reporter Tool is an online tool developed to assist scientists with data analysis. The software allows users to analyze nucleotide or amino acid sequence data from any organism. It can be used for both genotyping and serotyping of hepatitis B virus (HBV) without requiring computer skills or knowledge of phylogenetics.
Compares, aligns, and assembles large sets of DNA sequences. PHRAP uses a banded version of the Smith-Waterman-Gotoh algorithm to do pairwise comparisons of the sequences. It compares sequences by searching for pairs of perfectly matching “words” or sequence regions that meet the set criteria; when a match of the designated word size is found, it tries to extend the alignment and then scores it. The software uses quality values produced by the PHRED basecaller. Cross match/Swat is included in the PHRAP package.
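The banding idea can be sketched in a few lines: a local alignment score is computed as in Smith-Waterman, but only cells within a fixed distance of the main diagonal are filled. The sketch below is a simplification and not PHRAP's implementation: it uses a linear gap penalty rather than the affine penalties of the Gotoh variant, it ignores quality values, and the scoring parameters are assumed for illustration.

```python
def banded_sw_score(a, b, band, match=1, mismatch=-2, gap=-2):
    """Best local alignment score using only cells with |i - j| <= band."""
    best = 0
    prev = {}  # scores of the previous row, keyed by column index
    for i in range(1, len(a) + 1):
        cur = {}
        lo = max(1, i - band)
        hi = min(len(b), i + band)
        for j in range(lo, hi + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                         # start a new local alignment
                prev.get(j - 1, 0) + sub,  # diagonal: match or mismatch
                prev.get(j, 0) + gap,      # gap in b
                cur.get(j - 1, 0) + gap,   # gap in a
            )
            cur[j] = score
            best = max(best, score)
        prev = cur
    return best


same = banded_sw_score("ACGTACGT", "ACGTACGT", band=2)
with_insert = banded_sw_score("ACGTACGT", "ACGTTACGT", band=1)
no_band = banded_sw_score("ACGTACGT", "ACGTTACGT", band=0)
```

With a band of 1 the alignment can step around the single inserted base, whereas a band of 0 restricts it to the main diagonal and yields a lower score. Cells outside the band are treated as zero, which for local alignment simply means no alignment passes through them. Narrow bands save time and memory at the cost of missing alignments that drift further from the diagonal.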
Discovers mutations generated by genome editing tools such as CRISPR/Cas9, ZFNs or TALENs. Indigo is a rapid single nucleotide polymorphism (SNP) and insertion-deletion (InDel) discovery method for chromatogram traces obtained from Sanger sequencing of polymerase chain reaction (PCR) products. It can separate a mutated and a wildtype allele and align both alleles against a reference sequence or wildtype chromatogram. Indigo can be run online as a web application or compiled from source.
A tool for detecting germline microsatellite instability in mismatch repair-deficient subjects. PeakHeights can quickly and simply determine a parameter called the gMSI ratio, based on alterations in electropherogram profile due to microsatellite instability. This can be used as a high-throughput screen in patients whose clinical picture suggests the possibility of biallelic mutations in one of the mismatch repair genes such as PMS2 or MSH2. PeakHeights is also a flexible and powerful general tool for determining signal intensities of selected peaks within ABI electropherograms and exporting them into spreadsheets for easy manipulation. Large batches of samples can be processed at once.
Efficiently aligns DNA sequencing reads with a reference genome. SMALT employs a hash index of short words sampled at equidistant steps along the genomic reference sequences. It reports the best gapped alignments of each read and a score for the reliability of the best mapping. The trade-off between sensitivity and speed can be customized. The tool is also useful for discovering split (chimeric) reads.
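The sampled hash index can be illustrated with a small sketch: words of length `k` are taken every `step` bases along the reference, and each read word that hits the index votes for the reference offset it implies. This is only a toy model of the seeding stage under assumed parameter names (`k`, `step`); it is not SMALT's code and it omits the gapped alignment and mapping-quality steps entirely.

```python
from collections import Counter, defaultdict


def build_index(reference, k, step):
    """Map each k-mer sampled every `step` bases to its reference positions."""
    index = defaultdict(list)
    for pos in range(0, len(reference) - k + 1, step):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_offsets(read, index, k):
    """Count seed hits per implied read start (reference pos minus read pos)."""
    votes = Counter()
    for i in range(len(read) - k + 1):
        for pos in index.get(read[i:i + k], ()):
            votes[pos - i] += 1
    return votes


reference = "ACGTTGCATGCCGTAAGCTTAGGCTAACGT"
index = build_index(reference, k=4, step=2)
read = reference[7:19]  # a 12-base read taken from reference position 7
best_offset, hits = candidate_offsets(read, index, k=4).most_common(1)[0]
```

In this example the read is placed at offset 7 with four supporting seeds. A larger `step` gives a smaller index and faster lookups but fewer seeds per read, which is one way to picture the sensitivity versus speed trade-off mentioned above.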
A DNA sequencing analysis software capable of performing variant analysis of up to 2000 Sanger sequencing files generated by Applied Biosystems Genetic Analyzers, MegaBACE as well as Beckman CEQ electrophoresis systems.
A computer program for the detection of small sub-populations of molecules carrying indels using ABI trace files. CHILD compares the sequence of the strongest base calls at each position with the sequence of the second-best calls. Alignment of the two sequences and shuffling tests are then used to test whether the sequences generated from the secondary peaks (namely the secondary sequence) are not random, i.e. represent a shifted version of the primary sequence.
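The comparison of primary and secondary calls can be sketched as follows: for each candidate shift, count how often the secondary sequence agrees with the primary sequence displaced by that shift, then use shuffled copies of the secondary sequence to estimate how often such agreement arises by chance. This is a simplified stand-in for CHILD's alignment and shuffling tests; the function names, the fixed shift range, and the number of shuffles are assumptions made for illustration.

```python
import random


def shift_matches(primary, secondary, shift):
    """Count positions where secondary[i] equals primary[i + shift]."""
    return sum(
        1
        for i, base in enumerate(secondary)
        if i + shift < len(primary) and base == primary[i + shift]
    )


def best_shift(primary, secondary, max_shift):
    """Return (shift, matches) for the best-scoring shift in 1..max_shift."""
    shift = max(range(1, max_shift + 1),
                key=lambda d: shift_matches(primary, secondary, d))
    return shift, shift_matches(primary, secondary, shift)


def shuffle_p_value(primary, secondary, max_shift, n_shuffles=200, seed=0):
    """Empirical p-value: how often a shuffled secondary scores as well."""
    rng = random.Random(seed)
    observed = best_shift(primary, secondary, max_shift)[1]
    bases = list(secondary)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        if best_shift(primary, "".join(bases), max_shift)[1] >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)


# Toy data: the secondary sequence is the primary sequence shifted by 3 bases.
primary = "ACGTTGCATGCCGTAAGCTTAGGCTAACGTTCAGGATCCATGCAAGTCTGACCTGATTGC"
secondary = primary[3:]
shift, matches = best_shift(primary, secondary, max_shift=10)
p_value = shuffle_p_value(primary, secondary, max_shift=10)
```

In the toy data the best shift recovers the 3-base displacement, and the shuffled sequences do not reach the same level of agreement. Real secondary peaks are noisy and only partly informative, so agreement would be far from perfect and the shuffling test carries more of the weight.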
Detects substitution and indel SNPs in sequencing traces. InSNP uses simple algorithms to detect the mutations and presents the sequences in compact visualizations that let you quickly decide which ones are real.
Gene fusion detection in Plants
Fusion transcripts (i.e., chimeric RNAs) resulting from gene fusions are well known in humans, but in plants this phenomenon has not yet been explored. We plan to discover fusion transcripts and gene fusions in different types of plants using RNA-Seq datasets. Further, we plan to investigate the mechanism of gene fusion formation and the significance of fusions in plants.
Whole genome and transcriptome sequencing data analysis of Plants
In this era of Next Generation Sequencing (NGS), a huge amount of sequencing data is available in the public domain. Extracting novel findings from these datasets is a major challenge for a computational biologist. We are interested in analysing whole genome and transcriptome sequencing data of different plants to extract useful information from those datasets with the help of bioinformatics tools. Currently, we are planning to study the gene clusters of secondary metabolite pathways in different plants.
Development of webservers, databases and computational pipelines for plant research
Developing databases is necessary to compile information and share it with the scientific community. We are dedicated to developing useful databases and webservers for plant research.
Another area of interest is the development of automated pipelines and tools for the analysis of high-throughput genomics data generated by NGS technologies.
Professional & Academic Background
Staff Scientist II (May 2017-present): National Institute of Plant Genome Research (NIPGR), New Delhi, India
Postdoctoral Research Associate (2015-2017): University of Virginia, Charlottesville, VA, USA
Research Scientist (2014-2015): Sir Ganga Ram Hospital, New Delhi, India
PhD Bioinformatics (2009-2014): Bioinformatics Centre, Institute of Microbial Technology (IMTECH), Chandigarh under Jawaharlal Nehru University (JNU), New Delhi, India
M.Sc. Life Sciences (2007-2009): Jawaharlal Nehru University (JNU), New Delhi, India
B.Sc. Biotechnology (2004-2007): Jamia Millia Islamia (JMI), New Delhi, India
Awards and Fellowships
Junior and Senior Research Fellowship (2009-2014): Council of Scientific and Industrial Research (CSIR), New Delhi, India
GATE (Graduate Aptitude Test in Engineering): Qualified in years 2008 and 2009
Scientific Contributions/ Recognitions
Associate editor: Journal of Translational Medicine.
Editorial Board Member of Journal: Theoretical Biology and Medical Modelling.
Reviewer: PLoS One, BMC Genomics, BMC Bioinformatics, BMC Biology, BMC Biotechnology, Frontiers in Physiology and several other journals.
Web Resources/ Databases (Developed/ Contributed)
A Platform for Designing Genome-Based Personalized Immunotherapy or Vaccine against Cancer (http://www.imtech.res.in/raghava/cancertope/)
GenomeABC: A webserver for benchmarking of genome assemblers. (http://crdd.osdd.net/raghava/genomeabc/).
Genomics web portal page. (http://crdd.osdd.net/raghava/genomesrs/).
Map/Alignment module of CancerDr: Cancer Drug Resistance Database. (http://crdd.osdd.net/raghava/cancerdr/).
Short reads and contigs alignment module of PCMDB: Pancreatic cancer methylation database. (http://crdd.osdd.net/raghava/pcmdb/).
Burkholderia sp. SJ98 database. (http://crdd.osdd.net/raghava/genomesrs/burkholderia/).
Rhodococcus imtechensis RKJ300 database. (http://crdd.osdd.net/raghava/genomesrs/rkj300/).
Genotrick: A pipeline for whole genome assembly and annotation of Genomes (http://crdd.osdd.net/raghava/genomesrs/genotrick/)
Development of Debian packages in OSDDlinux: A Customized Operating System for Drug Discovery. (http://osddlinux.osdd.net/).
A Web-Based Platform for Designing Vaccines against Existing and Emerging Strains of Mycobacterium tuberculosis. (http://crdd.osdd.net/raghava/mtbveb/).