A Perl/C++ package that provides genome-wide detection of structural variants from next-generation paired-end sequencing reads. BreakDancer sensitively and accurately detects indels ranging from 10 base pairs to 1 megabase pair that are difficult to detect via a single conventional approach.
Finds genomic rearrangements, including translocations, inversions and deletions. FACTERA achieves high specificity without compromising sensitivity. It can identify fusion genes and their breakpoints in targeted sequencing data. The tool works with paired-end and soft-clipped reads and is also useful for whole-genome shotgun sequencing studies. It aligns all soft-clipped and unmapped reads against each candidate fusion sequence.
An approach that uses a 'kmer' strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SVs, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants.
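As a rough illustration of the k-mer idea only (not BreaKmer's actual implementation), the sketch below lists the k-mers of a read that never occur in the reference sequence. Such read-only k-mers mark sequence that may span a breakpoint and could seed an assembly step. The function names, the value of `k` and the toy sequences are assumptions made for this example.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def novel_kmers(read, reference, k=4):
    """Return, in read order, the k-mers of the read that are absent from the reference."""
    ref_kmers = kmers(reference, k)
    return [read[i:i + k]
            for i in range(len(read) - k + 1)
            if read[i:i + k] not in ref_kmers]


# Toy data (hypothetical): the read lacks the "CCC" present in the reference,
# so the k-mers that cross the deletion junction are not found in the reference.
reference = "AAACCCGGGTTT"
read_with_deletion = "AAAGGGTTT"
print(novel_kmers(read_with_deletion, reference, k=4))
```

A read that matches the reference exactly yields an empty list, so only reads that cross a candidate junction contribute k-mers.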
A method that identifies SVs and their precise breakpoints from whole-genome resequencing data. PRISM uses a split-alignment approach informed by the mapping of paired-end reads, hence enabling breakpoint identification of multiple SV types, including arbitrary-sized inversions, deletions and tandem duplications.
Conducts multiple splits at arbitrary locations in a read. Gustaf can deal with single-end and paired-end reads. It discovers local alignments of a read, and then chains local alignments into a semi-global read-to-reference alignment. This tool recognizes dispersed duplications and intra-chromosomal translocations with exact breakpoints. It utilizes standard graph algorithms to assess relationships of the alignments.
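The chaining step can be pictured with a small sketch. Gustaf itself is described as using standard graph algorithms; the code below swaps in a simple dynamic program over local alignments sorted by read position, which is an assumption of this example and not Gustaf's implementation. The `LocalAln` record, the `junction_penalty` value and the toy coordinates are likewise hypothetical.

```python
from typing import List, NamedTuple, Tuple


class LocalAln(NamedTuple):
    read_start: int  # 0-based, inclusive position in the read
    read_end: int    # exclusive position in the read
    chrom: str
    ref_start: int
    ref_end: int


def best_chain(alns: List[LocalAln], junction_penalty: int = 5) -> List[LocalAln]:
    """Choose local alignments that do not overlap in the read, maximizing
    total aligned read bases minus a fixed penalty per junction."""
    if not alns:
        return []
    alns = sorted(alns, key=lambda a: (a.read_end, a.read_start))
    score = [a.read_end - a.read_start for a in alns]
    prev = [-1] * len(alns)
    for i, a in enumerate(alns):
        for j in range(i):
            if alns[j].read_end <= a.read_start:  # j ends before i starts
                cand = score[j] + (a.read_end - a.read_start) - junction_penalty
                if cand > score[i]:
                    score[i], prev[i] = cand, j
    # Trace back from the best-scoring end point.
    i = max(range(len(alns)), key=score.__getitem__)
    chain = []
    while i != -1:
        chain.append(alns[i])
        i = prev[i]
    return chain[::-1]


def breakpoints(chain: List[LocalAln]) -> List[Tuple[str, int, str, int]]:
    """Each pair of consecutive alignments in the chain implies one junction."""
    return [(a.chrom, a.ref_end, b.chrom, b.ref_start)
            for a, b in zip(chain, chain[1:])]


alns = [
    LocalAln(0, 50, "chr1", 1000, 1050),
    LocalAln(40, 60, "chr2", 10, 30),      # short, overlapping hit; not chained
    LocalAln(50, 100, "chr1", 5000, 5050),
]
print(breakpoints(best_chain(alns)))
```

In this toy case the two halves of the read map about 4 kb apart on the same chromosome, which is the signature a caller would go on to interpret, for example as a deletion.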
Detects structural variations (SVs) in mate pair (MP) datasets. Ulysses is a paired-end method (PEM)-based software including an SV scoring module, which improves SV detection accuracy in MP libraries. This method can annotate the full spectrum of SVs, including deletions (DEL), segmental duplications (DUP), inversions (INV), small insertions (sINS, smaller than the library insert size, IS), large insertions (INS), reciprocal translocations (RTs) and non-reciprocal translocations (NRTs).
A probabilistic method for somatic structural variation (SV) prediction by jointly modeling discordant and concordant read counts. PSSV is specifically designed to predict somatic deletions, inversions, insertions and translocations by considering their different formation mechanisms. Simulation studies demonstrate that PSSV outperforms existing tools. PSSV has been successfully applied to breast cancer data to identify somatic SVs of key factors associated with breast cancer development.
Integrates calls from one or more breakpoint detection methods and classifies the structural variant (SV). CLOVE builds a graph data structure from the provided breakpoint information and then looks for patterns that are characteristic of more complex rearrangement types, which allows it to classify complex events. The tool is flexible enough to fit into any SV calling pipeline, and it can process the combined output of multiple calling methods.
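To make the graph idea concrete, here is a minimal sketch, assuming breakpoint calls arrive as junctions with a strand-orientation label. It is not CLOVE's data model or its pattern set. The one pattern searched for, a `"++"` junction and a `"--"` junction joining nearly the same two positions, is the commonly described signature of a balanced inversion. The tuple layout, the orientation labels and the `tol` value are assumptions of this example.

```python
from collections import defaultdict


def build_graph(junctions):
    """Adjacency map: node (chrom, pos) -> list of (neighbour node, orientation)."""
    graph = defaultdict(list)
    for chrom1, pos1, chrom2, pos2, orient in junctions:
        a, b = (chrom1, pos1), (chrom2, pos2)
        graph[a].append((b, orient))
        graph[b].append((a, orient))
    return graph


def near(n1, n2, tol):
    """True if two nodes lie on the same chromosome within tol bases."""
    return n1[0] == n2[0] and abs(n1[1] - n2[1]) <= tol


def find_inversions(graph, tol=50):
    """Report (chrom, start, end) wherever a '++' edge and a '--' edge
    connect approximately the same pair of positions."""
    found = []
    for a, edges in graph.items():
        for b, orient in edges:
            # Visit each intra-chromosomal '++' edge once, from its left end.
            if orient != "++" or a[0] != b[0] or a[1] >= b[1]:
                continue
            for c, c_edges in graph.items():
                if not near(a, c, tol):
                    continue
                for d, orient2 in c_edges:
                    if orient2 == "--" and near(b, d, tol):
                        found.append((a[0], a[1], b[1]))
    return found


# Hypothetical merged calls from two callers.
junctions = [
    ("chr1", 1000, "chr1", 2000, "++"),
    ("chr1", 1005, "chr1", 2004, "--"),
    ("chr1", 5000, "chr2", 300, "+-"),
]
print(find_inversions(build_graph(junctions)))
```

The unpaired inter-chromosomal junction is left unclassified here; a fuller classifier would test further patterns on the same graph.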
Provides a structural variation (SV) caller for long reads. Sniffles is mainly designed for PacBio reads, but also works on Oxford Nanopore reads. SVs are larger events on the genome (e.g. deletions, duplications, insertions, inversions and translocations). Sniffles can detect all of these types and more, such as nested SVs (e.g. an inversion flanked by deletions or an inverted duplication). Furthermore, Sniffles incorporates multiple auto-tuning functions that set dataset-dependent parameters to reduce the overall risk of falsely inferring SVs.
Identifies structural variants from de novo assemblies. PAVFinder is able to detect genomic structural variants such as translocations, inversions, duplications, insertions, deletions and simple-repeat expansions/contractions. It can also be applied to transcriptomic structural variants and transcriptomic splice variants to find events such as gene fusions, partial tandem duplications (PTD), skipped exons or retained introns, among others.
Standardizes the Structural Variation (SV) detection pipeline. SV-AUTOPILOT is a pipeline that can be used on existing computing infrastructure in the form of a Virtual Machine (VM) Image. It provides a “meta-tool” platform for using multiple SV-tools, to standardize benchmarking of tools, and to provide an easy, out-of-the-box SV detection program. In addition, the user can choose which of several alignment algorithms is used in their analysis.
Quantifies evidence for structural variation in genomic regions suspected of harboring rearrangements. SV-STAT extends existing methods by adjusting a chimeric read’s support of a structural variation by (i) the number of its soft-clipped bases and (ii) the quality of its alignment to the junction. SV-STAT is more accurate than alternative methods for determining base-pair resolved breakpoints. SV-STAT is a significant advance towards accurate detection and genotyping of genomic rearrangements from DNA sequencing data.
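The weighting idea can be sketched as follows. The formula is purely hypothetical: the description above only says that a read's support is adjusted by its number of soft-clipped bases and by the quality of its alignment to the junction, so both the direction and the form of each adjustment below are assumptions of this example, as are the function names and the use of alignment identity as the quality measure.

```python
def read_support(read_length, soft_clipped, junction_identity):
    """Hypothetical per-read weight in [0, 1].

    read_length       -- total bases in the read
    soft_clipped      -- bases soft-clipped in the original alignment
    junction_identity -- fraction of matching bases when the read is
                         realigned to the candidate junction (0.0 to 1.0)
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    clipped = min(max(soft_clipped, 0), read_length)
    # Assumed form: down-weight by the clipped fraction, scale by identity.
    return (1.0 - clipped / read_length) * junction_identity


def junction_score(reads):
    """Sum the weighted support of all chimeric reads at one junction."""
    return sum(read_support(*r) for r in reads)


# Hypothetical reads: (read_length, soft_clipped, junction_identity)
reads = [(100, 0, 1.0), (100, 50, 1.0), (100, 20, 0.5)]
print(junction_score(reads))
```

The point of the sketch is only that reads contribute unequally, in contrast to simply counting supporting reads.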
With a PhD in Neurosciences, I worked for 8 years on the brain and its diseases. I then specialized in bioinformatics (NGS, epigenetics) and worked at CEA and GENETHON before joining OMICX to help the OMICtools community.
Gene fusion detection in Plants
Fusion transcripts (i.e., chimeric RNAs) resulting from gene fusions are well known in humans, but in plants this phenomenon has not yet been explored. We plan to discover fusion transcripts/gene fusions in different types of plants using RNA-Seq datasets. Further, we plan to investigate the mechanism of gene fusion formation and the significance of fusions in plants.
Whole genome and transcriptome sequencing data analysis of Plants
In this era of Next Generation Sequencing (NGS), a huge amount of sequencing data is available in the public domain. Extracting novel findings from these datasets is a major challenge for a computational biologist. We are interested in analysing whole genome and transcriptome sequencing data of different plants to extract useful information from those datasets with the help of bioinformatics tools. Currently, we plan to study the gene clusters of secondary metabolite pathways in different plants.
Development of webservers, databases and computational pipelines for plant research
Databases are necessary to compile information and share it with the scientific community. We are dedicated to developing useful databases and webservers for plant research.
Another area of interest is the development of automated pipelines and tools for the analysis of high-throughput genomics data generated by NGS technologies.
Professional & Academic Background
Staff Scientist II (May 2017-present): National Institute of Plant Genome Research (NIPGR), New Delhi, India
Postdoctoral Research Associate (2015-2017): University of Virginia, Charlottesville, VA, USA
Research Scientist (2014-2015): Sir Ganga Ram Hospital, New Delhi, India
PhD Bioinformatics (2009-2014): Bioinformatics Centre, Institute of Microbial Technology (IMTECH), Chandigarh under Jawaharlal Nehru University (JNU), New Delhi, India
M.Sc. Life Sciences (2007-2009): Jawaharlal Nehru University (JNU), New Delhi, India
B.Sc. Biotechnology (2004-2007): Jamia Millia Islamia (JMI), New Delhi, India
Awards and Fellowships
Junior and Senior Research Fellowship (2009-2014): Council of Scientific and Industrial Research (CSIR), New Delhi, India
GATE (Graduate Aptitude Test in Engineering): Qualified in years 2008 and 2009
Scientific Contributions/ Recognitions
Associate editor: Journal of Translational Medicine.
Editorial Board Member of Journal: Theoretical Biology and Medical Modelling.
Reviewer: PLOS ONE, BMC Genomics, BMC Bioinformatics, BMC Biology, BMC Biotechnology, Frontiers in Physiology and several other journals.
Web Resources/ Databases (Developed/ Contributed)
A Platform for Designing Genome-Based Personalized Immunotherapy or Vaccine against Cancer (http://www.imtech.res.in/raghava/cancertope/)
GenomeABC: A webserver for benchmarking of genome assemblers. (http://crdd.osdd.net/raghava/genomeabc/).
Genomics web portal page. (http://crdd.osdd.net/raghava/genomesrs/).
Map/Alignment module of CancerDr: Cancer Drug Resistance Database. (http://crdd.osdd.net/raghava/cancerdr/).
Short reads and contigs alignment module of PCMDB: Pancreatic cancer methylation database. (http://crdd.osdd.net/raghava/pcmdb/).
Burkholderia sp. SJ98 database. (http://crdd.osdd.net/raghava/genomesrs/burkholderia/).
Rhodococcus imtechensis RKJ300 database. (http://crdd.osdd.net/raghava/genomesrs/rkj300/).
Genotrick: A pipeline for whole genome assembly and annotation of Genomes (http://crdd.osdd.net/raghava/genomesrs/genotrick/)
Development of Debian packages in OSDDlinux: A Customized Operating System for Drug Discovery. (http://osddlinux.osdd.net/).
A Web-Based Platform for Designing Vaccines against Existing and Emerging Strains of Mycobacterium tuberculosis. (http://crdd.osdd.net/raghava/mtbveb/).