Crossbow specifications


Unique identifier: OMICS_00284
Name: Crossbow
Software type: Framework/Library, Pipeline/Workflow
Interface: Command line interface
Restrictions to use: None
Operating system: Unix/Linux
Parallelization: MapReduce
License: GNU General Public License version 3.0
Computer skills: Advanced
Version: 1.2.1
Stability: Stable
Requirements: Bowtie, SOAPsnp
High performance computing: Yes
Maintained: Yes



  • Ben Langmead

Additional information

Publication for Crossbow

Crossbow in pipelines

PMCID: 4955563
PMID: 27195526
DOI: 10.1155/2016/3617572

[…] The parallel processing of many small pieces of data greatly reduces computation time. Examples of open-source software developed on the Hadoop platform for processing genomic data are Crossbow, GATK, and Hadoop-BAM. Challenges to the use of cloud computing for genomic data include the long data transfer times for uploading NGS data files to the cloud, the perceived lack […]

PMCID: 3655485
PMID: 23710461
DOI: 10.1155/2013/791051

[…] the readers to the documentation available in the AWS website. Currently, there are some cloud-based programs for the analysis of next-generation sequencing data. These include, among others, Crossbow, RSD-Cloud, Myrna, and CloudBurst. In addition, there are some libraries and packages that support the creation and management of computer clusters in the cloud. To the best […]

[…] for NGS data processing. Furthermore, various bioinformatics programs are already based on the MapReduce framework and are demonstrated to work using the EMR product. Examples of these tools include Crossbow, RSD-Cloud, Myrna, and CloudBurst. The client program of elastream can create an EMR cluster using the APIs of AWS from the user's local machine. The creation steps are similar […]

PMCID: 3694645
PMID: 23813006
DOI: 10.1093/bioinformatics/btt215

[…] solution is to align the newly sequenced reads to a single reference genome and then query the genomic variation databases to analyze the mismatches. This approach is used in programs such as Crossbow, VarScan, and others. However, if the reads are misaligned during the first step (e.g. reads spanning a mutation), incorrect mismatch locations will be propagated to the second […]

PMCID: 2904649
PMID: 20622843
DOI: 10.1038/nbt0710-691

[…] frameworks like Hadoop. This takes expertise and time. A mitigating factor is that Hadoop's “streaming mode” allows existing non-parallel tools to be used as computational steps. For instance, Crossbow uses the non-cloud programs Bowtie and SOAPsnp, albeit with some small changes to format intermediate data for the Hadoop framework. New parallel programming frameworks, such as DryadLINQ […]
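The streaming idea in the excerpt above can be illustrated with a minimal Python sketch. It is not Crossbow's code: the input format (read ID, chromosome, position), the function names, and the per-chromosome count are all hypothetical. It only shows the general contract that streaming-style steps follow, where a mapper emits tab-separated key/value lines, the framework sorts them by key, and a reducer consumes each run of identical keys.

```python
from itertools import groupby


def mapper(lines):
    """Map step: turn each 'read_id chrom pos' line into 'chrom<TAB>pos:read_id'."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        read_id, chrom, pos = line.split()
        yield f"{chrom}\t{pos}:{read_id}"


def reducer(sorted_records):
    """Reduce step: records arrive sorted by key; count the records per chromosome."""
    pairs = (record.split("\t", 1) for record in sorted_records)
    for chrom, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{chrom}\t{sum(1 for _ in group)}"


def run_streaming(text):
    """Simulate map -> sort by key -> reduce in a single process."""
    mapped = list(mapper(text.splitlines()))
    shuffled = sorted(mapped)  # stands in for the framework's sort/shuffle phase
    return list(reducer(shuffled))


if __name__ == "__main__":
    # Hypothetical alignment records: read ID, chromosome, position.
    sample = "r1 chr1 100\nr2 chr2 50\nr3 chr1 7\n"
    for line in run_streaming(sample):
        print(line)
```

In a real streaming job the mapper and reducer would be separate executables reading standard input and writing standard output; they are plain functions here so the whole flow can be run locally.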

Crossbow in publications

PMCID: 4709609
PMID: 26839887
DOI: 10.1155/2015/807407

[…] environments. A simple MapReduce model deployed on the VM-based computing environment is shown in […]. Next-generation sequencing (NGS) tools like CloudAligner, Cloud Burst, SeqMapReduce, and Crossbow adopt the Hadoop framework. The major drawback of these alignment tools is that they are short read aligners. The short read aligners prove to be efficient when single-gap or ungapped […]

PMCID: 4172764
PMID: 25247298
DOI: 10.1371/journal.pone.0108490

[…] Web Services EMR on Amazon EC2 instances and GCE. Paired-end sequence reads of publicly available genomic datasets (Escherichia coli CC102 strain and a Han Chinese male genome) were analysed using Crossbow, a genetic annotation tool, on Hadoop-based platforms with equivalent system specifications. A standard analytical pipeline was run simultaneously on both platforms multiple times. […]

PMCID: 3832998
PMID: 24288665
DOI: 10.1155/2013/185679

[…] a suite of tools that allow scientists to orchestrate a sequence of data analysis tasks using remote computing resources and data storage facilities on demand from local devices. Furthermore, the Crossbow genotyping program applies the MapReduce workflow on Hadoop to launch many copies of the short-read aligner Bowtie in parallel. Once the aligned reads are generated, Hadoop […]
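As a rough illustration of running many aligner copies in parallel, the following Python sketch splits a list of reads into chunks and hands each chunk to a separate worker. Everything in it is an assumption made for illustration: the toy reference string, the exact-match `align_chunk` function (a stand-in, not Bowtie), and the use of a local thread pool instead of a Hadoop cluster.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy reference sequence; a real run would use a genome index.
REFERENCE = "ACGTACGTTAGC"


def align_chunk(reads):
    """Stand-in for one aligner instance: exact-match each read against REFERENCE.

    Returns (read, position) pairs; position is -1 when the read does not match.
    """
    return [(read, REFERENCE.find(read)) for read in reads]


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def align_in_parallel(reads, chunk_size=2, workers=4):
    """Align chunks concurrently, then merge the hits sorted by position."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(align_chunk, chunked(reads, chunk_size))
        hits = [hit for chunk in results for hit in chunk]
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    for read, pos in align_in_parallel(["TTAG", "ACGT", "GGGG", "GTAC"]):
        print(read, pos)
```

Sorting the merged hits by position mimics, in a very simplified way, the step where aligned reads are ordered before downstream processing.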

[…] for the scientists with limited computational power. Cloud-based software tools have been developed by the academic community for the analysis of biological sequences. These include, among others, Crossbow, RSD-Cloud, Myrna, and CloudBurst. The life science industry has moved in the same direction and started to support cloud computing as well. Interestingly, recent NGS instruments […]

PMCID: 3647537
PMID: 23671843
DOI: 10.1155/2013/614923

[…] and the distributed file system are robust against failure. Several sequence analysis tools have been redeveloped as cloud tools based on the Hadoop architecture, such as CloudBLAST and Crossbow. Therefore, standard online tools can be ported to the cloud architecture. Such importing of preexisting tools constitutes the main goal of bioinformatics as a service (BaaS). […]

Crossbow institution(s)
Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA; Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, USA; The iSchool, College of Information Studies, University of Maryland, College Park, MD, USA
Crossbow funding source(s)
Supported in part by NSF grant IIS-0844494, NIH grants R01-LM006845 and R01-HG004885.

