Computational protocol: The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications

Protocol publication

[…] The BioHackathon followed a use-case-driven model. First, genome biologists with developmental, evolutionary, genetic and medical interests explained their data retrieval, integration and analysis requirements. From these, four use cases were developed, spanning three general domains of genomics data. To address the use cases outlined in the Table, developers of the end-user client tools ANNOTATOR, Galaxy, BioMart, TogoDB, jORCA and Taverna presented the features of their projects at the BioHackathon, explained how they might be used to solve the use cases, and then worked collaboratively toward a resolution for each. The following sections summarize successes and failures for each use case in the context of the tool or framework being applied, and evaluate each tool in comparison to alternatives. In general, the participants achieved operational results within the time span of the BioHackathon, which demonstrates some of the strengths of Web services: they are accessible from any of the participants' computers and, in principle, programmable in any language. [...]
As the number of sequences in this case is relatively large, sequences should first be annotated using a high-throughput system such as Blast2GO or KAAS. Blast2GO is a tool that annotates many sequences at once with Gene Ontology (GO) terms based on BLAST sequence similarity. The KEGG Automatic Annotation Server (KAAS) is a service for functional annotation of sequences by assigning them to KEGG pathways. After using these tools, ANNOTATOR can be used for deeper analysis of the remaining difficult-to-annotate sequences, as it provides highly detailed analysis. Specifically, ANNOTATOR predicts protein function from physicochemical characteristics, such as the secondary structure of a protein or predicted transmembrane regions, which can give insight into the molecular characteristics and cellular functions of the protein.
To accomplish this, ANNOTATOR integrates various bioinformatics algorithms and Web services, some of which take a very long time to run. This is why the prototype used ANNOTATOR only for those proteins that are difficult to annotate by sequence similarity. BioMart can subsequently be used to join annotated sequences stored in a local BioMart server with related annotation in the remote Ensembl database. Finally, finished annotations can be published as a simple database using TogoDB, and the result can be integrated into workflow managers like jORCA or Taverna through TogoWS Web services. The Figure shows the steps of this workflow. […]
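
The staged design described above (fast, high-throughput annotation first, slow detailed analysis only for the leftovers, then a join against a remote resource) can be illustrated with a minimal Python sketch. It is not the BioHackathon prototype and it calls none of the real tools: `triage`, `join_annotations`, `toy_similarity`, `toy_pathway` and all sequence identifiers and labels are hypothetical stand-ins chosen for illustration, and the join is a plain in-memory dictionary join rather than a BioMart query.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A fast annotator maps a sequence to a label, or None when it finds nothing.
# In the workflow above this role is played by tools such as Blast2GO or KAAS;
# here it is only a hypothetical callable.
FastAnnotator = Callable[[str], Optional[str]]


def triage(
    sequences: Dict[str, str],
    fast_annotators: List[FastAnnotator],
) -> Tuple[Dict[str, str], List[str]]:
    """Annotate what the fast tools can handle; collect the rest for slow analysis."""
    annotated: Dict[str, str] = {}
    hard_cases: List[str] = []
    for seq_id, seq in sequences.items():
        for annotate in fast_annotators:
            label = annotate(seq)
            if label is not None:
                annotated[seq_id] = label
                break
        else:
            # No fast annotator produced a label: defer to the detailed,
            # long-running analysis step.
            hard_cases.append(seq_id)
    return annotated, hard_cases


def join_annotations(
    local: Dict[str, str],
    remote: Dict[str, str],
) -> Dict[str, Dict[str, str]]:
    """Inner join of local and remote annotation on a shared identifier."""
    return {
        seq_id: {"local": local[seq_id], "remote": remote[seq_id]}
        for seq_id in local
        if seq_id in remote
    }


# Toy stand-ins (assumptions, not real annotation logic).
def toy_similarity(seq: str) -> Optional[str]:
    return "similarity-hit" if seq.startswith("MK") else None


def toy_pathway(seq: str) -> Optional[str]:
    return "pathway-hit" if "GG" in seq else None


sequences = {"seq1": "MKTAYIA", "seq2": "ACGGTT", "seq3": "LLLLVVV"}
annotated, hard_cases = triage(sequences, [toy_similarity, toy_pathway])

# Hypothetical remote records keyed by the same identifiers.
remote = {"seq1": "remote-record-1", "seq3": "remote-record-3"}
joined = join_annotations(annotated, remote)

print(annotated)
print(hard_cases)
print(joined)
```

Only the identifiers left in `hard_cases` would be passed on to the expensive step, which is the reason the text gives for restricting ANNOTATOR to proteins that resist similarity-based annotation.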

Pipeline specifications

Software tools: BioMart, jORCA, Blast2GO, KAAS, TogoWS
Applications: Miscellaneous, Genome annotation
Organisms: Homo sapiens, Drosophila melanogaster