
Table 1 Summary of technical problems and solutions for each use case

From: The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications

Use Case 1

   Annotation of 100,000 invertebrate ESTs

   Task

A researcher needs to annotate 100,000 sequences obtained from an invertebrate species and to provide the result as a public database.

   Strategy

Annotate sequences by similarity, and use integrated analysis tools to complement the annotations for sequences showing no similarity. Then store the results in BioMart or TogoDB to make the database publicly available.

   Problem

Needed to identify which tool was most suitable for each step. Some tools turned out to require a very long time to execute. The resulting annotations needed to be archived in a database and made accessible on the Web.

   Solution

First, use relatively fast tools such as Blast2GO and KAAS, then use ANNOTATOR for a limited number of sequences.

BioMart is suitable for integrating remote BioMart resources such as Ensembl, while TogoDB can be used to host databases without installation.

Both database systems are accessible through a Web service interface for workflow tools such as jORCA and Taverna.

   Tools

   Blast2GO, KAAS, ANNOTATOR, BioMart, TogoDB, TogoWS, jORCA, Taverna

   Databases

   Ensembl, BioMart, KEGG

Use Case 2

   TFBS enrichment within differential microarray gene expression data

   Task

   Identify SNPs in transcription factor binding sites and visualize the result in a genome browser.

   Strategy

   Retrieve SNP and TSS datasets through the DAS protocol, then compute enrichment and export results for a DAS viewer.

   Problem

   Needed to integrate information from multiple databases and needed to customize the visualization.

   Solution

Developed a custom-made prediction system for the data obtained from DAS sources, then customized the Ajax DAS viewer to show the result in a genomic view.

   Tools

   BioDAS, Ajax DAS viewer

   Databases

   FESD II, DBTSS
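
The Strategy for Use Case 2 includes a "compute enrichment" step, but the table does not say which statistic was used. As a hedged illustration only, one common choice is the upper tail of the hypergeometric distribution: the probability of seeing at least the observed number of SNP-containing binding sites in a selected gene set by chance. The function name and its parameters below are assumptions for this sketch, not part of the original system.

```python
from math import comb

def hypergeom_enrichment_p(population, successes, draws, observed):
    """Return P(X >= observed) for a hypergeometric variable X.

    population: total number of sites considered (assumed background set)
    successes:  sites in the background that overlap a SNP
    draws:      sites belonging to the selected (e.g. differentially expressed) genes
    observed:   selected sites that overlap a SNP
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    # Sum the probability mass from the observed count up to the maximum possible.
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total

# Toy numbers, invented for illustration: 10 sites, 3 with SNPs, 2 selected, 1 hit.
p_value = hypergeom_enrichment_p(10, 3, 2, 1)
```

A small p-value would suggest that SNPs are over-represented in the selected sites; the per-site results could then be exported as annotations for a DAS viewer.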

Use Case 3

   Protein interactions among enzymes in a KEGG metabolic pathway

   Task

   Predict interacting pairs of proteins in a given metabolic pathway.

   Strategy

Retrieve enzymes from a specified pathway and search for pairs of homologous proteins forming complexes in a structure database.

   Problem

Found a version incompatibility between the server and client implementations of the SOAP protocol. The PDBj Web service returned BLAST output in a non-standard format. No Web services existed to calculate a phylogenetic profile.

   Solution

Switched programming languages according to the service in use. Wrote programs to parse BLAST results and to generate a phylogenetic profile.

   Tools

   Java, OCaml, Perl, Ruby, BLAST, DDBJ WABI, PDBj Mine, KEGG API

   Databases

   DDBJ, KEGG, PDBj, UniProt
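
The Solution for Use Case 3 relies on custom programs that parse BLAST results and build a phylogenetic profile, that is, a presence/absence vector of homologs across a fixed list of organisms. A minimal sketch of that idea follows. It assumes standard 12-column tab-separated BLAST output rather than the non-standard PDBj format noted in the Problem row, and a hypothetical lookup table from subject ID to organism. The function names, the E-value cutoff, and the example data are all illustrative and not taken from the original programs.

```python
from collections import defaultdict

def parse_blast_tabular(text, max_evalue=1e-5):
    """Yield (query, subject) pairs from 12-column tab-separated BLAST output."""
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue <= max_evalue:
            yield query, subject

def phylogenetic_profiles(hits, subject_to_organism, organisms):
    """Return {query: tuple of 0/1}, one position per entry in `organisms`."""
    present = defaultdict(set)
    for query, subject in hits:
        organism = subject_to_organism.get(subject)
        if organism is not None:
            present[query].add(organism)
    return {
        query: tuple(int(org in found) for org in organisms)
        for query, found in present.items()
    }

# Invented example data; IDs and organisms are placeholders.
EXAMPLE = "\n".join([
    "enzA\tecoli_1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-50\t200",
    "enzA\tyeast_7\t40.0\t100\t60\t0\t1\t100\t1\t100\t1e-10\t80",
    "enzB\tecoli_2\t85.0\t100\t15\t0\t1\t100\t1\t100\t1e-40\t180",
    "enzB\thuman_3\t30.0\t100\t70\t0\t1\t100\t1\t100\t0.5\t30",
])
ORGANISM_OF = {"ecoli_1": "E. coli", "ecoli_2": "E. coli",
               "yeast_7": "yeast", "human_3": "human"}
ORGANISMS = ["E. coli", "yeast", "human"]

profiles = phylogenetic_profiles(parse_blast_tabular(EXAMPLE), ORGANISM_OF, ORGANISMS)
```

Enzymes with similar profiles across many organisms are candidates for functional linkage, which is why such a profile is useful when predicting interacting pairs.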

Use Case 4

   Analyzing glyco-gene-related diseases

   Task

   Find human diseases which are potentially related to SNPs and glycans.

   Strategy

Retrieve disease genes and search for homologs in other organisms for which glyco-gene interactions are recorded, then search for epitopes to identify glycans and retrieve their structures.

   Problem

No Web service existed to query GlycoEpitopeDB or to convert a glycan structure from IUPAC format into KCF format. The output of the OMIM search was in XML and included entries that did not contain SNPs.

   Solution

   Implemented and registered BioMoby-compliant Web services. Wrote a custom BeanShell script for a Taverna workflow.

   Tools

   Taverna, BioMoby, KEGG API

   Databases

   OMIM, H-InvDB, GlycoEpitopeDB, RINGS, Consortium for Functional Glycomics, GlycomeDB, GlycoGene DataBase, KEGG