Table 1 Summary of technical problems and solutions for each use case

From: The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications

Use Case 1    Annotation of 100,000 invertebrate ESTs
   Task    A researcher needs to annotate 100,000 sequences obtained from an invertebrate species and to provide
the result as a public database.
   Strategy    Annotate sequences by similarity; for sequences showing no similarity, complement the annotations using integrated
analysis tools. Then store the results in BioMart or TogoDB to make the database publicly available.
   Problem    Needed to identify which tool was most suitable for each step. Some tools turned out to require a very long time for
execution. The resulting annotations needed to be archived in a database and made accessible on the Web.
   Solution    First, use relatively fast tools such as Blast2GO and KAAS, then use ANNOTATOR for a limited number of sequences.
BioMart is suitable for integrating remote BioMart resources such as Ensembl,
while TogoDB can be used to host databases without installation.
Both database systems are accessible through a Web service interface from workflow tools such as jORCA and Taverna.
   Tools    Blast2GO, KAAS, ANNOTATOR, BioMart, TogoDB, TogoWS, jORCA, Taverna
   Databases    Ensembl, BioMart, KEGG
Use Case 2    TFBS enrichment within differential microarray gene expression data
   Task    Identify SNPs in transcription factor binding sites and visualize the result in a genome browser.
   Strategy    Retrieve SNP and TSS datasets through the DAS protocol, then compute enrichment and export results for a DAS viewer.
   Problem    Needed to integrate information from multiple databases and needed to customize the visualization.
   Solution    Developed a custom-made prediction system for the data obtained from DAS sources, then customized the Ajax
DAS viewer to show the result in a genomic view.
   Tools    BioDAS, Ajax DAS viewer
   Databases    FESD II, DBTSS
Use Case 3    Protein interactions among enzymes in a KEGG metabolic pathway
   Task    Predict interacting pairs of proteins in a given metabolic pathway.
   Strategy    Retrieve enzymes from a specified pathway and search for pairs of homologous proteins forming complexes in a
structure database.
   Problem    Found a version incompatibility between the server and client implementations of the SOAP protocol. A non-standard BLAST output
format was returned by the PDBj Web service. There were no Web services to calculate a phylogenetic profile.
   Solution    Switched programming languages according to the service in use. Wrote programs to parse BLAST results and to
generate a phylogenetic profile.
   Tools    Java, OCaml, Perl, Ruby, BLAST, DDBJ WABI, PDBj Mine, KEGG API
   Databases    DDBJ, KEGG, PDBj, UniProt
Use Case 4    Analyzing glyco-gene-related diseases
   Task    Find human diseases which are potentially related to SNPs and glycans.
   Strategy    Retrieve disease genes and search for homologs in other organisms for which glyco-gene interactions are recorded,
then search for epitopes to identify glycans and retrieve their structures.
   Problem    No Web service existed to query GlycoEpitopeDB or to convert a glycan structure in IUPAC format into KCF format.
The OMIM search output was in XML and included entries that did not contain SNPs.
   Solution    Implemented and registered BioMoby-compliant Web services. Wrote a custom BeanShell script for a Taverna workflow.
   Tools    Taverna, BioMoby, KEGG API
   Databases    OMIM, H-InvDB, GlycoEpitopeDB, RINGS, Consortium for Functional Glycomics, GlycomeDB, GlycoGene DataBase, KEGG
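
Use Case 3 mentions programs written to parse BLAST results and to generate a phylogenetic profile. The table does not show those programs, so the following is only a minimal sketch of the general idea, not the participants' code. It assumes standard 12-column tab-separated BLAST output (not the non-standard PDBj format noted above), subject identifiers of the form `org:gene` as used for KEGG genes, and an arbitrary E-value cutoff. The function names are hypothetical.

```python
from collections import defaultdict


def parse_blast_tabular(text, max_evalue=1e-5):
    """Yield (query_id, subject_id) for hits at or below max_evalue.

    Assumes 12-column tabular BLAST output; the cutoff is illustrative.
    """
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        query_id, subject_id = fields[0], fields[1]
        evalue = float(fields[10])  # 11th column holds the E-value
        if evalue <= max_evalue:
            yield query_id, subject_id


def phylogenetic_profiles(hits, organisms):
    """Build a presence/absence vector per query over the given organisms.

    Assumes subject IDs look like "org:gene". Queries with no accepted
    hit do not appear in the result.
    """
    present = defaultdict(set)
    for query_id, subject_id in hits:
        organism = subject_id.split(":", 1)[0]
        present[query_id].add(organism)
    return {
        query: [1 if org in found else 0 for org in organisms]
        for query, found in present.items()
    }


def profile_similarity(a, b):
    """Fraction of organisms in which two profiles agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)
```

Proteins whose profiles agree across many organisms are candidates for functional linkage; any threshold on that similarity would need to be chosen for the data at hand.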