Volume 2 Supplement 1
Semantics-based composition of EMBOSS services
© Lamprecht et al; licensee BioMed Central Ltd. 2011
Published: 7 March 2011
More than in other domains, the heterogeneous world of services in bioinformatics demands a methodology to classify and relate resources in a manner that is accessible to both humans and machines. The Semantic Web, which is meant to address exactly this challenge, is currently one of the most ambitious projects in computer science. Collective efforts within the community have already led to a basis of standards for semantic service descriptions and meta-information. In combination with process synthesis and planning methods, such knowledge about types and services can facilitate the automatic composition of workflows for particular research questions.
In this study we apply the synthesis methodology that is available in the Bio-jETI workflow management framework to the semantics-based composition of EMBOSS services. EMBOSS (European Molecular Biology Open Software Suite) is a collection of 350 tools (March 2010) for various sequence analysis tasks, and thus a rich source of services and types that imply comprehensive domain models for planning and synthesis approaches. We use and compare two different setups of our EMBOSS synthesis domain: 1) a manually defined domain setup where an intuitive, high-level, semantically meaningful nomenclature is applied to describe the input/output behavior of the individual EMBOSS tools and their classifications, and 2) a domain setup where this information has been automatically derived from the EMBOSS Ajax Command Definition (ACD) files and the EMBRACE Data and Methods ontology (EDAM). Our experiments demonstrate that these domain models in combination with our synthesis methodology greatly simplify working with the large, heterogeneous, and hence manually intractable EMBOSS collection. However, they also show that with the information that can be derived from the (current) ACD files and EDAM ontology alone, some essential connections between services cannot be recognized.
Our results show that adequate domain modeling requires incorporating as much domain knowledge as possible, far beyond the mere technical aspects of the different types and services. Finding or defining semantically appropriate service and type descriptions is a difficult task, but the bioinformatics community appears to be on the right track towards a Life Science Semantic Web, which will eventually allow automatic service composition methods to unfold their full potential.
Research projects in modern molecular biology rely on increasingly complex combinations of computational methods to handle the data that is produced in the life science laboratories. The plethora and kind of data involved in modern research in the field of biology is only accessible by computational methods. Bioinformatics algorithms, tools, and databases are available in various ways, developed by different groups, in different contexts, using different technologies. The abundance of heterogeneous resources provided by different institutes all over the world leads to the problem of finding the right service for a certain task. The Semantic Web aims at thoroughly equipping individual data and services with machine-processable meta-information in order to simplify the discovery of relevant resources. The importance of properly semantically annotated data and services has been recognized by the life science community earlier than by other application domains, and thus various projects have made significant progress towards a Semantic Web for bioinformatics. Making no claim to be complete, the following list of projects characterizes the current state of the art:
● BioMoby is an open bioinformatics web services registry, which particularly started the modeling of the bioinformatics domain. Making use of service and type meta-data and ontologies for classifying them further, a number of services have been prepared, mainly for supporting semantics-based retrieval. However, the native Moby specifications originate from the early 2000s and thus do not adhere to the Semantic Web standards that have been developed in recent years, but rely on self-made realizations of the same concepts.
● The SADI (Semantic Automated Discovery and Integration) framework provides an open service registry that, in contrast to its predecessor BioMoby, uses standards-compliant Semantic Web Service design patterns to deploy and operate bioinformatics web services. In addition to the collection of services, a simple OWL-based ontology is available that classifies the heterogeneous resources further.
● The BioCatalogue is a recently released, curated registry for life science web services. It provides a comprehensive portal for discovering, registering, annotating and monitoring services that also makes extensive use of different Web 2.0 community features, like collaborative tagging of services and various newsfeeds.
● The myGrid ontology is one of the sources of information that the BioCatalogue uses. It has been developed with the aim of supporting service discovery. It consists of two parts, namely the service ontology and the domain ontology. The former describes the physical and operational features of web services (e.g., inputs and outputs), while the latter captures descriptions of bioinformatics data types and their relationships.
● The EMBRACE Ontology for Data and Methods (EDAM) is an ontology for bioinformatics tools and data, which aims at providing a controlled vocabulary for the diverse services and resources in the Life Science Semantic Web.
The challenge of semantics-based service composition in the bioinformatics application domain has been addressed by a number of projects. For instance, the BioMoby project provides a composition functionality for its services: with the MOBY-S Web Service Browser it is possible to search for an appropriate next service and store the sequence of executed tools as a Taverna workflow. Similarly, the REMORA web server offers functionality for the discovery and step-by-step composition of BioMoby services, and the DDBJ’s Web API for biology suggests the next applicable services according to the outputs of previously executed services. In another scenario, meaningful terms from the gene expression domain are recognized in the text of a web page and used to formulate higher-level goals, which are given, together with web services that are linked to the terms, to an HTN (Hierarchical Task Network) planner in order to create workflows that are suitable within the current context. All of these approaches have clear limitations, as their automatic service composition functionality is:
● restricted to small sub-workflows or even single steps of the workflow, which comes with the risk that users get stuck when trying to construct the globally intended solution step by step,
● limited to semantically annotated services of the particular platform.
Current tools for the graphical development of bioinformatics workflows [9, 13–16], most of them data-flow based, do not include means for semantic modeling or automatic service composition. An exception is Bio-jETI [17, 18], which bridges this gap by supporting the incorporation of semantically modeled domain information for control-flow oriented process construction. Its holistic perspective covers both the process modeling and the integration of individual services and platforms:
● Process development is addressed from a goal-oriented global perspective. A loose programming concept allows the user to specify the actually intended workflow as a whole, and the synthesis finds shortest solutions directly matching the global intent.
● Service descriptions in terms of the domain model are decoupled from the technical service specifications and implementations, so that any kind of heterogeneous resource at any location can be integrated, and there is no restriction to semantically annotated services of a particular platform.
In this paper we extend a previous case study on the semantics-based composition of EMBOSS services with Bio-jETI. We use two different setups, one manually defined and one automatically generated from available meta-information, and compare their characteristics and the respective synthesis results.
Results and discussion
EMBOSS (European Molecular Biology Open Software Suite [20, 21]) is a collection of freely available tools for the molecular biology user community. It contains a number of small and large programs for a wide range of tasks, such as sequence alignment, database searches, protein motif identification, nucleotide sequence pattern analysis, and codon usage analysis as well as the preparation of data for presentation and publication. As of March 2010, EMBOSS (Release 6.2.0) consists of around 350 tools, some derived from originally standalone packages.
EMBOSS provides a common technical interface for the diverse tools that are contained in the suite. They can be run from the command line, or accessed from other programs. Thus, EMBOSS is also suitable for being set up behind GUIs and web interfaces. What is more, it automatically copes with data in a variety of formats, even allowing for transparent retrieval of sequence data from the web. The EMBOSS tools work seamlessly for a number of different formats and types, and therefore free the user from caring about compatibility and type conflicts. This enables us to focus on the actual service semantics rather than on technical details of data compatibility when setting up the domain.
Roughly speaking, the domain modeling involves everything that is required prior to domain-specific workflow development, such as service integration and providing meta-information about the services and types of the application domain. The actual process modeling is then done by the workflow designer, who benefits from the domain model that has been set up according to his needs, referring to services and data types using familiar terminology. The workflow designer does not need to care about technical details like type consistency. He can mark the connection between certain services as loosely specified, thus leaving the problem of proper type conversion to the synthesis algorithm.
Setting up the domain thus comprises three steps:
● Integration of services.
● Description of the input/output behavior of the individual services.
● Structuring of the domain by classification of types and services in taxonomies (i.e. simple ontologies that relate entities in terms of is-a relations).
The integration of the EMBOSS services that we used in this study was done automatically. We let a script process the tool directories of the EMBOSS source code repository and create workflow building blocks for all available tools. In the following, we describe two disparate procedures that we used to set up synthesis domains for the EMBOSS suite, regarding the service descriptions and taxonomic classifications of types and services:
● manually, where intuitive, high-level, semantically meaningful nomenclature for types and services is provided by a domain modeler, and
● automatically, where the information about types and services is derived from the EDAM Ontology and the EMBOSS Ajax Command Definition (ACD) files.
In the remainder of this section, we show by means of some workflow examples what the synthesis methodology can infer from these domains and where the principal differences are.
Services in the HMMER subset of the EMBOSS domain (service descriptions):
● Local multiple alignment of sequences.
● Align sequences to an HMM profile.
● Build a profile HMM from an alignment.
● Calibrate HMM search statistics.
● Convert between profile HMM file formats.
● Generate sequences from a profile HMM.
● Retrieve an HMM from an HMM database.
● Create a binary SSI index for an HMM database.
● Search one or more sequences against an HMM database.
● Search sequence database with a profile HMM.
● Global multiple alignment of sequences.
● Create random nucleotide sequences.
● Create random protein sequences.
● Display a multiple sequence alignment in pretty format.
● Display features of a sequence in pretty format.
● Displays protein sequences with features in pretty format.
● Display sequences with features in pretty format.
Manual domain setup
● Extracting information about input and output types from the natural language documentation of the services.
● Adding classifications of services and types based on further natural language documentation and our own knowledge and experience.
Manually defined domain: services in the HMMER subset.
Automatic domain setup
● Generating a skeletal structure for the taxonomies based on the EDAM ontology.
● Extracting the definition of the input/output behavior from the tools’ ACD files.
● Linking the services and the determined input/output types to the respective EDAM terms in the taxonomies.
The EMBRACE Ontology for Data and Methods (EDAM) is an ontology for bioinformatics tools and data, which aims at providing a controlled vocabulary for the diverse services and resources in the life science Semantic Web. The ontology is provided in OBO (Open Biomedical Ontologies) format. Among others, EDAM contains hierarchical term definitions for tool functions and data types, which we use as the basis for our service and type taxonomies. The results presented in this paper are based on the EDAM version beta03 (March 2010).
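How a taxonomy skeleton can be derived from an OBO file may be illustrated by a minimal sketch. It assumes a tiny, made-up OBO excerpt: the term IDs (`EX:0001` etc.) and the is_a hierarchy are hypothetical and not taken from EDAM; only the stanza layout (`[Term]`, `id:`, `name:`, `is_a:`) follows the OBO flat file format. The function names are likewise illustrative and are not part of Bio-jETI.

```python
from collections import defaultdict

# Hypothetical OBO excerpt: IDs and hierarchy are made up for illustration.
OBO_TEXT = """\
format-version: 1.2

[Term]
id: EX:0001
name: data

[Term]
id: EX:0002
name: sequence_record
is_a: EX:0001 ! data

[Term]
id: EX:0003
name: protein_sequence_record
is_a: EX:0002 ! sequence_record
"""

def parse_obo_terms(text):
    """Return {term_id: {"name": str, "is_a": [parent ids]}} for all [Term] stanzas."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = {"name": None, "is_a": []} if line == "[Term]" else None
            continue
        if current is None or ":" not in line:
            continue  # header lines, blank lines, or non-Term stanzas
        tag, _, value = line.partition(":")
        value = value.split("!")[0].strip()  # drop a trailing "! comment"
        if tag == "id":
            terms[value] = current
        elif tag == "name":
            current["name"] = value
        elif tag == "is_a":
            current["is_a"].append(value)
    return terms

def taxonomy_skeleton(terms):
    """Map each parent term name to the sorted names of its direct children."""
    children = defaultdict(list)
    for term in terms.values():
        for parent_id in term["is_a"]:
            children[terms[parent_id]["name"]].append(term["name"])
    return {parent: sorted(kids) for parent, kids in children.items()}

skeleton = taxonomy_skeleton(parse_obo_terms(OBO_TEXT))
print(skeleton)
```

The resulting parent-to-children map is the kind of structure to which services and their input/output types could subsequently be linked.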
Automatically generated domain: services in the HMMER subset.
protein_sequence_record, hmmer_hidden_markov_model
protein_sequence_record, hmmer_hidden_markov_model
protein_sequence_record, hmmer_hidden_markov_model
Working with the domains
In the previous sections we described the setup of the EMBOSS domain, which is the task of the domain modeler, either by directly defining the domain model (i.e. service descriptions and appropriate type and service taxonomies), or by equipping the services themselves with appropriate meta-information and maintaining ontologies to relate and classify the used terms further, which can be automatically translated into a domain model.
In this section, we illustrate the work of the workflow designer, who develops the actual analysis processes dealing with particular biological questions. Based on three increasingly complex examples that were introduced in our previous case study, we show how synthesis problems are specified and what the synthesis methodology derives from these specifications based on the domains described above.
In the case of the manually defined domain, this means that the synthesis algorithm has to find a way from MultipleNucleotideSequence to Alignment. This request can be met by inserting a single multiple sequence alignment service, for example emma: MultipleNucleotideSequence is defined as an instance of MultipleSequence by the type taxonomy (cf. Figure 2), which is emma’s input type (cf. Table 2), while its output type Alignment is directly suitable as input for showalign (cf. Table 2). Figure 6 (center) shows the resulting process.
In the case of the automatically created domain, which uses the terminology from the EDAM ontology and the ACD files, the synthesis problem is to find a sequence of services beginning with makenucseq_seqoutall_output and ending with sequence_alignment_data. As Figure 6 (right side) shows, the synthesis does not find a solution for this problem. The reason for this disconnect is that no service in the domain, especially no sequence alignment service, is annotated to produce the type sequence_alignment_data, which is required as input for the showalign service (cf. the service characterizations in Table 3). Rather, the alignment services edialign and emma have output types that are classified as sequence_record (cf. the type taxonomy in Figure 5), so that the synthesis algorithm has no chance to find a way to connect them.
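The difference between the two domains in this example can be reproduced with a small type-driven search. The following is a simplified sketch, not the synthesis algorithm of the framework: it tracks a single available type instead of a set of types, and it uses plain breadth-first search. In the automatic domain, the type names `emma_output` and `edialign_output`, the assumed input type `sequence_record`, and the classification of `makenucseq_seqoutall_output` as `sequence_record` are hypothetical stand-ins.

```python
from collections import deque

def is_a(type_name, ancestor, parents):
    """True if type_name equals ancestor or reaches it via parent links."""
    while type_name is not None:
        if type_name == ancestor:
            return True
        type_name = parents.get(type_name)
    return False

def shortest_sequence(services, parents, start_type, goal_type, max_len=4):
    """Breadth-first search for a shortest service chain from start to goal type."""
    queue = deque([(start_type, [])])
    seen = {start_type}
    while queue:
        current, path = queue.popleft()
        if is_a(current, goal_type, parents):
            return path
        if len(path) == max_len:
            continue
        for name, (in_type, out_type) in services.items():
            # A service is applicable if the current type is subsumed by its input type.
            if is_a(current, in_type, parents) and out_type not in seen:
                seen.add(out_type)
                queue.append((out_type, path + [name]))
    return None  # no connecting service sequence exists

# Manual domain: emma maps MultipleSequence to Alignment (as described in the text).
manual_parents = {"MultipleNucleotideSequence": "MultipleSequence"}
manual_services = {"emma": ("MultipleSequence", "Alignment")}

# Automatic domain: alignment outputs are only classified as sequence_record.
auto_parents = {
    "makenucseq_seqoutall_output": "sequence_record",
    "emma_output": "sequence_record",      # hypothetical type name
    "edialign_output": "sequence_record",  # hypothetical type name
}
auto_services = {
    "emma": ("sequence_record", "emma_output"),
    "edialign": ("sequence_record", "edialign_output"),
}

manual_result = shortest_sequence(
    manual_services, manual_parents, "MultipleNucleotideSequence", "Alignment")
auto_result = shortest_sequence(
    auto_services, auto_parents, "makenucseq_seqoutall_output", "sequence_alignment_data")
print(manual_result, auto_result)
```

Under these assumptions the manual domain yields the one-step solution via emma, whereas the search over the automatic domain terminates without a result, because no reachable type is subsumed by sequence_alignment_data.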
We might, however, have a process in mind that does some analysis on the initially generated sequences and produces another set of sequences, for instance via a Profile HMM. As will be detailed in the Methods section, additional constraints can be used in the workflow specification that is given to the synthesis algorithm. To express the sketched case, we can give an additional constraint to the synthesis algorithm that enforces the use of the service ehmmemit.
One of the shortest processes that are thus possible is shown in Figure 7 (center), obtained by providing the synthesis algorithm with the manually defined domain and an additional constraint that enforces the use of ehmmemit: the initial input sequences are converted into an Alignment by emma, which is then used by ehmmbuild to create a Profile HMM. Ehmmemit emits a set of sequences based on this HMM, which are finally displayed by showfeat. The right side of the figure shows the result of a corresponding synthesis run on the automatically created domain, where again no solution can be found. The reason is basically the same as in the previous example: as the alignment services’ outputs are defined as sequence_record rather than as suitable alignment types, the synthesis is not able to recognize them as valid inputs for, e.g., ehmmbuild.
If we start the synthesis with no further constraints, thousands of possible solutions are found, even if the length of the solution is limited. The reason lies in the nature of the EMBOSS domain: many tools work on very similar input types (sequence), some again producing sequences, so that if the synthesis is based only on the type information, an unmanageably large number of solution variants is possible.
Thus we refine our specification and formulate additional constraints for the synthesis in order to get fewer, but more reasonable results. For instance, we might want the inserted service sequence to end with a service that visualizes a result in some fashion, possibly after having applied some analysis to the sequence. The center of Figure 8 shows the workflow with one of the service sequences that were proposed by the synthesis algorithm for the manually created domain and constraints expressing that we want to “Enforce the use of module Protein2dStructure” and “Use Display as last service in solution”, where Protein2dStructure and Display are abstract service groups. This request is met, for instance, by pepwheel, a service that draws a helical wheel diagram for a protein sequence.
For the automatically created domain, where the EDAM terminology is used for the constraint formulation, we use the constraints “Use showreport as last service in solution” (showreport being a concrete service) and “Enforce the use of module protein secondary structure prediction” (abstract service), as there are no EDAM terms that directly correspond to the abstract service groups that we defined in the manual domain setup. The right side of Figure 8 shows one of the possible results of this synthesis run, where the services garnier (a service predicting protein secondary structures) and showreport (simply displaying the textual content of, e.g., the EMBOSS report that is produced by garnier) have been inserted. The different constraints and the different corresponding results that we encounter in this example show that not only the process specification and the resulting service sequences, but also the constraint formulation itself (as part of the specification) depend on the concrete structure of the domain model.
Our experiments demonstrate that comprehensive domain models in combination with adequate synthesis methodology greatly simplify working with the large, heterogeneous, and hence manually intractable EMBOSS collection. However, they also show that with the information that can be derived from the (current) ACD files and EDAM ontology alone, some essential connections between services cannot be recognized. A striking example is the disconnect between the alignment services (e.g., edialign, emma) and alignment visualizers such as showalign. Because the alignment services’ outputs are simply described as sequence records, whereas some kind of sequence alignment data would make a suitable input for showalign, an artificial separation of actually compatible types has been introduced. This reveals that although the descriptions of the individual components are technically sound and several ontological terms are well defined, they are not (yet) sufficiently synchronized with respect to the automatic construction of executable workflows. Thus, automatically created domain models should be manually revised in order to detect and bridge essential gaps.
Clearly, adequate domain modeling requires incorporating as much domain knowledge as possible, far beyond the mere technical aspects of the different types and services. Finding or defining semantically appropriate service and type descriptions is a difficult task, which is common among all approaches to (semi-)automatically dealing with the large number of distributed, heterogeneous services that are available in the bioinformatics application domain. Projects like BioMoby, SADI, BioCatalogue, the (my)Grid Ontology, and the EDAM Ontology address this issue by providing knowledge bases that particularly capture bioinformatics data types and services. We plan to integrate (more of) their services and domain knowledge in the scope of future case studies with Bio-jETI and PROPHETS. The resulting domains will contain far more heterogeneous services than the comparatively ’closed’ EMBOSS domain that we used for the current study, creating new challenges for the client-side software, challenges that our methods are designed for.
Bio-jETI [17, 30] is a framework for model-based, graphical design, execution and management of bioinformatics analysis processes. It has been used in a number of different bioinformatics projects [31–34] and is continuously evolving as new service libraries and service and software technologies become established.
Technically, Bio-jETI is based on the jABC modeling framework as an intuitive, graphical user interface and the jETI electronic tool integration platform for dealing with remote services. Using the jABC technology, process models are constructed graphically by placing services on a canvas and connecting them according to the flow of control. jABC process models are directly executable by an interpreter component, and they can be compiled into a variety of target languages via the Genesys code generation framework.
In previous work, we presented our approach to semantics-based service composition in the Bio-jETI platform. By integrating automatic service composition functionality into an intuitive, graphical process management framework, we maintained the usability of the latter for semantically aware workflow development. Furthermore, we could integrate services and domain knowledge from any kind of heterogeneous resource at any location, and were not restricted to semantically annotated services of a particular platform. For the work presented in this paper, we used the PROPHETS (Process Realization and Optimization Platform using a Human-readable Expression of Temporal-logic Synthesis) extension of the Bio-jETI platform, which simplifies workflow development in order to reach even biologists without a programming background. PROPHETS seamlessly integrates automatic service composition into the jABC. It enhances the previous approaches by including more formal methodology while requiring the user to know less of it, thus enabling the system to be used by a wider range of users. These enhancements are in particular:
● visualized/graphical semantic domain modeling,
● loose specification within the process model,
● non-formal specification of constraints using natural language templates, and
● automatic generation of model checking formulas (to check global properties of processes).
Two roles are foreseen for using this extension. The domain modeler provides information on available services and a semantic classification of these services and their input and output types. The workflow designer is the one who uses the available services to model the processes. The following two subsections each deal with one of these roles, while the subsequent ones give more detail on the synthesis method and verification concerns.
The basis of the domain model is built by meta-information on services, which enhances the definition of jABC services regarding their abstract input/output behavior. Throughout our framework types are represented by symbolic names, thus abstracting from concrete implementations. Each service is characterized by two subsets of the set of all symbolic type names, namely input types and output types. The meta-information is stored as a separate file within the current project’s directory, which allows for the usage of a specialized nomenclature for different jABC projects, even though the included services might be the same.
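The characterization of a service by two sets of symbolic type names can be sketched as a small data structure. This is an illustrative model only: the class name `ServiceInfo`, the rule that a service is applicable once all of its input types are available, and the empty output set of showalign are assumptions; the emma types follow the manual domain described in the Results section.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ServiceInfo:
    """Abstract input/output behavior of a service in terms of symbolic type names."""
    name: str
    input_types: frozenset
    output_types: frozenset

    def applicable(self, available):
        """Assumption: a service can run once all of its input types are available."""
        return self.input_types <= available

    def apply(self, available):
        """Executing a service adds its output types to the available ones."""
        return available | self.output_types

emma = ServiceInfo("emma", frozenset({"MultipleSequence"}), frozenset({"Alignment"}))
showalign = ServiceInfo("showalign", frozenset({"Alignment"}), frozenset())

available = frozenset({"MultipleSequence"})
before = showalign.applicable(available)             # no Alignment available yet
after = showalign.applicable(emma.apply(available))  # emma has produced an Alignment
print(before, after)
```

Because the types are plain symbolic names, the same services could be described with a different nomenclature in another project simply by exchanging these sets.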
Furthermore, the services and types can be classified using taxonomies. These taxonomies are expressed as ontologies in OWL format, where the concept Thing denotes the most general type or service, respectively. Using the is-a relation, additional semantic classifications can be added to the domain. The actual types and services are then represented as individuals that are related to one or more of those classifications by the instance-of relation. Although we also provide a seamlessly integrated graphical editor for these OWL files (Figures 1, 2, 4 and 5), the domain modeler may use any OWL tool of their choice.
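The interplay of is-a and instance-of relations can be sketched without any OWL tooling. The following example uses plain dictionaries instead of an OWL file; the classes `Sequence` and `AlignmentData` and the individual `Alignment` placed under it are hypothetical, while MultipleNucleotideSequence as an instance of MultipleSequence follows the manual type taxonomy described above.

```python
# is-a edges between classes (child -> parent); "Thing" is the most general concept.
IS_A = {
    "Sequence": "Thing",           # hypothetical class
    "MultipleSequence": "Sequence",
    "AlignmentData": "Thing",      # hypothetical class
}

# instance-of edges: an individual may be related to one or more classes.
INSTANCE_OF = {
    "MultipleNucleotideSequence": {"MultipleSequence"},
    "Alignment": {"AlignmentData"},
}

def ancestors(cls):
    """Yield cls and every class above it, up to the root."""
    while cls is not None:
        yield cls
        cls = IS_A.get(cls)

def is_instance(individual, cls):
    """True if the individual belongs to cls directly or via is-a relations."""
    return any(cls in ancestors(direct) for direct in INSTANCE_OF.get(individual, ()))

def individuals_of(cls):
    """All individuals classified under cls, in sorted order."""
    return sorted(i for i in INSTANCE_OF if is_instance(i, cls))

print(is_instance("MultipleNucleotideSequence", "Sequence"))
print(individuals_of("Thing"))
```

Resolving an abstract classification to its individuals in this way is what allows a specification to refer to a group of types or services instead of a concrete one.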
Finally, there might be domain-specific knowledge like ordering constraints on services or general compatibility information. This knowledge must be formalized appropriately. Basically, there are two options for doing so: either the domain expert formulates model checking formulas that must hold for every process within the project, or defines global constraints that are used for every synthesis. Furthermore, the system that allows formulas to be expressed with natural language templates can be extended to the needs of the specific domain.
The synthesis of each loosely specified branch can be refined by additional constraints. As we do not expect common process designers to deal with this formal specification, we provide means to express constraints using a system that is based on templates in natural language. The user chooses a restricting concept and then simply has to fill in a cloze text with prepared values (“Wizard Step 1” in Figure 9). The possible values for the cloze text fields are automatically extracted from the domain (i.e. service definitions and taxonomies).
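The template mechanism can be sketched as follows. The two template texts are the ones quoted in the constraint examples of this paper; everything else is an assumption for illustration: the dictionary keys, the function names, and the small domain excerpt from which the cloze values are drawn.

```python
# Template texts as quoted in the constraint examples of this paper.
TEMPLATES = {
    "enforce": "Enforce the use of module {module}",
    "last": "Use {module} as last service in solution",
}

# Hypothetical excerpt of a domain: concrete services and abstract service groups.
SERVICES = {"emma", "pepwheel", "showalign"}
SERVICE_GROUPS = {"Display", "Protein2dStructure"}

def cloze_values():
    """Values offered for the blank: extracted from the domain, not typed freely."""
    return sorted(SERVICES | SERVICE_GROUPS)

def fill_template(kind, module):
    """Instantiate a template, rejecting values that are not part of the domain."""
    if module not in cloze_values():
        raise ValueError(f"{module!r} is not part of the domain")
    return TEMPLATES[kind].format(module=module)

print(cloze_values())
print(fill_template("last", "Display"))
```

Restricting the blank to values from the domain means that a constraint can only mention services or groups that the synthesis actually knows about.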
The algorithm that we use to complete a loosely specified process to be fully executable takes two aspects into account: on the one hand, the process must be a valid execution regarding type consistency; on the other hand, the constraints specified by the process designer must be met.
The specification formula is the second aspect. It describes all sequences of services that meet the individual workflow specification, but without taking care of actual executability concerns. As the explicit representation of all those possible sequences might be extremely large, it is likewise not built explicitly, but given declaratively as a formula in SLTL (Semantic Linear Time Logic), an extension of the commonly known propositional linear-time logic (PLTL). This formula is created by conjunction of all constraints, i.e. the constraints that are specified by the process designer for the current loosely specified branch and the ones that were globally specified by the domain modeler.
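The conjunction of constraints can be illustrated by treating each constraint as a predicate over finite service sequences. This is a deliberate simplification and not SLTL: the sketch evaluates explicit candidate sequences, whereas the formula in the framework is given declaratively. The group memberships are assumed for illustration (pepwheel is placed in both groups because the text names it as a solution for this pair of constraints).

```python
# Assumed memberships of abstract service groups (illustrative only).
GROUPS = {
    "Protein2dStructure": {"pepwheel"},
    "Display": {"pepwheel", "showalign", "showfeat"},
}

def members(module):
    """An abstract group stands for its members; a concrete service for itself."""
    return GROUPS.get(module, {module})

def enforce(module):
    """'Enforce the use of module X': some service of X occurs in the sequence."""
    return lambda seq: any(service in members(module) for service in seq)

def last(module):
    """'Use X as last service in solution': the final service belongs to X."""
    return lambda seq: bool(seq) and seq[-1] in members(module)

def conjunction(constraints):
    """A sequence meets the specification only if it meets every constraint."""
    return lambda seq: all(constraint(seq) for constraint in constraints)

global_constraints = []                                            # from the domain modeler
branch_constraints = [enforce("Protein2dStructure"), last("Display")]  # from the designer
specification = conjunction(global_constraints + branch_constraints)

candidates = [["pepwheel"], ["emma", "showalign"], ["pepwheel", "emma"]]
accepted = [seq for seq in candidates if specification(seq)]
print(accepted)
```

Of the three candidates only the single-step sequence with pepwheel satisfies both constraints under the assumed memberships.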
To start the search for solutions, the synthesis algorithm requires an initial state (i.e. a set of start types). In contrast to our previous approach, where these start types had to be specified manually, they are now determined automatically according to preceding services using data-flow analysis methods. The types that are created on the execution path from the workflow’s initial node to the currently synthesized loosely specified branch are taken as start types. If multiple paths are possible due to branching in the model, the largest set of types that is consistent with each of those paths is taken.
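The determination of start types can be sketched as a simple forward accumulation of output types. The sketch reads "the largest set of types that is consistent with each of those paths" as the intersection of the per-path type sets, which is an interpretation rather than a documented detail; the output type assigned to makenucseq is likewise an assumption.

```python
def types_on_path(path, outputs):
    """Collect all types produced by the services along one execution path."""
    available = set()
    for service in path:
        available |= outputs.get(service, set())
    return available

def start_types(paths, outputs):
    """Types that are available regardless of which path was taken."""
    per_path = [types_on_path(path, outputs) for path in paths]
    return set.intersection(*per_path) if per_path else set()

# Assumed output types of the preceding services.
outputs = {
    "makenucseq": {"MultipleNucleotideSequence"},
    "emma": {"Alignment"},
}

# Two alternative paths lead to the loosely specified branch.
paths = [["makenucseq", "emma"], ["makenucseq"]]
result = start_types(paths, outputs)
print(result)
```

Here Alignment is only produced on one of the two paths, so it is not offered as a start type, while MultipleNucleotideSequence is.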
Given these specifications, the synthesis algorithm performs a parallel evaluation of the configuration universe and the specification formula to search for paths that are consistent with the configuration universe and fulfill the SLTL formula. Each of those paths is a valid solution that may replace the loosely specified branch. The framework currently supports two possibilities to choose one of the solutions: Either the shortest solution is chosen automatically or the user is queried to select one. However, the general architecture of the framework allows for the easy integration of other solution choosing mechanisms, for instance based on some cost function in order to obtain the cheapest solution.
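The two selection strategies, and a possible cost-based extension, can be sketched in a few lines. The first solution below is the service sequence discussed in the Results section; the other two sequences, the cost values, and the default cost of 1.0 are hypothetical, and the function names are not part of the framework.

```python
def choose_shortest(solutions):
    """Pick a solution with the fewest services (the first one in case of ties)."""
    return min(solutions, key=len)

def choose_cheapest(solutions, cost, default=1.0):
    """Pick a solution with minimal summed service cost."""
    return min(solutions, key=lambda seq: sum(cost.get(s, default) for s in seq))

solutions = [
    ["emma", "ehmmbuild", "ehmmemit", "showfeat"],
    ["emma", "ehmmbuild", "ehmmcalibrate", "ehmmemit", "showfeat"],  # hypothetical
    ["edialign", "ehmmbuild", "ehmmemit", "showfeat"],               # hypothetical
]

# Hypothetical costs, e.g. expected runtime; unlisted services cost 1.0.
cost = {"emma": 5.0, "edialign": 1.0}

shortest = choose_shortest(solutions)
cheapest = choose_cheapest(solutions, cost)
print(shortest)
print(cheapest)
```

With these numbers the shortest-solution strategy returns the emma-based sequence, while the cost function prefers the equally long sequence that starts with edialign.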
List of abbreviations used
- ACD: Ajax Command Definition
- API: Application Programming Interface
- BLAST: Basic Local Alignment Search Tool
- DDBJ: DNA Data Bank of Japan
- EDAM: EMBRACE Data And Methods ontology
- EMBOSS: European Molecular Biology Open Software Suite
- GUI: Graphical User Interface
- HMM: Hidden Markov Model
- HTN: Hierarchical Task Network
- ID:
- jABC: Java Application Building Center
- jETI: Java Electronic Tool Integration
- OBO: Open Biomedical Ontologies
- OWL: Web Ontology Language
- PLTL: Propositional Linear Time Logic
- PROPHETS: Process Realization and Optimization Platform using a Human-readable Expression of Temporal-logic Synthesis
- SADI: Semantic Automated Discovery and Integration
- SLTL: Semantic Linear Time Logic
This article has been published as part of Journal of Biomedical Semantics Volume 2 Supplement 1, 2011: Semantic Web Applications and Tools for Life Sciences (SWAT4LS), 2009. The full contents of the supplement are available online at http://www.jbiomedsem.com/supplements/2/S1
- Berners-Lee T, Hendler J, Lassila O: The Semantic Web - A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities. Scientific American. 2001, 284 (5): 34-43. 10.1038/scientificamerican0501-34.View ArticleGoogle Scholar
- Cannata N, Schroder M, Marangoni R, Romano P: A Semantic Web for bioinformatics: goals, tools, systems, applications. BMC Bioinformatics . 2008, 9 (Suppl 4): S1-10.1186/1471-2105-9-S4-S1.View ArticleGoogle Scholar
- Wilkinson MD, Links M: BioMOBY: an open source biological web services proposal. Briefings in Bioinformatics. 2002, 3 (4): 331-41. 10.1093/bib/3.4.331. [PMID: 12511062]View ArticleGoogle Scholar
- Wilkinson MD, Vandervalk B, McCarthy L: SADI Semantic Web Services - ’cause you can’t always GET what you want!. Proceedings of the IEEE Services Computing Conference: 7-11 December 2009, Singapore. APSCC 2009. 2009, IEEE Asia-Pacific, 13-18.Google Scholar
- Goble CA, Belhajjame K, Tanoh F, Bhagat J, Wolstencroft K, Stevens R, Nzuobontane E, McWilliam H, Laurent T, Lopez R: BioCatalogue: A Curated Web Service Registry For The Life Science Community. In 3rd International Biocuration Conference: 16-18 April 2009, Berlin. 2009, Nature Precedings, Nature Publishing Group, [http://precedings.nature.com/documents/3132/version/1]Google Scholar
- Wolstencroft K, Alper P, Hull D, Wroe C, Lord PW, Stevens RD, Goble CA: The (my)Grid ontology: bioinformatics service discovery. International Journal of Bioinformatics Research and Applications. 2007, 3 (3): 303-325. 10.1504/IJBRA.2007.015005. [PMID: 18048194]View ArticleGoogle Scholar
- EMBRACE Ontology for Data and Methods (EDAM). [http://edamontology.sourceforge.net/]
- DiBernardo M, Pottinger R, Wilkinson M: Semi-automatic web service composition for the life sciences using the BioMoby semantic web framework. Journal of Biomedical Informatics. 2008, 41 (5): 837-847. 10.1016/j.jbi.2008.02.005. [PMID: 18373957]
- Oinn T, Addis M, Ferris J, Marvin D, Senger M, Greenwood M, Carver T, Glover K, Pocock MR, Wipat A, Li P: Taverna: a tool for the composition and enactment of bioinformatics workflows. Bioinformatics. 2004, 20 (17): 3045-3054. 10.1093/bioinformatics/bth361.
- Carrere S, Gouzy J: REMORA: a pilot in the ocean of BioMoby web-services. Bioinformatics (Oxford, England). 2006, 22 (7): 900-901. 10.1093/bioinformatics/btl001. [PMID: 16423924]
- Kwon Y, Shigemoto Y, Kuwana Y, Sugawara H: Web API for biology with a workflow navigation system. Nucleic Acids Res. 2009, 37 (Web Server issue): W11-W16. 10.1093/nar/gkp300.
- Sutherland K, McLeod K, Ferguson G, Burger A: Knowledge-driven enhancements for task composition in bioinformatics. BMC Bioinformatics. 2009, 10 (Suppl 10): S12. 10.1186/1471-2105-10-S10-S12.
- Bausch W, Pautasso C, Schaeppi R, Alonso G: BioOpera: Cluster-aware Computing. In Proceedings of the 4th IEEE International Conference on Cluster Computing. 2002, 99-106. [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.118.5158]
- Eker J, Janneck J, Lee E, Liu J, Liu X, Ludvig J, Neuendorffer S, Sachs S, Xiong Y: Taming heterogeneity - the Ptolemy approach. Proceedings of the IEEE. 2003, 91: 127-144. 10.1109/JPROC.2002.805829.
- Altintas I, Berkley C, Jaeger E, Jones M, Ludäscher B, Mock S: Kepler: An Extensible System for Design and Execution of Scientific Workflows. In SSDBM. 2004, 21-23. [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.5.9905]
- Taylor I, Shields M, Wang I, Harrison A: The Triana Workflow Environment: Architecture and Applications. Workflows for e-Science. 2007, Secaucus, NJ, USA: Springer, New York, 320-339.
- Margaria T, Kubczak C, Steffen B: Bio-jETI: a service integration, design, and provisioning platform for orchestrated bioinformatics processes. BMC Bioinformatics. 2008, 9 (Suppl 4): S12. 10.1186/1471-2105-9-S4-S12. [PMID: 18460173 PMCID: 2367639]
- Lamprecht AL, Margaria T, Steffen B: Bio-jETI: a framework for semantics-based service composition. BMC Bioinformatics. 2009, 10 (Suppl 10): S8. 10.1186/1471-2105-10-S10-S8.
- Lamprecht AL, Naujokat S, Steffen B, Margaria T: Semantics-Based Composition of EMBOSS Services with Bio-jETI. Proceedings of the 2nd Workshop on Semantic Web Applications and Tools for Life Sciences (SWAT4LS 2009): 20 November 2009, Amsterdam. Edited by: Marshall MS, Burger A, Romano P, Paschke A, Splendiani A. 2009, CEUR Workshop Proceedings, 559.
- Rice P, Longden I, Bleasby A: EMBOSS: the European Molecular Biology Open Software Suite. Trends in Genetics: TIG. 2000, 16 (6): 276-7. 10.1016/S0168-9525(00)02024-2. [PMID: 10827456]
- EMBOSS Homepage. [http://emboss.sourceforge.net/index.html]
- Eddy SR: Profile hidden Markov models. Bioinformatics (Oxford, England). 1998, 14 (9): 755-763. 10.1093/bioinformatics/14.9.755. [PMID: 9918945]
- HMMER: biosequence analysis using hidden Markov models. [http://hmmer.janelia.org/]
- EMBOSS: Applications. [http://emboss.sourceforge.net/apps/release/6.2/emboss/apps/]
- EMBASSY Applications. [http://emboss.sourceforge.net/apps/release/6.2/embassy/index.html]
- Web Services for EMBOSS-6.1.0 applications. [http://www.ebi.ac.uk/soaplab/]
- Smith B, Ashburner M, Rosse C, Bard J, Bug W, Ceusters W, Goldberg LJ, Eilbeck K, Ireland A, Mungall CJ, Leontis N, Rocca-Serra P, Ruttenberg A, Sansone S, Scheuermann RH, Shah N, Whetzel PL, Lewis S: The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol. 2007, 25 (11): 1251-1255. 10.1038/nbt1346.
- EMBOSS: AJAX Command Definition (ACD files). [http://emboss.sourceforge.net/developers/acd/]
- Lord P, Bechhofer S, Wilkinson MD, Schiltz G, Gessler D, Hull D, Goble C, Stein L: Applying Semantic Web Services to Bioinformatics: Experiences Gained, Lessons Learnt. In The Semantic Web - ISWC 2004. 2004, 350-364. [http://www.springerlink.com/content/1b7b409w0lw92326]
- Bio-jETI homepage. [http://biojeti.cs.tu-dortmund.de/]
- Margaria T, Kubczak C, Njoku M, Steffen B: Model-based Design of Distributed Collaborative Bioinformatics Processes in the jABC. Proceedings of 11th IEEE International Conference on Engineering of Complex Computer Systems (ICECCS’06): 15-17 August 2006, Stanford, California. Los Alamitos, CA, USA. 2006, IEEE Computer Society, 169-176.
- Kubczak C, Margaria T, Fritsch A, Steffen B: Biological LC/MS Preprocessing and Analysis with jABC, jETI and xcms. Proceedings of the 2nd International Symposium on Leveraging Applications of Formal Methods, Verification and Validation (ISoLA 2006): 15-19 November 2006, Paphos, Cyprus. 2006, IEEE Computer Society, 308-313.
- Lamprecht A, Margaria T, Steffen B, Sczyrba A, Hartmeier S, Giegerich R: GeneFisher-P: variations of GeneFisher as processes in Bio-jETI. BMC Bioinformatics. 2008, 9 (Suppl 4): S13. 10.1186/1471-2105-9-S4-S13. [PMID: 18460174], [http://www.ncbi.nlm.nih.gov/pubmed/18460174]
- Lamprecht A, Margaria T, Steffen B: Seven Variations of an Alignment Workflow - An Illustration of Agile Process Design and Management in Bio-jETI. Bioinformatics Research and Applications, Volume 4983 of LNBI, Atlanta, Georgia. 2008, Springer, 445-456. [http://dx.doi.org/10.1007/978-3-540-79450-9_42]
- Steffen B, Margaria T, Nagel R, Jörges S, Kubczak C: Model-Driven Development with the jABC. Volume 4383 of LNCS. 2006, Springer Berlin / Heidelberg, 92-108.
- Margaria T, Nagel R, Steffen B: jETI: A Tool for Remote Tool Integration. Tools and Algorithms for the Construction and Analysis of Systems, Volume 3440/2005 of LNCS. 2005, Springer Berlin/Heidelberg, 557-562. [http://www.springerlink.com/content/h9x6m1x21g5lknkx]
- Jörges S, Margaria T, Steffen B: Genesys: service-oriented construction of property conform code generators. ISSE. 2008, 4 (4): 361-384.
- Steffen B, Margaria T, Freitag B: Module Configuration by Minimal Model Construction. Tech. rep., Fakultät für Mathematik und Informatik, Universität Passau. 1993.
- Clarke EM, Grumberg O, Peled DA: Model Checking. 1999, The MIT Press.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.