Table 3 Input and output data types relevant for phyloinformatic web services.

From: The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows

Inputs - The input data types defined here do not imply pass-by-value; each could instead be passed as an identifier:
One Tree: exactly one tree, which might serve as a query topology, as input for topology metric calculations, or as the object for which associated data (matrices) and metadata are retrieved
Pair of Trees: exactly two trees, for tree reconciliation (e.g. duplication inference) or for tree-to-tree distance calculations
Set of Trees: a set of trees, as input for consensus calculations or as query topologies
One OTU: exactly one OTU (operational taxonomic unit), for which associated data (trees or matrices that contain it) and metadata might be retrieved
Pair of OTUs: exactly two OTUs, as input for topological queries (MRCA, the most recent common ancestor) and calculations (patristic distance)
Set of OTUs: input for topological queries (MRCA), and for retrieving the trees or matrices that contain them, along with metadata
One Node: input for tree traversal operations (parent, children), and for which metadata might be retrieved
Pair of Nodes: input for topological queries (MRCA) and calculations (patristic distance)
Set of Nodes: input for topological queries (MRCA)
One Character: exactly one character (matrix column), for which calculations are performed (variability) and metadata is retrieved
Set of Characters: input as a filter predicate, to retrieve OTUs that have recorded states for the characters
One Character State Sequence: input for which metadata is retrieved
Pair of Character State Sequences: input for pairwise alignment, or for calculating pairwise divergence
Set of Character State Sequences: input for multiple sequence alignment
Character State Matrix: input for inference (of one tree or a set of trees), for calculations (average sequence divergence), and for metadata retrieval
Outputs - In addition to mirroring the inputs described above, some 'primitives' may be required:
Int: an integer, for things such as topology metrics (node counts), tree-to-tree distances (in branch moves), node distances (in number of nodes in between), character state counts, and sequence divergence (substitution counts, site counts)
Float: a floating point value, for topology metrics (balance, stemminess, resolution), tree-to-tree distances (symmetric difference), patristic distance, and sequence divergence
String: for metadata, e.g. descriptions
Stringvector: for metadata, e.g. a set of tags
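
As a rough illustration of how a few of these types might fit together, the sketch below models a Node as a plain Python class and implements three of the operations named in the table: an MRCA query over a Set of Nodes, a patristic distance for a Pair of Nodes (a Float output), and a node distance counted in intervening nodes (an Int output). All class, function and field names are hypothetical, and the example tree is made up; the table itself does not prescribe any implementation or interface.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    """Hypothetical stand-in for the 'One Node' input type."""
    label: str
    branch_length: float = 0.0  # length of the branch leading to the parent
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add_child(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def path_to_root(node: Node) -> List[Node]:
    """Return the node and all of its ancestors, ordered from node to root."""
    path = []
    current: Optional[Node] = node
    while current is not None:
        path.append(current)
        current = current.parent
    return path


def mrca(nodes: Iterable[Node]) -> Node:
    """Most recent common ancestor of a Set of Nodes."""
    nodes = list(nodes)
    if not nodes:
        raise ValueError("need at least one node")
    # Candidates stay ordered from deepest to shallowest ancestor.
    candidates = path_to_root(nodes[0])
    for other in nodes[1:]:
        ancestors = {id(n) for n in path_to_root(other)}
        candidates = [n for n in candidates if id(n) in ancestors]
    if not candidates:
        raise ValueError("nodes do not belong to the same tree")
    return candidates[0]


def _edges_to(node: Node, ancestor: Node) -> List[Node]:
    """Nodes whose parent branches lie on the path from node up to ancestor."""
    edges = []
    current = node
    while current is not ancestor:
        edges.append(current)
        current = current.parent
    return edges


def patristic_distance(a: Node, b: Node) -> float:
    """Sum of branch lengths on the path between a Pair of Nodes (Float)."""
    ancestor = mrca([a, b])
    return sum(n.branch_length for n in _edges_to(a, ancestor) + _edges_to(b, ancestor))


def node_distance(a: Node, b: Node) -> int:
    """Number of nodes strictly between a Pair of Nodes on their path (Int)."""
    ancestor = mrca([a, b])
    edge_count = len(_edges_to(a, ancestor)) + len(_edges_to(b, ancestor))
    return max(edge_count - 1, 0)


# Made-up example tree, in Newick notation: ((A:1.0,B:2.0)X:0.5,C:3.0)root;
root = Node("root")
x = root.add_child(Node("X", 0.5))
a = x.add_child(Node("A", 1.0))
b = x.add_child(Node("B", 2.0))
c = root.add_child(Node("C", 3.0))

print(mrca([a, b]).label)
print(patristic_distance(a, c))
print(node_distance(a, c))
```

In a web-service setting the nodes would more likely arrive as identifiers than as in-memory objects, in line with the note that these input types do not imply pass-by-value; resolving an identifier to an object like `Node` is left out of this sketch.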