Table 3 Input and output data types relevant for phyloinformatic web services.

From: The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows

Inputs - The input data types defined here do not imply pass-by-value; each could instead be passed as an identifier:

One Tree

exactly one tree, which might function as a query topology, as an input for topology metric calculations, or as something for which associated data (matrices) and metadata might be retrieved

Pair of Trees

exactly two trees, for tree reconciliation (e.g. duplication inference) or for tree-to-tree distance calculations

Set of Trees

input for consensus calculations, or as query topologies

One OTU

exactly one OTU for which associated data (trees or matrices that contain it) and metadata might be retrieved

Pair of OTUs

exactly two OTUs, as input for topological queries (MRCA) and calculations (patristic distance)

Set of OTUs

input for topological queries (MRCA), and for which associated data (trees or matrices that contain them) and metadata might be retrieved

One Node

input for tree traversal operations (parent, children) and for which metadata might be retrieved

Pair of Nodes

input for topological queries (MRCA) and calculations (patristic distance)

Set of Nodes

input for topological queries (MRCA)

One Character

exactly one character (matrix column) for which calculations are performed (variability) and metadata is retrieved

Set of Characters

input as filter predicate, to retrieve OTUs that contain recorded states for the characters

One Character State Sequence

input for which metadata is retrieved

Pair of Character State Sequences

input for pairwise alignment, or for calculating pairwise divergence

Set of Character State Sequences

input for multiple sequence alignment

Character State Matrix

input for inference (of one tree or set of trees), for calculations (average sequence divergence) and metadata retrieval
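
The input rows above combine two ideas: a cardinality (One, Pair, Set) and the option of passing data by identifier rather than by value. The following is a minimal sketch of how a client might model that. The names `ById`, `ByValue`, `CARDINALITY` and `make_input` are hypothetical and do not come from the source, and the rule that a Set accepts any number of items is an assumption, since the table does not state a minimum.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union


@dataclass(frozen=True)
class ById:
    """A reference to data held elsewhere (identifier format is an assumption)."""
    identifier: str


@dataclass(frozen=True)
class ByValue:
    """The data itself, serialized inline (e.g. a Newick string for a tree)."""
    payload: str


Item = Union[ById, ByValue]

# Required item count per cardinality; None means "no fixed count" (assumption).
CARDINALITY = {"One": 1, "Pair": 2, "Set": None}


def make_input(cardinality: str, items: Sequence[Item]) -> Tuple[Item, ...]:
    """Check that the number of items matches the declared cardinality."""
    expected = CARDINALITY[cardinality]
    if expected is not None and len(items) != expected:
        raise ValueError(
            f"{cardinality} expects exactly {expected} item(s), got {len(items)}"
        )
    return tuple(items)


# "One Tree" passed by value, "Pair of OTUs" passed by identifier.
one_tree = make_input("One", [ByValue("((A,B),C);")])
pair_of_otus = make_input("Pair", [ById("otu:1"), ById("otu:2")])
```

Keeping the by-identifier and by-value forms as separate types lets a service decide whether it needs to resolve a reference before operating on the data.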

Outputs - In addition to mirroring the inputs described above, some 'primitives' may be required:

Int

an integer, for things such as topology metrics (node counts), tree-to-tree distances (in branch moves), node distances (in number of nodes in between), character state counts, and sequence divergence (substitution counts, site counts)

Float

a floating point value, for topology metrics (balance, stemminess, resolution), tree-to-tree distances (symmetric difference), patristic distance, and sequence divergence

String

for metadata, e.g. descriptions

Stringvector

for metadata, e.g. a set of tags
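
To make the mapping from inputs to outputs concrete, here is a small illustrative sketch of three operations named in the table: an MRCA query on a Pair of Nodes (returning a node), a patristic distance (returning a Float), and a node count (returning an Int). The toy tree, its parent-map representation, and all function names are assumptions made for illustration; the source does not prescribe any data structure or API.

```python
from typing import Dict, List, Optional, Tuple

# Toy rooted tree: node -> (parent, branch length to parent). The root has no parent.
Tree = Dict[str, Tuple[Optional[str], float]]

TREE: Tree = {
    "root": (None, 0.0),
    "n1": ("root", 1.0),
    "A": ("n1", 0.5),
    "B": ("n1", 0.7),
    "C": ("root", 2.0),
}


def path_to_root(tree: Tree, node: str) -> List[str]:
    """Return the node followed by all of its ancestors, ending at the root."""
    path = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = tree[current][0]
    return path


def mrca(tree: Tree, a: str, b: str) -> str:
    """Most recent common ancestor of a pair of nodes."""
    ancestors_of_a = set(path_to_root(tree, a))
    for node in path_to_root(tree, b):
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes do not share a root")


def patristic_distance(tree: Tree, a: str, b: str) -> float:
    """Sum of branch lengths on the path between two nodes (a Float output)."""
    ancestor = mrca(tree, a, b)
    total = 0.0
    for start in (a, b):
        node = start
        while node != ancestor:
            parent, length = tree[node]
            total += length
            node = parent
    return total


def node_count(tree: Tree) -> int:
    """A simple topology metric (an Int output)."""
    return len(tree)


print(mrca(TREE, "A", "B"))                # n1
print(patristic_distance(TREE, "A", "C"))  # 3.5
print(node_count(TREE))                    # 5
```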