Phylotastic/Architecture

From Evolutionary Interoperability and Outreach
Revision as of 18:50, 5 June 2012 by Hilmar (talk | contribs)

Our architecture is based on the Model-View-Controller design pattern and takes the following modules into consideration:

  • TNRS
  • Topology prune/graft
  • Tree Store
  • Syntax Format Converter
  • Branch length annotator
  • Logger

The architecture diagram below shows how the modules interact. Each module is treated as a "black box" in this architecture; the only concern here is how the modules interoperate.

These interactions should drive the input/output specification of each module. The architecture also distinguishes what is passed by reference (blue arrows) from what is passed by value (black arrows). For example, megatrees should not be passed to the controller directly; only references to them should be sent. The controller then passes these references to the topology module, which uses the trees.
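The reference-versus-value distinction can be illustrated with a minimal sketch. Everything named here (`TreeRef`, `TreeStore`, `controller_forward`, `topology_module`, the `urn:example:` URIs) is hypothetical and chosen for illustration only; the page does not define a concrete API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreeRef:
    """Lightweight reference to a megatree held in a tree store (hypothetical)."""
    uri: str


class TreeStore:
    """In-memory stand-in for a tree store; maps URIs to Newick strings."""

    def __init__(self):
        self._trees = {}

    def add(self, uri, newick):
        # Store the (potentially huge) tree and hand back only a reference.
        self._trees[uri] = newick
        return TreeRef(uri)

    def resolve(self, ref):
        # Dereference: look up the actual tree behind a reference.
        return self._trees[ref.uri]


def controller_forward(name_list, tree_refs):
    """The controller carries names by value and trees by reference only."""
    return {"names": list(name_list), "trees": list(tree_refs)}


def topology_module(request, store):
    """Only the topology module dereferences the megatrees."""
    return [store.resolve(ref) for ref in request["trees"]]
```

In this sketch the controller never touches tree content, so the size of a megatree does not affect the messages that pass through it.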

Controller decision elements

All steps may optionally receive user feedback

  • Start
    • list of name strings [mandatory user input]
    • TNRS sources
    • TNRS “knobs” (“fuzziness” etc.)
  • Post TNRS
    • Choices about unresolved names
  • Megatree retrieval
    • (mega-)tree store source
    • automated tree query for applicable trees (per treestore) with user-supplied parameters
    • (mega-)tree filter/selection criteria (e.g. degree of overlap)
  • Pre-Topology
    • application of branch lengths?
  • Topology
    • (see Topology Module, below)
  • Post-topology
    • application of branch lengths?

Topology Module

_per megatree in:_

Input

  • List of names [mandatory; post-cleaning]
    • with Taxonomic cues/guides [optional; can also be auto-discovered]
    • Megatree choice
  • Configuration:
    • Grafting policy
      • choices of insertion of non-matching terminals
        • sister to random terminal
        • random node in matching clade
        • basal
        • conservatively collapse clade
    • Pruning policy
      • restrict tips to names given, or return all tips in minimum spanning clade
      • retain or delete out-degree one nodes
      • how to handle metadata associated with nodes/edges that have been deleted

Output

  • One or more tree(s)
    • topology
    • node-by-node metadata [as per megatree, optional]
  • list of non-matching taxa
    • logging record to logger
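Two of the pruning choices above (restricting tips to the names given, and retaining or deleting out-degree-one nodes) can be sketched as follows. The dict-based tree representation and the helpers `leaf`, `clade`, `prune`, and `tips` are assumptions made for this example, not part of any Phylotastic specification; grafting and metadata handling are not covered.

```python
def leaf(name):
    """Build a tip node."""
    return {"name": name, "children": []}


def clade(*children, name=None):
    """Build an internal node from child nodes."""
    return {"name": name, "children": list(children)}


def prune(node, keep, collapse_unary=True):
    """Return a copy of `node` restricted to tips in `keep`, or None if no tip survives.

    With collapse_unary=True, internal nodes left with a single child
    (out-degree one) are replaced by that child; otherwise they are retained.
    """
    if not node["children"]:  # tip
        return leaf(node["name"]) if node["name"] in keep else None
    pruned_children = (prune(child, keep, collapse_unary) for child in node["children"])
    kept = [child for child in pruned_children if child is not None]
    if not kept:
        return None
    if len(kept) == 1 and collapse_unary:
        return kept[0]
    return {"name": node["name"], "children": kept}


def tips(node):
    """List tip names in left-to-right order."""
    if not node["children"]:
        return [node["name"]]
    return [t for child in node["children"] for t in tips(child)]


megatree = clade(clade(leaf("A"), leaf("B")), clade(leaf("C"), leaf("D")))
names = {"A", "C", "X"}
pruned = prune(megatree, names)
# Names absent from the megatree form the "list of non-matching taxa".
non_matching = sorted(names - set(tips(megatree)))
```

Here `X` does not occur in the megatree, so it ends up in `non_matching`; a grafting policy would decide what, if anything, to do with it.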

Branch lengths

Possible strategies:

  • NPRS
    • Based on input BLs
    • Default (no input BLs)
  • Node-age constrained equidistant adjustment (BLADJ)
  • refer to BL group (Congruifier)
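The idea behind node-age constrained equidistant adjustment can be sketched for a single lineage: dated nodes keep their ages, and undated nodes are spaced evenly between the nearest dated nodes above and below them. This is a deliberately simplified illustration, not the BLADJ implementation, which operates on whole trees; the function names and the example ages are made up.

```python
def interpolate_ages(ages):
    """Fill None entries by spacing them evenly between the nearest dated nodes.

    `ages` lists node ages along one lineage, root first.
    The first and last entries must be dated.
    """
    if ages[0] is None or ages[-1] is None:
        raise ValueError("first and last node must be dated")
    filled = list(ages)
    last = 0  # index of the most recent dated node
    for i in range(1, len(filled)):
        if filled[i] is None:
            continue
        gap = i - last
        step = (filled[last] - filled[i]) / gap
        # Place each undated node one step younger than the node before it.
        for k in range(1, gap):
            filled[last + k] = filled[last] - step * k
        last = i
    return filled


def branch_lengths(ages):
    """Branch length of each edge is the parent age minus the child age."""
    return [parent - child for parent, child in zip(ages, ages[1:])]


ages = interpolate_ages([100.0, None, None, 40.0, None, 0.0])
lengths = branch_lengths(ages)
```

With the made-up ages above, the two undated nodes between 100 and 40 receive ages 80 and 60, and the undated node between 40 and the tip receives age 20.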

Topology Scenario and Dependencies

Topology services depend on two other services: TNRS and the tree source. The user starts by submitting a list of names and a set of configuration elements (e.g. sources to use, knobs, etc.). TNRS returns a list of taxa and their URIs. The user then decides whether the resolved names are correct. The list of chosen taxon URIs is sent to the tree source, along with a selection of megatree sources. The tree source returns the URIs of the megatrees that are applicable to the submitted taxa. From these, the user selects the trees to use. This list of trees, together with the taxon URIs and a set of configuration instructions, is submitted to the topology module. One or more Phylotastic trees are then returned to the user, including node-by-node metadata.
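The scenario can be sketched as a chain of calls with the two user decisions supplied as callbacks. All functions, parameters, and URIs below are hypothetical stubs written for this illustration; the real services, their signatures, and their return formats are not specified on this page.

```python
def tnrs_resolve(names, sources, fuzziness=0.9):
    """Stub TNRS: maps each name string to a made-up taxon URI.

    `sources` and `fuzziness` stand in for the TNRS configuration and are ignored here.
    """
    return {name: "urn:example:taxon:" + name.lower().replace(" ", "_") for name in names}


def find_megatrees(taxon_uris, tree_sources):
    """Stub tree source: returns URIs of megatrees, never the trees themselves."""
    return [source + "/megatree/1" for source in tree_sources]


def topology(taxon_uris, megatree_uris, config):
    """Stub topology module: produces one result per selected megatree."""
    return [
        {"megatree": uri, "taxa": sorted(taxon_uris), "config": config}
        for uri in megatree_uris
    ]


def run_scenario(names, confirm_names, choose_trees):
    """Walk through the scenario; the callbacks model the two user decisions."""
    resolved = tnrs_resolve(names, sources=["example-tnrs"])
    chosen_taxa = confirm_names(resolved)    # user checks the resolved names
    candidates = find_megatrees(chosen_taxa, tree_sources=["urn:example:store"])
    selected = choose_trees(candidates)      # user picks the megatrees to use
    config = {"grafting": "basal", "pruning": "restrict_tips"}
    return topology(chosen_taxa, selected, config)


result = run_scenario(
    ["Homo sapiens", "Pan troglodytes"],
    confirm_names=lambda resolved: list(resolved.values()),
    choose_trees=lambda candidates: candidates[:1],
)
```

Only URIs move between the steps in this sketch, which matches the rule that megatrees travel by reference until they reach the topology module.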

[Diagram: image not available]