Ling-Xiao Yang edited this page Jan 20, 2016 · 5 revisions

A workflow is executed by creating and executing a WorkflowRun instance, which keeps the operation consistent with the RESTful API.

WorkflowRun Creation

A WorkflowRun is created by the following subroutines.

Validate Resource Assignments

The WorkflowRun must be created with a resource assignment dictionary that maps each InputPort URL to a list of Resource URLs. The following conditions must be satisfied:

  • all unsatisfied InputPorts must be provided with Resources.
  • there must be at most one "multiple resource collection" (an InputPort that is assigned more than one Resource).
  • each assigned Resource must be ready (i.e. it has a compat_resource_file).
  • the resource types of each InputPort must agree with the types of the Resources assigned to it.
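
The checks above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the project's actual code: the function name, the plain-dict data shapes, and the error strings are all hypothetical, and the second condition is read as "at most one InputPort may receive multiple Resources".

```python
def validate_resource_assignments(unsatisfied_ports, assignments, resources):
    """Return a list of error messages; an empty list means the input is valid.

    Hypothetical data shapes (assumptions, not the real models):
      unsatisfied_ports: dict, InputPort URL -> set of accepted resource types
      assignments:       dict, InputPort URL -> list of Resource URLs
      resources:         dict, Resource URL  -> {"type": str, "ready": bool}
    """
    errors = []

    # Every unsatisfied InputPort must be given at least one Resource.
    for port in unsatisfied_ports:
        if not assignments.get(port):
            errors.append(f"{port}: no Resources assigned")

    # Assumed reading: at most one InputPort may receive several Resources.
    multiple = [p for p, urls in assignments.items() if len(urls) > 1]
    if len(multiple) > 1:
        errors.append("more than one InputPort has multiple Resources")

    # Each Resource must exist, be ready, and match the port's accepted types.
    for port, urls in assignments.items():
        accepted = unsatisfied_ports.get(port, set())
        for url in urls:
            resource = resources.get(url)
            if resource is None:
                errors.append(f"{url}: unknown Resource")
                continue
            if not resource["ready"]:
                errors.append(f"{url}: not ready")
            if resource["type"] not in accepted:
                errors.append(f"{url}: type not accepted by {port}")
    return errors
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem with an assignment at once.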

Get "Singleton" WorkflowJobs

A "singleton" in this context is a WorkflowJob that has exactly one RunJob associated with it in a WorkflowRun. That is, it does not lie on any execution path fed by a resource_collection that has multiple Resources. The output of a singleton is reused as many times as the WorkflowRun requires.

  • singleton_workflowjobs <-- all WorkflowJobs
  • if there exists a resource_collection in Workflow.resource_collections[] where the size of resources[] > 1
    • starting at the input_port that resource_collection is assigned to, traverse down the graph (i.e. from input_port to output_port); for each node (i.e. WorkflowJob) visited, remove it from singleton_workflowjobs
  • return singleton_workflowjobs
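
As a rough illustration of this traversal, the sketch below treats the workflow as a set of job names plus (upstream, downstream) job pairs. The function name and argument shapes are hypothetical; `multi_resource_jobs` stands for the WorkflowJobs whose input_port receives the collection with multiple Resources.

```python
def get_singleton_workflowjobs(workflow_jobs, connections, multi_resource_jobs):
    """Return the set of WorkflowJobs that need exactly one RunJob.

    Hypothetical data shapes (assumptions, not the real models):
      workflow_jobs:       iterable of job names
      connections:         iterable of (upstream_job, downstream_job) pairs
      multi_resource_jobs: jobs fed directly by the multi-Resource collection
    """
    # Index the graph by upstream job so it can be walked downwards.
    downstream = {}
    for src, dst in connections:
        downstream.setdefault(src, []).append(dst)

    singletons = set(workflow_jobs)
    stack = list(multi_resource_jobs)
    while stack:
        job = stack.pop()
        # A job still in the set has not been visited yet.
        if job in singletons:
            singletons.discard(job)
            stack.extend(downstream.get(job, []))
    return singletons
```

For a workflow where A and B both feed C, C feeds D, and only A receives multiple Resources, this returns `{"B"}`: every job downstream of A has to run once per Resource.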

Get "end-point" WorkflowJobs

An "end-point" is simply a stopping point in the execution path. End-points are used as the starting place for building RunJobs.

  • endpoint_workflowjobs <-- []
  • for each WorkflowJob
    • if an output_port is not referenced in Connections
      • add WorkflowJob to endpoint_workflowjobs
  • return endpoint_workflowjobs
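
A minimal sketch of the end-point search, with hypothetical names and data shapes. It assumes that a WorkflowJob counts as an end-point as soon as any one of its output_ports is unconnected, and that every WorkflowJob has at least one output_port.

```python
def get_endpoint_workflowjobs(output_ports_by_job, connections):
    """Return the WorkflowJobs that have an unconnected output_port.

    Hypothetical data shapes (assumptions, not the real models):
      output_ports_by_job: dict, job name -> list of output_port ids
      connections:         iterable of (output_port, input_port) pairs
    """
    # Collect every output_port that some Connection reads from.
    connected = {output_port for output_port, _ in connections}
    return [
        job
        for job, ports in output_ports_by_job.items()
        if any(port not in connected for port in ports)
    ]
```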

Create RunJob

This recursively creates RunJobs. Assume the input WorkflowJob is A.

  • if WorkflowJob A has no associated RunJob in workflowjob_runjob_map
    • create RunJob A
    • create Resource entries (setting origin and resource_type accordingly) for RunJob A's outputs and add them to RunJob A's outputs[]
    • for each input_port with an associated Connection that points to the output_port of another WorkflowJob (call it B)
      • if WorkflowJob B has no associated RunJob in workflowjob_runjob_map
        • run Create RunJob on WorkflowJob B
      • get the RunJob for WorkflowJob B (call it RunJob B)
      • add an input for this input_port, using the information from the appropriate output of RunJob B
    • for each input_port with an associated resource_assignment
      • add an input for this input_port, using the information from the appropriate resource_assignment
    • add WorkflowJob A <-> RunJob A to workflowjob_runjob_map
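
The recursion can be sketched as below. This is a simplified illustration rather than the real implementation: RunJobs are plain dicts, each WorkflowJob is assumed to have a single output, and the workflow is a hypothetical dict that maps each job to its input_ports and the upstream job each port is connected to (or `None` when the port is fed by a resource assignment).

```python
def create_runjob(job, workflow, assignment, runjob_map):
    """Create (or reuse) the RunJob for `job`, creating upstream RunJobs first.

    Hypothetical data shapes (assumptions, not the real models):
      workflow:   dict, job name -> {input_port: upstream job name or None}
      assignment: dict, input_port -> Resource URL for this iteration
      runjob_map: dict, job name -> RunJob dict (the workflowjob_runjob_map)
    """
    if job in runjob_map:
        return runjob_map[job]

    # Placeholder output Resource; one output per job is a simplification.
    runjob = {"workflow_job": job, "inputs": {}, "output": f"resource-of-{job}"}

    for port, upstream in workflow[job].items():
        if upstream is not None:
            # Connected port: take the output of the upstream RunJob.
            upstream_runjob = create_runjob(upstream, workflow, assignment, runjob_map)
            runjob["inputs"][port] = upstream_runjob["output"]
        elif port in assignment:
            # Unconnected port: take the assigned Resource.
            runjob["inputs"][port] = assignment[port]

    runjob_map[job] = runjob
    return runjob
```

Calling it on an end-point fills `runjob_map` with every job on the path leading to that end-point, and a second call for the same job returns the existing entry instead of creating a new RunJob.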

Create WorkflowRun

This goes through the end-points and builds RunJobs upwards. Those WorkflowJobs that are "singletons" have references kept for them so they are not recreated.

The outer loop iterates over a resource_assignment that has multiple Resources.

  • endpoint_workflowjobs <-- Get "End-point" WorkflowJobs
  • singleton_workflowjobs <-- Get "Singleton" WorkflowJobs
  • workflowjob_runjob_map <-- []
  • if there exists a resource_collection in Workflow.resource_collections[] where the size of resources[] > 1, execute the following loop once per Resource in that collection, using a different Resource at each iteration; otherwise, execute the loop body once
    • for each WorkflowJob in endpoint_workflowjobs
      • run Create RunJob on WorkflowJob
    • clear workflowjob_runjob_map of those entries associated with WorkflowJobs that do not exist in singleton_workflowjobs
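
Putting the pieces together, the outer loop might look like the following sketch. It is self-contained and therefore carries its own simplified Create RunJob step. All names and data shapes are hypothetical; the end-points and singletons are passed in precomputed, and the assignments are assumed to have been validated already.

```python
from itertools import count


def create_workflow_run(workflow, endpoints, singletons, assignments):
    """Build all RunJobs for one WorkflowRun and return them in creation order.

    Hypothetical data shapes (assumptions, not the real models):
      workflow:    dict, job name -> {input_port: upstream job name or None}
      endpoints:   list of end-point job names
      singletons:  set of singleton job names
      assignments: dict, input_port -> non-empty list of Resource URLs
    """
    ids = count(1)
    runjob_map = {}   # plays the role of workflowjob_runjob_map
    all_runjobs = []

    def create_runjob(job, assignment):
        if job in runjob_map:
            return runjob_map[job]
        runjob = {"workflow_job": job, "inputs": {},
                  "output": f"{job}-output-{next(ids)}"}
        for port, upstream in workflow[job].items():
            if upstream is not None:
                runjob["inputs"][port] = create_runjob(upstream, assignment)["output"]
            elif port in assignment:
                runjob["inputs"][port] = assignment[port]
        runjob_map[job] = runjob
        all_runjobs.append(runjob)
        return runjob

    # Find the (at most one) port that was assigned several Resources.
    multi_port = next((p for p, urls in assignments.items() if len(urls) > 1), None)
    fixed = {p: urls[0] for p, urls in assignments.items() if p != multi_port}
    if multi_port is None:
        iterations = [{}]  # no multi-Resource port: run the loop body once
    else:
        iterations = [{multi_port: url} for url in assignments[multi_port]]

    for extra in iterations:
        assignment = {**fixed, **extra}
        for job in endpoints:
            create_runjob(job, assignment)
        # Keep only singletons so that other jobs are recreated next time.
        for job in list(runjob_map):
            if job not in singletons:
                del runjob_map[job]
    return all_runjobs
```

With a workflow in which A (fed two Resources) and S (fed one) both feed the end-point C, this produces two RunJobs each for A and C but only one for S, whose output is shared by both C RunJobs.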