Relevant Papers

Applications

Summary

Details

Workflow Specification
- directed graph similar to flow networks
-- nodes indicate processes and have unique ids
-- single source (start process) and single sink (final process)
-- directed edge indicates that in a given run, destination node process
   is executed only after source node process
-- loops - dotted backarrow from one node to another
-- fork - dotted oblong surrounding a subgraph - can fork multiple runs 
   of the same subgragh to be run in parallel
-- multiple outgoing edges - non-exclusive choice

Run - Execution Instance
- A non-deterministic recursive function called an 'execution function' generates individual runs from specifications
- In a run, loops are unrolled and the number of fork (parallel) executions 
  is given explicitly.
- Even if the specification has cycles, a valid 
  run is always acyclic, since we unfold the cycles in the 
  specification to capture the sequential order of all iterations in 
  a workflow run. Consequently, the node labels in a run R are 
  not necessarily unique.
- So annotate each node with an instance id to make labels unique
Comparing Runs
- find edit distance and a path edit script (consisting several edit operations)
  between two valid runs of the same specification.
- edit operations: path insertion, path deletion
- cost model: 
-- cost of edit script = sum of cost of individual operations.
-- cost of individual operation can be determined by application
--- can depend on both the length of the path being edited and the labels on either side of the path
--- must be a 'distance metric'

- similar to computing edit distance between ordered trees (polynomial), 
  unordered (NP hard), graphs (NP hard)
- restrict to series-parallel graphs - SP Graphs

- what are SP Graphs?
-- directed multigraph, single source, single sink
-- recursive definition:
--- basic SPG: one source, one sink
--- series composition: sink of one becomes source of another
--- parallel composition: sources of both merged, so are sinks
- more restrictions
-- SP Graph 'structured' with well-nested forks and loops
-- more details about what well-nested forks and loops are (not particularly interesting)
--- define 'laminar family': collection of subsets such that for any pair
    of subsets, one is a subset of the other, or they are disjoint
--- SP graph specification is (G, F), F is a laminar family of series subgraphs of G
    for allowed fork executions

- can construct SP Tree specification from SP Graph (G, F)
- another execution function to generate SP Tree run from SP Tree specification
- Provide algorithm to compute edit distance and edit script between two SP Trees