- directed graph similar to flow networks -- nodes indicate processes and have unique ids -- single source (start process) and single sink (final process) -- directed edge indicates that in a given run, destination node process is executed only after source node process -- loops - dotted backarrow from one node to another -- fork - dotted oblong surrounding a subgraph - can fork multiple runs of the same subgragh to be run in parallel -- multiple outgoing edges - non-exclusive choice
- A non-deterministic recursive function called an 'execution function' generates individual runs from specifications - In a run, loops are unrolled and the number of fork (parallel) executions is given explicitly. - Even if the specification has cycles, a valid run is always acyclic, since we unfold the cycles in the specification to capture the sequential order of all iterations in a workflow run. Consequently, the node labels in a run R are not necessarily unique. - So annotate each node with an instance id to make labels unique
- find edit distance and a path edit script (consisting several edit operations) between two valid runs of the same specification. - edit operations: path insertion, path deletion - cost model: -- cost of edit script = sum of cost of individual operations. -- cost of individual operation can be determined by application --- can depend on both the length of the path being edited and the labels on either side of the path --- must be a 'distance metric' - similar to computing edit distance between ordered trees (polynomial), unordered (NP hard), graphs (NP hard) - restrict to series-parallel graphs - SP Graphs - what are SP Graphs? -- directed multigraph, single source, single sink -- recursive definition: --- basic SPG: one source, one sink --- series composition: sink of one becomes source of another --- parallel composition: sources of both merged, so are sinks - more restrictions -- SP Graph 'structured' with well-nested forks and loops -- more details about what well-nested forks and loops are (not particularly interesting) --- define 'laminar family': collection of subsets such that for any pair of subsets, one is a subset of the other, or they are disjoint --- SP graph specification is (G, F), F is a laminar family of series subgraphs of G for allowed fork executions - can construct SP Tree specification from SP Graph (G, F) - another execution function to generate SP Tree run from SP Tree specification - Provide algorithm to compute edit distance and edit script between two SP Trees