===Provenance Collection Support in the Kepler Scientific Workflow System===

Link: [[http://www.ipaw.info/ipaw06/proceedings/CameraReady_s5_2.pdf|Provenance Collection Support in the Kepler Scientific Workflow System]]

The focus of this paper is on efficiently rerunning workflows by using results from previous runs.

Kepler is a system for specifying and running workflows. A workflow consists of actors (transformations) connected by directed edges. Kepler provides a GUI with draggable elements that makes it easy to construct workflows from a library of actors. When a workflow is run, intermediate results are passed between actors. Provenance in Kepler consists of these intermediate results.

A workflow designer may evolve the workflow over time, perhaps by changing the parameters of actors, adding or removing actors, or changing how actors are connected. When a new version of a workflow is run, we would like to execute it efficiently by reusing the provenance from previous runs: an actor whose parameters and inputs are unchanged does not need to be executed again.

One subtlety is that some actors are non-cacheable. An example is an actor that downloads data from a remote database: it does not necessarily return the same result when rerun with the same parameters. When doing a "smart rerun", we can either re-execute the non-cacheable actor or reuse its stored result, depending on how important freshness is.
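
The smart-rerun idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Kepler's API or the paper's algorithm: the names ''Actor'', ''cache_key'', ''smart_rerun'' and ''require_fresh'' are hypothetical, actors are assumed to be listed in topological order, and the provenance store is modeled as a plain dictionary keyed by a hash of the actor's name, parameters, and input values.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Actor:
    """Hypothetical stand-in for a workflow actor (not a Kepler class)."""
    name: str
    func: Callable[..., Any]
    inputs: List[str] = field(default_factory=list)    # names of upstream actors
    params: Dict[str, Any] = field(default_factory=dict)
    cacheable: bool = True


def cache_key(actor: Actor, input_values: List[Any]) -> str:
    """Hash the actor's name, parameters, and inputs into a lookup key."""
    payload = json.dumps([actor.name, actor.params, input_values],
                         sort_keys=True, default=repr)
    return hashlib.sha256(payload.encode()).hexdigest()


def smart_rerun(actors: List[Actor], cache: Dict[str, Any],
                require_fresh: bool = True) -> Tuple[Dict[str, Any], List[str]]:
    """Run actors in the given (topological) order, reusing stored results.

    Returns the result of every actor and the names of the actors that
    were actually executed in this run.
    """
    results: Dict[str, Any] = {}
    executed: List[str] = []
    for actor in actors:
        input_values = [results[name] for name in actor.inputs]
        key = cache_key(actor, input_values)
        # A non-cacheable actor may only reuse a stored result when
        # freshness is not required.
        may_reuse = actor.cacheable or not require_fresh
        if may_reuse and key in cache:
            results[actor.name] = cache[key]
        else:
            results[actor.name] = actor.func(*input_values, **actor.params)
            cache[key] = results[actor.name]
            executed.append(actor.name)
    return results, executed
```

Because the key includes the input values, a downstream actor is re-executed automatically whenever a re-executed non-cacheable actor upstream returns different data, and is served from the store when the data is unchanged. Changing a parameter likewise changes the key, so only the modified actor and the actors affected by its new output run again.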