===Data lineage model for Taverna workflows with lightweight annotation requirements===

Link: [[http://www.sci.utah.edu/ipaw2008/papers/Paper%207/ipaw08.pdf|Data lineage model for Taverna workflows with lightweight annotation requirements]]

This paper focuses on annotating workflows so that the lineage they yield is more precise and less noisy. The provenance of a workflow data product can always be reconstructed from the workflow execution trace. However, the paper highlights two problems with this default provenance:

1) Lack of precision: Most transformations in a workflow are assumed to be 'black boxes', meaning that they do not record fine-grained provenance themselves. By default, all we can say about an output item is that its lineage consists of every item in the input data set.

2) Noise: Some transformations in a workflow may not matter from a provenance standpoint. An example of a transformation we may want to ignore is a 'string-to-int type-conversion' transformation.

The paper gives two ways to annotate workflows to make the lineage more useful:

1) Instance-level lineage: Annotate a transformation so that each output item gets more specific lineage. One-to-one transformations and aggregations are the two classes of transformations for which the user can specify this.

2) Ignore transformations: Mark transformations as 'insignificant' so that they are left out of the lineage.

Finally, the paper notes that if a transformation is deterministic, its lineage can always be traced lazily.
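
The two kinds of annotation summarized above can be illustrated with a small Python sketch. This is not Taverna's API or the paper's algorithm. The `Step` class, the `kind` and `significant` fields, and `run_with_lineage` are hypothetical names invented for this note, and the workflow is simplified to a linear pipeline over a list of items. The sketch only shows how a one-to-one annotation narrows per-item lineage and how an 'insignificant' flag drops a step from the reported lineage.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple


@dataclass
class Step:
    """One workflow transformation plus its (hypothetical) lineage annotations."""
    name: str
    func: Callable[[list], list]
    kind: str = "black_box"    # "black_box", "one_to_one", or "aggregation"
    significant: bool = True   # False: leave this step out of the reported lineage


def input_deps(step: Step, out_index: int, n_inputs: int) -> Set[int]:
    """Positions in the step's input list that output item `out_index` depends on."""
    if step.kind == "one_to_one":
        return {out_index}
    # Aggregation (stated by the user) and black box (conservative fallback)
    # both map an output item to every input item in this simplified sketch.
    return set(range(n_inputs))


def run_with_lineage(
    steps: List[Step], inputs: list
) -> Tuple[list, List[Set[int]], List[str]]:
    """Run the pipeline; return outputs, per-output lineage, and reported steps.

    Lineage is expressed as the set of positions in the ORIGINAL input list
    that each final output item depends on.
    """
    data = list(inputs)
    lineage = [{i} for i in range(len(data))]
    reported = []
    for step in steps:
        out = step.func(data)
        if step.kind == "one_to_one" and len(out) != len(data):
            raise ValueError(f"{step.name} is annotated one_to_one but changed the item count")
        # Compose this step's dependencies with the lineage accumulated so far.
        lineage = [
            set().union(*(lineage[d] for d in input_deps(step, j, len(data))))
            for j in range(len(out))
        ]
        data = out
        if step.significant:
            reported.append(step.name)
    return data, lineage, reported


def to_int(xs):
    return [int(x) for x in xs]


def square(xs):
    return [x * x for x in xs]


# Annotated workflow: both steps are one-to-one, and the type conversion is ignored.
annotated = [
    Step("parse_int", to_int, kind="one_to_one", significant=False),
    Step("square", square, kind="one_to_one"),
]
outputs, lineage, reported = run_with_lineage(annotated, ["3", "1", "2"])

# Same functions without annotations: every step is treated as a black box.
unannotated = [Step("parse_int", to_int), Step("square", square)]
_, coarse_lineage, coarse_reported = run_with_lineage(unannotated, ["3", "1", "2"])
```

With the annotations, each output item traces back to exactly one input item and only `square` appears in the reported lineage. Without them, every output item is attributed to all three inputs and the type conversion shows up as a lineage step.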