Efficient Provenance Storage (SIGMOD 2008) – Adriane Chapman, H.V. Jagadish, Prakash Ramanan
In all of these systems, the provenance can grow to many times the size of the base data, because provenance is recorded at a very fine granularity and a large number of operations are applied to the base data, among other reasons.
Here the edges, read bottom up, represent the workflow. The example says that a scientist first annotated the PubMed article numbered 16524875 via the CurateHPRD manipulation to produce input data I. This data was then mapped using the schema mapping MHPRD to fit the scientist’s own schema, yielding output data I’. From I’, information about the molecule named ABC1, with ID 095477, was collected.
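The chain described above can be sketched as a small linked structure. This is only an illustration of the example, not the storage scheme from the paper: the `ProvNode` class, its field names, and the `lineage` helper are hypothetical. The notes do not name the manipulation for the final collection step, so it is left unset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProvNode:
    """One data item in a provenance chain (hypothetical structure)."""
    label: str                           # the data item or original source
    manipulation: Optional[str] = None   # operation that derived it from its parent
    parent: Optional["ProvNode"] = None  # the item it was derived from


def lineage(node: Optional[ProvNode]) -> List[ProvNode]:
    """Walk from an item back to its original source."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


# The example from the notes, built in workflow order.
pubmed = ProvNode("PubMed article 16524875")
data_i = ProvNode("I", manipulation="CurateHPRD", parent=pubmed)
data_i_prime = ProvNode("I'", manipulation="MHPRD", parent=data_i)
# The manipulation for this last step is not named in the notes.
molecule = ProvNode("ABC1 (ID 095477)", parent=data_i_prime)

# Reversing the lineage gives the workflow order (source first).
for step in reversed(lineage(molecule)):
    via = f" (via {step.manipulation})" if step.manipulation else ""
    print(step.label + via)
```

Even in this toy form, every derived item carries its own record of how it was produced, which hints at why fine-grained provenance can outgrow the base data.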