Link to paper: http://www.eecs.harvard.edu/~syrah/pass/pubs/ipaw08.pdf

This is where PASSv2 starts

Provenance Data (at least... from the PASS perspective)

⁃	e.g. "siblings" -> what does this mean? it's unclear. two provenance objects could have the same parent, but different grandparents
⁃	could also have annotations
⁃	deciding what kinds of relationships to detect and label depends on goals an implementation
⁃	most people care only about ancestry, but we should support direct label capturing
⁃	need to follow paths whose structure is not known beforehand (e.g. "find me all the descendants of X)
⁃	need to support matching for regular expressions on path names (especially if we're allowing multi-layer provenance, where attribute naming conventions are not standardized)
⁃	need to support aggregation (i.e. find me all objects with at least 10 ancestors)
⁃	subqueries (find me the objects in graph A that mach those in graph B that are subject to these constraints)
* semi-structured data is a natural fit

Why Other Models Don’t Work

⁃	nodes and edges; get paths from crossing edge-list with itself multiple times
⁃	unwieldy, although possible
⁃	no way of comparing if two paths are the same, or if two paths lead to the same object
⁃	XML is hierarchical, but follows a tree model (i.e. no diamonds); vanilla implementation is unsuitable
⁃	XQuery path always holds the value of the object at the rightmost end; have to truncate to get intermediate results
⁃	XQuery paths are lists of nodes rather than lists of edges
⁃	possible to hack around until it works, but...
⁃	Resource Description Framework -> directed, labeled graphs
⁃	In theory, a perfect fit
⁃	but SPARQL has several shortcomings
⁃	doesn't support sub-queries
⁃	doesn't support constraints on path expressions
⁃	doesn't support path expressions of arbitrary length
⁃	... and more
⁃	several extensions for SPARQL (almost anything you can imagine, if I recall correctly)
⁃	but it's a pain to get all of them to work at once
⁃	can workaround, and may one day be suitable; current workarounds:
⁃	explicitly code the end-node in the query so you don't have to do a path search
⁃	build a full path by repetition (i.e. to find great grandparents, find all parents of x, all parents of parents of x, all parents of parents of parents of x).

PQL

 
panda/reading/passquerymodel.txt · Last modified: 2009/12/01 17:56 by reader
 
Recent changes RSS feed Creative Commons License Donate Powered by PHP Valid XHTML 1.0 Valid CSS Driven by DokuWiki