Gregory Todd Williams SPARQL BGP Optimization Presentation Jesse Weaver
- Question is for the Presentation: Gregory Todd Williams SPARQL BGP Optimization Presentation
- Question is asked by: Jesse Weaver
- The Question is: What is gained (in terms of query optimization) by creating "joined triple patterns" for triple patterns that share bound components? (See question page for more details.)
- Answer: I'm not sure there is anything to be gained here. Joining two triple patterns based solely on a shared bound term turns out to be a Cartesian product, since the join occurs on the variable bindings of the triple patterns, not on the triple patterns themselves. The only possible benefit I can think of depends on knowing the underlying implementation: such a join might matter for data locality (for example, if all triples with predicate rdf:type are stored together, performing this join may be able to use disk and/or processor caches).
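The Cartesian-product behaviour described in the answer can be illustrated with a minimal sketch. The toy triples and the helper names `match` and `join` are assumptions made for illustration only; they do not come from the paper or from any particular SPARQL engine.

```python
from itertools import product

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "rdf:type", "Student"),
    ("bob", "rdf:type", "Student"),
    ("mit", "rdf:type", "University"),
    ("rpi", "rdf:type", "University"),
    ("alice", "memberOf", "mit"),
]


def is_var(term):
    """Variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")


def match(pattern, triples):
    """Return one {variable: value} binding per triple matching the pattern."""
    results = []
    for triple in triples:
        binding = {}
        for p, t in zip(pattern, triple):
            if is_var(p):
                if binding.get(p, t) != t:
                    break  # same variable bound to two different values
                binding[p] = t
            elif p != t:
                break  # bound term does not match
        else:
            results.append(binding)
    return results


def join(left, right):
    """Natural join: keep pairs that agree on every shared variable."""
    joined = []
    for l, r in product(left, right):
        if all(l[v] == r[v] for v in l.keys() & r.keys()):
            joined.append({**l, **r})
    return joined


students = match(("?x", "rdf:type", "Student"), TRIPLES)
universities = match(("?y", "rdf:type", "University"), TRIPLES)
members = match(("?x", "memberOf", "?y"), TRIPLES)

# The two patterns share only the bound predicate rdf:type, so there is
# no shared variable to filter on and every pair survives the join.
bound_only = join(students, universities)

# These two patterns share the variable ?x, so the join is selective.
shared_var = join(students, members)

print(len(students), len(universities), len(bound_only), len(shared_var))
```

In this sketch the bound term has already done its filtering inside `match`, so it contributes nothing to the join itself: `bound_only` has exactly `len(students) * len(universities)` rows.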
Section 3.1 states:
"A triple pattern pair shares at least one bound or unbound component. The subject, predicate, and object are the components of a triple (pattern) [9]. In this paper, we refer to triple pattern pairs as joined triple patterns. Hence, two triple patterns with a shared variable are joined as well as two triple pattern with, for instance, the same subject URI."
What is gained (in terms of query optimization) by creating "joined triple patterns" for triple patterns that share bound components? Section 6.1 states (concerning LUBM query 2):
"If the BGP graph abstraction would not define a join for bound predicates, the algorithm would guarantee to not choose execution plans with Cartesian products as intermediate result sets. It was a design choice to define a join for every bound or unbound pair of triple pattern components to be as general as possible. In future, we might choose to ignore bound predicate joins directly in the BGP abstraction process."
Clearly there is a disadvantage, but the only advantage mentioned is "to be as general as possible." What does this generality gain us?
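The trade-off in the Section 6.1 quote can be sketched by building the join edges of a BGP twice, once counting shared bound components and once counting shared variables only. This is a rough illustration under stated assumptions, not the paper's algorithm: the function names, the `include_bound` flag, the three example patterns, and the choice to treat a term shared in any position as a shared component are all hypothetical.

```python
from itertools import combinations


def is_var(term):
    """Variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")


def shared_components(a, b, include_bound=True):
    """Terms that occur in both triple patterns (any position; an assumption)."""
    shared = set(a) & set(b)
    if not include_bound:
        shared = {t for t in shared if is_var(t)}
    return shared


def join_edges(patterns, include_bound=True):
    """Index pairs of triple patterns that count as 'joined'."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(patterns), 2)
        if shared_components(a, b, include_bound)
    ]


# Illustrative patterns only; the ub: names are placeholders.
BGP = [
    ("?x", "rdf:type", "ub:GraduateStudent"),
    ("?y", "rdf:type", "ub:University"),
    ("?x", "ub:memberOf", "?y"),
]

general = join_edges(BGP, include_bound=True)
variables_only = join_edges(BGP, include_bound=False)

# Edges that exist only because of a shared bound term. Following an
# edge like this first yields a Cartesian product of the two patterns.
bound_only_edges = sorted(set(general) - set(variables_only))

print(general, variables_only, bound_only_edges)
```

Here the only edge gained by the more general definition is the one between the two rdf:type patterns, which is exactly the edge an optimizer would want to avoid following first.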

