Gregory Todd Williams SPARQL BGP Optimization Presentation
From Semantic Portal Wiki
Presentation given at CSCI 6966 Advanced Semantic Web (Fall 2008) - Lesson 5
- Speaker: Gregory Todd Williams
- Title: SPARQL Basic Graph Pattern Optimization Using Selectivity Estimation
- Authors: Markus Stocker, Andy Seaborne, Abraham Bernstein, Christoph Kiefer, Dave Reynolds
- Conference: WWW2008
- URL: http://www2008.org/papers/pdf/p595-stocker1.pdf
- Date of Presentation: 2008/09/25
Questions
| ID | Question | Name | Answer |
|---|---|---|---|
| Greg Ankesh 1 |
|
Ankesh Khandelwal | |
| Gregory Todd Williams SPARQL BGP Optimization Presentation Jesse Weaver | What is gained (in terms of query optimization) by creating "joined triple patterns" for triple patterns that share bound components? (See question page for more details.) | Jesse Weaver | |
| Gregory Todd Williams SPARQL BGP Optimization Presentation Jesse Weaver 2 | Why are the selected queries inconsistent between figures/tables? (See question page for more details.) | Jesse Weaver | |
| Gregory Todd Williams SPARQL BGP Optimization Presentation Joshua Shinavier 1 | What is it about the presented approach which makes it appropriate only for in-memory RDF repositories? I don't understand why the same arguments and techniques would not apply to disk-based stores. | Joshua Shinavier | |
| Gregory Todd Williams SPARQL BGP Optimization Presentation Joshua Taylor 1 | It seems that 3.2 Optimization Algorithm is simply building a minimum spanning tree, where edges are selected based are weighted based on estimated selectivity (which aren't discussed until later). Is this correct, or is there more to it? | Joshua A. Taylor | |
| Gregory Todd Williams SPARQL BGP Optimization Presentation Joshua Taylor 2 | What purpose does the graph representation (as presented in 3.1 Preliminaries and 3.2 BGP Abstraction serve? Its used, to some extent in 3.2 Optimization Algorithm, but it seems that two pages exposit a representation that isn't really exploited afterward. | Joshua A. Taylor | |
| Gregory Todd Williams SPARQL BGP Optimization Presentation Joshua Taylor 3 | The authors state in 1. INTRODUCTION that "We focus on the evaluation of the presented optimization techniques without comparing the figures with the performance of alternative implementations." What value then ought we to ascribe to the work the authors performed? If I were to start with an exponential-time algorithm, and invent an optimization that yields an polynomial-time algorithm, and it will seem impressive, but if someone else has recognized that the problem can be solved in linear or constant time, it would be somewhat suspicious if I was not "comparing the figures with the performance of 'alternative implementations'" (single/scare quotes added for emphasis). | Joshua A. Taylor | |
| Gregory Todd Williams SPARQL BGP Optimization Presentation Shangguan 1 | In section 4.1, it is stated that "Triple pattern components may be bound (i.e. concrete)" or unbound (i.e. variable)... The case of unbound components is trivial as they do not affect the required statistics." Why? | Zhenning Shangguan | |
| Gregory Todd Williams SPARQL BGP Optimization Presentation Shangguan 2 | In section 5.1, the author argues that "the selectivity of a triple pattern is estimated by the formula sel(t)=sel(s)*sel(p)*sel(o)...", and later he states that "...Note that this formulationonly approximates sel(t) as it implicitly assumes that sel(s), sel(p), and sel(o) are statistically independent, which they will not be in most cases." What's the point of saying this? If in practical the assumption of statistical independence between subject, predicate, and object does not always hold, does it mean that this fomula is of little use? And also, what about another formula of selectivity estimation which looks like this: sel(t)=c1*sel(s)+c2*sel(p)+c3*sel(o), where c1-c3 help to normalize sel(t) to 0-1? | Zhenning Shangguan | |
| Gregory Todd Williams SPARQL BGP Optimization Presentation Xixi Luo 1 | "4.3 Summary Features: the mumber of entries in the summary is a quadratic funtion of the number of distinct predicates f(n)=4n_2". why? | Xixi Luo | |
| SPARQL BGP opt Medha Q 1 | What is the size of the typical join pattern graph used for evaluation of the system? When authors mention that query 9 could not be completed for plan space evaluation, audience is curious to know how it looked. (See more comments). | Medha Atre |
Attendees
Facts about Gregory Todd Williams SPARQL BGP Optimization PresentationRDF feed
| A | Presentation +, and Presentation attended by Tim Lebo + |
| Conference | WWW2008 + |
| Date | 25 September 2008 + |
| Given at | CSCI 6966 Advanced Semantic Web (Fall 2008) - Lesson 5 + |
| Paper has author | Markus Stocker +, Andy Seaborne +, Abraham Bernstein +, Christoph Kiefer +, and Dave Reynolds + |
| Speaker | Gregory Todd Williams + |
| Title of paper | SPARQL Basic Graph Pattern Optimization Using Selectivity Estimation + |
| Url | http://www2008.org/papers/pdf/p595-stocker1.pdf + |

