Jesse Weaver RDF Management Approaches Gregory Todd Williams 2
From Semantic Portal Wiki
- Question is for the Presentation: Jesse Weaver RDF Management Approaches
- Question is asked by: Gregory Todd Williams
- The Question is: In discussing query Q2 (and again for subsequent queries), it is noted that MonetDB chooses inefficient query execution plans "that mostly use fetch joins, involving merge joins only in a few cases." Based on this, the conclusion goes on to say "relational optimizers may have problems to cope with the specific challenges that arise in the context of RDF." Are we to assume that MonetDB's optimizer performs as well as (or better than) other relational optimizers (including those from industrial implementations such as Oracle and Sybase)? Since the paper purportedly attempts to evaluate storage schemes for RDF, wouldn't a poor optimizer be an orthogonal issue, and shouldn't a proper evaluation of the storage schemes be based on the best QEP available for each query?
The Answer from Michael Schmidt
We agree that an evaluation based on (hand-optimized) QEPs would be the best way to carry out a comparison in the spirit of the ideas presented in [1]. However, that work suggests, to a certain degree, that an efficient RDF storage approach is easy to implement on top of a column-store DBMS. In the absence of (column-store) alternatives we had to choose MonetDB, which performs suboptimally for certain classes of queries, e.g. it does not exploit merge joins in all cases. Optimizing by hand was not possible within reasonable time, not least because we are not MonetDB experts (note that MonetDB does not employ standard optimization techniques, but relies on complex algebraic optimizations on top of so-called Binary Association Tables (BATs)), so we ultimately had to live with the QEPs produced by the MonetDB optimizer. But again, we agree that a closer investigation with manually produced QEPs would be very interesting.
We also want to note that we tested other (non-column-store) database systems (e.g. PostgreSQL and a commercial system), which were typically (that is, for most queries) significantly slower than MonetDB. Because of space limitations, we did not discuss these engines in the paper. Finally, it is worth mentioning that some DBMSs do not even implement merge joins, which directly disqualifies them.
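To illustrate why merge joins matter for this kind of evaluation, here is a minimal sketch of a merge join over two property tables that are both sorted by subject id, as they might be in a vertically partitioned RDF layout. The table layout, the name `merge_join`, and the sample data are assumptions made for illustration; this is not MonetDB's implementation and does not reproduce any query from the paper.

```python
from typing import Iterator, List, Tuple

# Hypothetical layout: one two-column table per RDF property,
# holding (subject id, object value) rows sorted by subject id.
Row = Tuple[int, str]


def merge_join(left: List[Row], right: List[Row]) -> Iterator[Tuple[int, str, str]]:
    """Join two subject-sorted tables in a single synchronized pass."""
    i = j = 0
    while i < len(left) and j < len(right):
        ls, rs = left[i][0], right[j][0]
        if ls < rs:
            i += 1          # left subject has no partner yet, advance left
        elif ls > rs:
            j += 1          # right subject has no partner yet, advance right
        else:
            # Locate the run of right rows that share this subject id.
            j_end = j
            while j_end < len(right) and right[j_end][0] == ls:
                j_end += 1
            # Pair every left row of the run with every right row of the run.
            while i < len(left) and left[i][0] == ls:
                for k in range(j, j_end):
                    yield (ls, left[i][1], right[k][1])
                i += 1
            j = j_end


# Made-up sample data for two properties.
title = [(1, "A"), (2, "B"), (4, "D")]
year = [(1, "1999"), (3, "2001"), (4, "2005"), (4, "2006")]
result = list(merge_join(title, year))
```

Because both inputs are already sorted on the join key, each table is scanned once, without a per-row lookup into the other table. A plan that does not take advantage of the sort order gives up this property, which is why the choice of join operator can dominate the measured cost of a storage scheme.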

