James Michaelis Provenance Model
Presentation given at CSCI 6966 Advanced Semantic Web (Fall 2008) - Lesson 6
- Speaker: James Michaelis
- Title: The Open Provenance Model
- Authors: Luc Moreau, Juliana Freire, Joe Futrelle, Robert E. McGrath, Jim Myers, Patrick Paulson
- Conference:
- URL: http://eprints.ecs.soton.ac.uk/14979/1/opm.pdf
- Date of Presentation: 2008/10/02
Questions
| ID | Question | Name | Answer |
|---|---|---|---|
| James Michaelis Provenance Model Gregory Todd Williams 1 | Why does the OPM allow for distinct alternative accounts of provenance? Given that the provenance data is meant to describe what actually occurred to produce the data, why are two alternate accounts a sensible modeling? I would think the example in Figure 5 would be better modeled as representing "add1ToAll" as being comprised of the cons/split provenance DAG, since they represent the exact same provenance, but at different levels of granularity. Is there an example where this wouldn't make sense? The so-called "alternatives" in Figure 16 might just as well be described as being two parts of the same provenance chain ("Execute program" triggers a procedure "Produce Sky Mosaic", etc.). However, this brings up a question as to whether processes and artifacts are disjoint. In Figure 16, "Enactor Executable" is shown as an artifact, but is presumably also a process (perhaps the same process as "Produce Sky Mosaic"). How can these two paths through the provenance DAG be sensibly interpreted as alternative causality chains? Are we to conclude that the "Execute Program" alternative does not have a causal dependence on the "FITS DataSet" artifact, even though it exists in our provenance data? | Gregory Todd Williams | |
| James Michaelis Provenance Model Jesse Weaver | On page 5, Definition 8 states: "An edge 'was derived from' between two artifacts A1 and A2 indicates that artifact A1 may have been used by a process that derived A2." This definition seems surprisingly weak. The phrase "A1 may have been used by a process that derived A2" seems to indicate that A1 may not have been used by a process that derived A2. Therefore, the edge "was derived from" is not necessarily causal. Upon further investigation, it seems that the wording of the definition may have been chosen to satisfy inference 5 on page 17 to cover the case in which A1=A2 and, indeed, there may be no process in the derivation. However, wouldn't a more precise wording of Definition 8 be more appropriate? Perhaps something like: An edge "was derived from" between two artifacts A1 and A2 indicates that artifact A1 was used by a process that derived A2, or that A1 and A2 are the same artifact. | Jesse Weaver | |
| James Michaelis Provenance Model Joshua Shinavier 1 | More of a comment than a question, but as a formal description of the Open Provenance Model, this paper contains a lot of material which seems unclear, unnecessary or inconsistent. For example, the "rules" of the Provenance Graph definition: in rules 2 through 5, what does it mean to "list the accounts" an item belongs to? Rule 12 (union and intersection of legal account views) is redundant as it follows from previous rules. Does rule 15 imply that an OPM graph with only one account view cannot be a provenance graph (because that would require a legal pair of views)? Rule 17 (provenance graphs do not need time annotations) seems superfluous as it has already been stated that time information is optional. Section 5 (Timeless Formal Model) is similar. In rule 8, what does the union or intersection of two graphs mean, when a graph is defined as a tuple, not as a set? Apart from the style of the paper, the fact that the authors do not try to explain why this particular model was chosen, what it is good for, is frustrating. | Joshua Shinavier | |
| James Michaelis Provenance Model Joshua Taylor 1 | I recognize that this is a more formal technical report, and not a conference paper. As such, it is fitting that the authors jump almost immediately into their formalism. I am not particularly familiar with the general or particular needs of provenance tracking systems. What are some such needs? Does the authors' system address these? While they have certainly addressed some of their desiderata from 1 Introduction, it is not clear that they "allow provenance information to be exchanged between systems, by means of a compatibility layer based on a shared provenance model." The authors provide a model that anyone can adopt, but is it a model that matches existing systems? | Joshua A. Taylor | |
| James Michaelis Provenance Model Joshua Taylor 2 | Inference rule (1) from 6.1 One Step Inferences would seem to have some consequences that are, at least to me, unintuitive. For instance, it is easy to imagine extending Figure 2 (which depicts John controlling the process Bake, which uses a number of ingredients and produces Cake) by adding a Make Fork process that is controlled by Factory Worker, uses Steel and Fork Mold, and produces Fork. Then, Susan might control an Eat Cake process that uses Cake and Fork, and produces No Cake (or perhaps, Empty Plate). It seems that Inference Rule (1) would allow us to conclude that Eat Cake was triggered by Bake, and that Eat Cake was triggered by Make Fork. Recall Definition 7 (Process Triggered by Process): A connection of a process P2 to a process P1 by a "was triggered by" edge indicates that the start of process P1 was required for P2 to be able to complete. Are there any provisions for multiple possibilities? E.g., Eat Cake does not really depend on a particular instance of Make Fork; just about any Make Fork would do. Could this formalism handle something like recognizing that a codex was written over at some time between 1237 and 1342 without knowing by whom and exactly when? Could it be extended to handle such cases? | Joshua A. Taylor | |
| Moreau2007open question 1 by lebo | It is clear that the intent of this paper is to introduce the start of a common model for provenance and NOT to motivate the use of provenance systems. The need for a common model for any mutual interest is self-evident. However, some motivation for the use of provenance would be helpful. | Tim Lebo | |
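
The cake-and-fork scenario in Joshua Taylor's second question can be made concrete with a small sketch. It assumes the reading of inference rule (1) that the question itself uses: if a process P2 used an artifact that was generated by a process P1, then P2 "was triggered by" P1. The edge sets, the function name `infer_triggered_by`, and the tuple layout are illustrative assumptions, not notation from the OPM paper, and Bake's unnamed ingredients are left out.

```python
# Hypothetical encoding of the extended Figure 2 example from the question.
# "used" edges are (process, artifact); "was generated by" edges are
# (artifact, process). This layout is an assumption made for illustration.
used = {
    ("Make Fork", "Steel"),
    ("Make Fork", "Fork Mold"),
    ("Eat Cake", "Cake"),
    ("Eat Cake", "Fork"),
}
was_generated_by = {
    ("Cake", "Bake"),
    ("Fork", "Make Fork"),
    ("Empty Plate", "Eat Cake"),
}


def infer_triggered_by(used_edges, generated_edges):
    """Return (P2, P1) pairs where P2 used an artifact generated by P1."""
    # Index each artifact by the processes that generated it.
    producers = {}
    for artifact, process in generated_edges:
        producers.setdefault(artifact, set()).add(process)
    # Pair every consumer of an artifact with every producer of it.
    return {
        (p2, p1)
        for p2, artifact in used_edges
        for p1 in producers.get(artifact, ())
    }


triggered = infer_triggered_by(used, was_generated_by)
for p2, p1 in sorted(triggered):
    print(f"{p2} was triggered by {p1}")
```

Under this reading the sketch derives both edges the question worries about: Eat Cake was triggered by Bake, and Eat Cake was triggered by Make Fork. Steel and Fork Mold have no recorded producer, so Make Fork gains no inferred trigger.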
Attendees

