SPCDIS Role Based Provenance Presentation
From Semantic Portal Wiki
Participants
- James Michaelis (RPI)
- Patrick West (RPI)
- Quan Bai (CSIRO)
Overview
One goal of provenance is to give users an understanding of the steps a system took to generate data products. Here, the level of detail captured by provenance becomes an important consideration. As detail is added, more questions can in principle be answered. However, presenting a large amount of provenance detail may also overwhelm end users, for one of two reasons: (i) the detail presented is irrelevant to their needs, or (ii) understanding it requires background knowledge they lack.
Both of these challenges arise for data generated by the Mauna Loa Solar Observatory’s (MLSO) Advanced Coronal Observing System (ACOS). In ACOS, photometer-based readings of solar activity are taken and subsequently processed into data products consumable by end users. To fully understand these sequences of steps, end users need background knowledge in several areas (e.g., astronomy, digital imaging, and ACOS-specific techniques). This makes reviewing provenance difficult for users outside the ACOS development team, whose background may vary widely (ranging from outside domain experts in Solar Physics to citizen scientists).
Related Work
This work closely relates to a number of areas in the domains of provenance and workflow research:
Workflow Management Systems
Two workflow management systems have been identified that enable selective views of the provenance of data products.
Redux [1]: Presents provenance in four layers, each corresponding to a level of processing that takes place during the execution of a workflow:
- Abstract description of the experiment that captures abstract activities in a corresponding workflow.
- Instance of the abstract model, which captures instances of activities and additional relationships as classes of activities are instantiated.
- Information to trace the execution of the workflow, including input data, parameters supplied at runtime, branches taken, and activities inserted or skipped during execution.
- Runtime-specific information, such as the start and end time of workflow execution, start and end time of individual activity execution, status codes and intermediate results, information about the internal state of each activity, along with information about the machines where activities were allocated.
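One way to picture a selective view over these four layers is to tag each provenance record with its layer and filter on that tag. The following is a minimal sketch under that assumption, not Redux's actual data model; the names `Layer`, `ProvenanceRecord`, and `select_layers`, as well as the sample records, are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Layer(IntEnum):
    """The four layers listed above, ordered from most to least abstract."""
    ABSTRACT = 1   # abstract description of the experiment
    INSTANCE = 2   # instantiated activities and their relationships
    EXECUTION = 3  # input data, runtime parameters, branches taken
    RUNTIME = 4    # timings, status codes, machine allocation


@dataclass(frozen=True)
class ProvenanceRecord:
    """A single provenance statement tagged with its layer (hypothetical)."""
    layer: Layer
    activity: str
    detail: str


def select_layers(records, max_layer):
    """Keep only records at or above the requested level of abstraction."""
    return [r for r in records if r.layer <= max_layer]


# Illustrative records only; they do not describe a real workflow.
records = [
    ProvenanceRecord(Layer.ABSTRACT, "calibrate", "calibration activity"),
    ProvenanceRecord(Layer.INSTANCE, "calibrate", "instance of calibration"),
    ProvenanceRecord(Layer.EXECUTION, "calibrate", "parameter supplied at runtime"),
    ProvenanceRecord(Layer.RUNTIME, "calibrate", "start and end time"),
]

# A reader with little background might see only the two most abstract layers.
overview = select_layers(records, Layer.INSTANCE)
```

A role-based presentation could then map each user role to a maximum layer, so that less experienced readers are shown fewer layers than members of the development team.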
VisTrails [2]: Presents provenance in three layers:
- Workflow specifications
- Change histories for workflows
- Workflow executions
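The relationship between these three layers can be illustrated by treating a workflow specification as the result of replaying a change history, and an execution as a run of one such version. This is a rough sketch under that assumption and does not use the VisTrails API; `Change`, `Execution`, `replay`, and the module names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One entry in a workflow's change history (hypothetical format)."""
    op: str      # "add" or "delete"
    module: str  # name of the workflow module affected


@dataclass(frozen=True)
class Execution:
    """A run of one workflow version, identified by its position in the history."""
    version: int
    status: str


def replay(history, upto):
    """Rebuild the workflow specification after the first `upto` changes."""
    modules = []
    for change in history[:upto]:
        if change.op == "add":
            modules.append(change.module)
        elif change.op == "delete":
            modules.remove(change.module)
        else:
            raise ValueError(f"unknown op: {change.op}")
    return modules


# Illustrative change history: a smoothing step is added and later removed.
history = [
    Change("add", "read"),
    Change("add", "smooth"),
    Change("add", "plot"),
    Change("delete", "smooth"),
]

# The execution layer points back at the version of the workflow that was run.
run = Execution(version=4, status="ok")
spec_for_run = replay(history, run.version)
```

Keeping the change history separate from the specification means that any earlier version, such as `replay(history, 3)`, can be recovered and compared against the version an execution actually used.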
Use Cases
CHIP Pipeline
(TODO)
CSIRO
(TODO)
Mockups
CHIP Pipeline Interface
(TODO)
References
[1] Roger S. Barga and Luciano A. Digiampietri. Automatic capture and efficient storage of escience experiment provenance. Concurrency and Computation: Practice and Experience, 20(5), 2008. (doi: http://dx.doi.org/10.1002/cpe.1235).
[2] Carlos Scheidegger, David Koop, Emanuele Santos, Huy Vo, Steven Callahan, Juliana Freire, and Claudio Silva. Tackling the provenance challenge one layer at a time. Concurrency and Computation: Practice and Experience, 20(5):473–483, 2008. (doi: http://dx.doi.org/10.1002/cpe.1237).
