Provenance
From Semantic Portal Wiki
Overview
For a general definition of provenance, see the Wikipedia article on Provenance. The following definitions are drawn from several publications:
- The process that led to some data is called the provenance of that data. A provenance architecture is the software architecture for a system that will provide the necessary functionality to record, store and use process documentation to determine the provenance of data items. (Miles et al. 2007)
- The motivation for understanding the provenance of works of art is also applicable to data we see on the Web. With the proliferation of data on the Web, questions such as "Where did this data come from?", "Who else is using this data?", and "Why is this piece of data here?" are becoming increasingly common. (Tan 2004)
- Data provenance, one kind of metadata, pertains to the derivation history of a data product starting from its original sources. It is a moot point on where the boundary between provenance information and generic metadata lies. In some cases, there is little to distinguish the two and provenance is subsumed into the general metadata infrastructure. (Simmhan et al. 2005)
Research Themes
Workflow Provenance
Workflow provenance has emerged as an important consideration in e-science (Lanter 1990; Frew and Bose 1991) and the grid community (Foster et al. 2002; Muniswamy-Reddy et al. 2006; Moreau and Ibbotson 2006). To address requirements from e-science (Miles et al. 2007), workflow provenance research focuses on process, recording the history of data derivation. Growing interest from different domains using different technologies has led to several provenance representation dialects. For example, the 14 teams in the Second Provenance Challenge each used their own provenance representation, and it was difficult to integrate provenance metadata across representations. For a more general overview of workflow provenance, see the surveys by Simmhan et al. (2005) and Bose and Frew (2005).
- researchers
- resources
- International Provenance Workshop (IPAW). http://www.ipaw.info/
- EU Provenance Project (service, grid), http://www.gridprovenance.org/
- Zoom Project: http://zoomuserviews.db.cis.upenn.edu/cgi-bin/pmwiki.php
- metaK - Meta Knowledge in Semantic Web Applications: http://isweb.uni-koblenz.de/Research/MetaKnowledge
- http://wiki.esi.ac.uk/ProvenanceInWorkflows, Symposium on Provenance in Scientific Workflows, October 13-17 2008
- references
REF general
- A survey of data provenance in e-science
- Lineage retrieval for scientific data processing: a survey
- Provenance-aware storage systems
- The EU Provenance Project: enabling and supporting provenance in grids for complex problems (final report)
- The requirements of using provenance in e-science experiments
- Examining the challenges of scientific workflows
- Chimera: a virtual data system for representing, querying, and automating data derivation
REF domain (GIS)
- Earth System Science Workbench: a data management infrastructure for earth science products
- Lineage in GIS: the problem and a solution
REF domain (BIO) - bioinformatics process is a specific branch of workflow provenance.
- Bioinformatics process management: information flow via a computational journal
- Biopipe: a flexible framework for protocol-based bioinformatics analysis
- Developing a protocol for bioinformatics analysis: an integrated information behavior and task analysis approach
Data Provenance (database)
Data provenance was pioneered within the database community (Buneman et al. 2001; Cui et al. 2000; Woodruff and Stonebraker 1997). Data provenance research focuses on issues that matter in database settings and draws on computational methods that databases support. For example, why-provenance identifies the source tuples that explain why a tuple was derived, and where-provenance identifies the portion of the sources that was copied into a portion of the derived tuple. This kind of provenance can be viewed as a specialized workflow step whose action is recorded by a declarative query and a declarative inverse function. Notably, some data provenance work has been generalized to workflow provenance, e.g. in e-science, while "data provenance" in the narrow sense remains in the database domain. For a more detailed overview, see the surveys by Glavic and Dittrich (2007) and Tan (2007).
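The difference between the two notions can be illustrated on a toy join query. The following is a minimal sketch only, not the formalism of the cited papers: the relations `employees` and `departments`, the tuple ids, and the function `join_with_provenance` are all hypothetical, and only a single witness is recorded per output tuple.

```python
# Toy relations; each source tuple carries an id so it can be cited as provenance.
employees = {
    "e1": {"name": "Ann", "dept": "D1"},
    "e2": {"name": "Bob", "dept": "D2"},
}
departments = {
    "d1": {"dept": "D1", "city": "Troy"},
    "d2": {"dept": "D2", "city": "Albany"},
}


def join_with_provenance(emps, depts):
    """Join on 'dept', project (name, city), and annotate each output tuple."""
    results = []
    for eid, emp in emps.items():
        for did, dep in depts.items():
            if emp["dept"] == dep["dept"]:
                results.append({
                    "tuple": (emp["name"], dep["city"]),
                    # why-provenance: source tuples that together justify this output
                    "why": {eid, did},
                    # where-provenance: source location each output field was copied from
                    "where": {"name": (eid, "name"), "city": (did, "city")},
                })
    return results


rows = join_with_provenance(employees, departments)
for row in rows:
    print(row["tuple"], sorted(row["why"]), row["where"])
```

In this sketch the why-provenance answers "which inputs had to exist for this row to appear", while the where-provenance answers "which input cell did this output value come from".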
researchers
resources
- Principles of Provenance (PrOPr), http://www.cis.upenn.edu/~plclub/propr/
- a nice tutorial on data provenance: http://www.soe.ucsc.edu/~wctan/papers/2007/DBProvenance.ppt
- a bibliographic database website for Data Provenance maintained by Karen Renaud: http://www.dcs.gla.ac.uk/~karen/Provenance/
References
- Why and where: a characterization of data provenance
- Tracing the lineage of view data in a warehousing environment
- Supporting fine-grained data lineage in a database visualization environment
- Research problems in data provenance
- Data provenance: a categorization of existing approaches
- Provenance in databases: past, current, and future
Knowledge Provenance (AI)
Knowledge provenance (McGuinness and Pinheiro da Silva 2004; Fox and Huang 2003) focuses on issues of importance in knowledge base settings, which typically include those of importance in database settings as well as concerns arising from reasoning (potentially hybrid reasoning). For example, applications may need provenance for the results of text analytic programs that are integrated into knowledge bases and processed by first-order reasoners (Murdock et al. 2006).
Provenance in distributed information systems (Weitzner et al. 2006) is an interesting direction in provenance research. Unlike many e-science workflows that simply compose services into a sequence, the workflows in such systems also involve many interactive communication protocols.
References
- Explaining answers from the Semantic Web: the Inference Web approach
- Knowledge provenance
- Explaining conclusions from diverse knowledge sources
- Transparent accountable data mining: new strategies for privacy protection
- PML 2: a modular explanation interlingua
Research Directions
Provenance Metadata
- reference information (aka digital object, statements)
- reference and classify entities involved in information manipulation
- annotate provenance attributes
- represent information manipulation process in terms of plan and log
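One way to picture the plan/log distinction in the items above is as two small record types. This is a sketch under stated assumptions: the class names `PlanStep` and `LogEntry`, their fields, and the helper `unexecuted_steps` are hypothetical and are not taken from any of the ontologies surveyed on this page.

```python
from dataclasses import dataclass, field


@dataclass
class PlanStep:
    """A step the process is expected to perform (the plan)."""
    name: str
    inputs: list
    outputs: list


@dataclass
class LogEntry:
    """A record of a step that actually ran (the log)."""
    step: str
    agent: str       # entity involved in the manipulation
    used: list       # information objects consumed
    generated: list  # information objects produced
    attributes: dict = field(default_factory=dict)  # annotated provenance attributes


def unexecuted_steps(plan, log):
    """Return names of planned steps that have no matching log entry."""
    executed = {entry.step for entry in log}
    return [step.name for step in plan if step.name not in executed]


plan = [
    PlanStep("extract", inputs=["raw.csv"], outputs=["clean.csv"]),
    PlanStep("aggregate", inputs=["clean.csv"], outputs=["report.csv"]),
]
log = [
    LogEntry("extract", agent="alice", used=["raw.csv"], generated=["clean.csv"],
             attributes={"tool": "cleaner-0.1"}),
]

print(unexecuted_steps(plan, log))
```

Keeping the plan and the log as separate structures makes it possible to compare what was intended with what was actually executed.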
Provenance Computation
- classify the computation on provenance metadata
- list application domain and scenarios for provenance
- provenance metadata management (storage, access, query)
- provenance aware user interaction
Provenance Systems
Literature Survey
OWL/RDFS Provenance Ontology
The following Semantic Web Provenance Ontologies are
- evaluated using Semantic Web Ontology Analysis Techniques
- compared by Statistics of OWL/RDFS Provenance Ontology
1. Open Provenance Model (OPM) v1.1 is encoded using
- OPM OWL Ontology: http://github.com/lucmoreau/OpenProvenanceModel/raw/master/elmo/src/main/resources/opm.owl
- OPM official: http://openprovenance.org/model/opm.owl
- OPM Tupello: http://twiki.ipaw.info/pub/Challenge/OpenProvenanceModelBindings/opm.owl
- OPM RPI: http://www.cs.rpi.edu/~michaj6/provenance/PC3OPM.owl
2. Proof Markup Language (PML) v2.0 consists of three modular ontologies
- PML Provenance OWL Ontology - http://inference-web.org/2.0/pml-provenance.owl
- PML Justification OWL Ontology - http://inference-web.org/2.0/pml-justification.owl
- PML Trust OWL Ontology - http://inference-web.org/2.0/pml-trust.owl
3. XMDR
4. Provenance Vocabulary
- Provenance Vocabulary Core OWL Ontology - http://purl.org/net/provenance/ns.rdf
- Provenance Vocabulary Types OWL Ontology - http://purl.org/net/provenance/types.rdf
- Provenance Vocabulary Integrity Verification OWL Ontology - http://purl.org/net/provenance/integrity.rdf
5. Provenir (Kno.e.sis Center, Wright State University, USA)
6. OBO Foundry
- OBO Relation Ontology in OWL - http://obofoundry.org/ro/ro.owl
- Information Artifact Ontology in OWL - http://information-artifact-ontology.googlecode.com/svn/releases/2009-11-06/merged/iao-main.owl
- OBO Metamodel Ontology in OWL - http://www.geneontology.org/formats/oboInOwl
7. Basic Formal Ontology (BFO) - http://www.ifomis.org/bfo
9. Semantic Publishing Ontology (signature)
- Semantic Publishing RDFS Ontology - http://www4.wiwiss.fu-berlin.de/bizer/WIQA/swp/swp-2.n3 -- it is RDF/XML
10. Dublin Core
- Dublin Core Terms RDFS Ontology - http://purl.org/dc/terms
- Dublin Core Element Set RDFS Ontology - http://purl.org/dc/elements/1.1/ (covered by DC terms)
12. Web of Trust (WOT)
14. Ontology Design Patterns: a huge collection of modular ontologies
15. WGS84 Geo Positioning Ontology
16. iCal ontology
18. ORE (Open Archives Initiative Object Reuse and Exchange)
19. RSS Event Ontology -- the ontology is offline
20. Workflow Driven Ontology (WDO)
21. OWL-S Ontology
22. BioPAX Ontology
23. FOAF
24. Biological Processes Ontology
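To give a feel for what such ontologies describe, the sketch below builds a tiny provenance graph using three Open Provenance Model edge names (`used`, `wasGeneratedBy`, `wasControlledBy`) and walks it to recover the lineage of an artifact. It is an illustration under assumptions, not an OPM implementation: the node names, the triple layout, and the function `lineage` are hypothetical, and no OWL file listed above is loaded.

```python
from collections import defaultdict

# Edges are (effect, relation, cause): dependencies point from an effect
# back to its cause, as OPM edges do.
edges = [
    ("report.csv", "wasGeneratedBy", "aggregate"),
    ("aggregate", "used", "clean.csv"),
    ("aggregate", "wasControlledBy", "alice"),
    ("clean.csv", "wasGeneratedBy", "extract"),
    ("extract", "used", "raw.csv"),
]


def lineage(node, edges):
    """Return every artifact or process the given node causally depends on."""
    causes = defaultdict(list)
    for effect, relation, cause in edges:
        # Follow only data-flow edges; agents are left out of the lineage.
        if relation in ("used", "wasGeneratedBy"):
            causes[effect].append(cause)
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for cause in causes[current]:
            if cause not in seen:
                seen.add(cause)
                stack.append(cause)
    return seen


print(sorted(lineage("report.csv", edges)))
```

The traversal is a plain depth-first search over the dependency edges, which is enough to answer "where did this data come from?" for a small graph.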
Provenance Vocabulary Specifications from US Government
- PREMIS http://www.loc.gov/standards/premis/schemas.html
- Intelligence Community Standard for Source Reference Citation Metadata http://www.dni.gov/ICIS/dsca/sc/ics_sc.htm
- DDMS http://metadata.dod.mil/mdr/irs/DDMS/
References
- Survey: semantic web and provenance
- http://www.w3.org/2005/Incubator/prov/wiki/Relevant_Technologies
- The open provenance model
- PML 2: a modular explanation interlingua
| Dcterms:created | 2009-05-18 |
| Dcterms:creator | Li Ding |
| Dcterms:description | A survey of provenance research frontier |
| Dcterms:modified | 2010-03-28 |
| Dcterms:relation | Provenance |
| Foaf:name | Provenance |
| Skos:altLabel | Provenance |

