Reflections on Provenance Ontology Encodings



As more data (especially scientific data) is digitized and put on the Web, tracking and sharing its provenance metadata becomes increasingly important. Besides capturing the annotation properties of data, provenance research also emphasizes interlinking relevant data. It is therefore desirable to make provenance metadata easy to access, share, reuse, integrate, and reason with. Ontologies can address these requirements by encoding expectations and agreements concerning provenance metadata reuse and integration. The Web supports access and sharing. The Semantic Web, with languages such as RDFS and OWL for representing terms and their descriptions, helps capture expectations, agreements, and meaning. We are investigating best practices for providing Semantic Web encodings of provenance ontologies by analyzing a selection of popular Semantic Web provenance ontologies, such as the Open Provenance Model (OPM), Dublin Core (DC) Terms, and the Proof Markup Language (PML). In this paper, we highlight a few findings, including: (i) similarities and differences among existing provenance ontologies; (ii) popular approaches used to model provenance concepts, and lessons learned from the use of Semantic Web language features in representing provenance concepts; and (iii) the expressivity and tractability of representative provenance ontologies. The outcome of our study provides not only guidance to provenance ontology users but also insights to promote better collaborative provenance ontology development and scalable processing of provenance ontologies.


Date            Created By      Link
July 19, 2011   Jie Bao         Download
July 18, 2011   Patrick West    Download

Related Projects:

Inference Web
Principal Investigator: Deborah L. McGuinness
Description: The Inference Web is a Semantic Web based knowledge provenance infrastructure that supports interoperable explanations of sources, assumptions, learned information, and answers as an enabler for trust.

Provenance - if users (humans and agents) are to use and integrate data from unknown, uncertain, or multiple sources, they need provenance metadata for evaluation
Interoperability - more systems are using varied sources and multiple information manipulation engines, thus increasing interoperability requirements
Explanation/Justification - if information has been manipulated (e.g., by sound deduction or by heuristic processes), information manipulation trace information should be available
Trust - if some sources are more trustworthy than others, trust ratings are desired

The Inference Web consists of two important components:

Proof Markup Language (PML) Ontology - Semantic Web based representation for exchanging explanations, including:
    provenance information - annotating the sources of knowledge
    justification information - annotating the steps for deriving the conclusions or executing workflows
    trust information - annotating trustworthiness assertions about knowledge and sources
IW Toolkit - Web-based and standalone tools that help human users browse, debug, explain, and abstract the knowledge encoded in PML.

Related Research Areas:

Knowledge Provenance
Lead Professor: Deborah L. McGuinness
Description: Knowledge Provenance
Concepts: Provenance