SPCDIS Project Architecture

SPCDIS is a work in progress!

The architecture of the final SPCDIS software stack is still being developed. In the meantime, the technologies that we will be using to build SPCDIS are listed on the Technology Infrastructure page.

Capture of Provenance

The idea is to insert provenance capture (logging) statements into a data pipeline. These statements can cover every stage of the pipeline: the actual capture of the data by a sensor, the transfer of the raw data from the capture site to wherever the data is processed, each processing step applied to the data, versioning information, and more.
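As a minimal sketch of what such capture statements could look like, the Java class below wraps one pipeline step and appends a log record describing it. The class name ProvenanceLog, the runStep method, and the tab-separated record layout (step name, software version, input id, output id, start time, end time) are all assumptions made for illustration; none of them is taken from the SPCDIS code.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper: runs a pipeline step and records what it consumed and produced. */
final class ProvenanceLog {
    private final List<String> records = new ArrayList<>();

    /**
     * Runs one step and appends a tab-separated record with the assumed layout:
     * step name, software version, input id, output id, start time, end time.
     */
    <I, O> O runStep(String stepName, String version, String inputId, String outputId,
                     I input, Function<I, O> step) {
        Instant start = Instant.now();
        O output = step.apply(input);
        Instant end = Instant.now();
        records.add(String.join("\t", stepName, version, inputId, outputId,
                start.toString(), end.toString()));
        return output;
    }

    /** Read-only snapshot of the records captured so far. */
    List<String> records() {
        return Collections.unmodifiableList(new ArrayList<>(records));
    }

    public static void main(String[] args) {
        ProvenanceLog log = new ProvenanceLog();
        // Stand-in processing step: scale every raw sample by a calibration factor.
        double[] raw = {1.0, 2.0, 3.0};
        double[] calibrated = log.runStep("calibrate", "1.0", "raw-001", "cal-001", raw,
                samples -> Arrays.stream(samples).map(v -> v * 0.5).toArray());
        System.out.println("calibrated samples: " + calibrated.length);
        log.records().forEach(System.out::println);
    }
}
```

Wrapping the step, rather than asking each step to write its own log line, keeps the record layout in one place, which should make a later conversion to another format easier.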

Once this provenance information is captured, it can be inserted into a triple store, queried, and displayed to users. This might require converting the log files if they are not already in an RDF format. A further conversion can take place if we want to generate different forms of provenance, such as OPM, PML, or the new provenance model that is being developed within the W3C.
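To illustrate the conversion step, here is a hedged sketch that turns one plain-text log record into N-Triples lines. Several things are assumed rather than taken from the project: the tab-separated record layout (step name, software version, input id, output id, start time, end time), the example.org base URI, the softwareVersion predicate, and the choice of terms from the W3C PROV-O vocabulary. The ids are also assumed to be safe to embed in a URI without escaping.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical converter from a tab-separated log record to N-Triples lines. */
final class LogToNTriples {
    // Assumed base URI for minting resources; not taken from the project.
    static final String BASE = "http://example.org/spcdis/";
    // Made-up predicate for the software version, for illustration only.
    static final String VERSION_PREDICATE = BASE + "vocab#softwareVersion";
    static final String PROV = "http://www.w3.org/ns/prov#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime";

    /** Expects the assumed layout: step, version, inputId, outputId, start, end. */
    static List<String> convert(String record) {
        String[] f = record.split("\t");
        if (f.length != 6) {
            throw new IllegalArgumentException("expected 6 tab-separated fields, got " + f.length);
        }
        String activity = iri(BASE + "activity/" + f[0] + "/" + f[3]);
        String input = iri(BASE + "data/" + f[2]);
        String output = iri(BASE + "data/" + f[3]);

        List<String> triples = new ArrayList<>();
        triples.add(triple(activity, iri(RDF_TYPE), iri(PROV + "Activity")));
        triples.add(triple(input, iri(RDF_TYPE), iri(PROV + "Entity")));
        triples.add(triple(output, iri(RDF_TYPE), iri(PROV + "Entity")));
        triples.add(triple(activity, iri(PROV + "used"), input));
        triples.add(triple(output, iri(PROV + "wasGeneratedBy"), activity));
        triples.add(triple(activity, iri(VERSION_PREDICATE), literal(f[1], null)));
        triples.add(triple(activity, iri(PROV + "startedAtTime"), literal(f[4], XSD_DATETIME)));
        triples.add(triple(activity, iri(PROV + "endedAtTime"), literal(f[5], XSD_DATETIME)));
        return triples;
    }

    static String iri(String value) {
        return "<" + value + ">";
    }

    /** Quotes a literal, escaping backslashes and double quotes; datatype may be null. */
    static String literal(String value, String datatype) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        String quoted = "\"" + escaped + "\"";
        return datatype == null ? quoted : quoted + "^^" + iri(datatype);
    }

    static String triple(String subject, String predicate, String object) {
        return subject + " " + predicate + " " + object + " .";
    }

    public static void main(String[] args) {
        String record = "calibrate\t1.0\traw-001\tcal-001\t2011-06-01T12:00:00Z\t2011-06-01T12:00:05Z";
        convert(record).forEach(System.out::println);
    }
}
```

Generating OPM or PML instead would, under the same assumptions, mostly be a matter of swapping the class and predicate URIs used in convert.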

RDF Modeling


Java/Jena Component

The IDL component (see above) generates plain-text logs of the processing. The Java/Jena component is designed to take the 9 logs generated by the IDL component, produce a corresponding RDF record, and upload it to a Virtuoso instance.

The code for this component, along with the libraries it needs, can be downloaded from http://www.cs.rpi.edu/~michaj6/CoMP_RDF_GENERATOR/.
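For readers who only want a feel for the upload step, the sketch below builds an HTTP request that posts RDF to a named graph using the SPARQL 1.1 Graph Store HTTP Protocol. It is not the project's code: it uses the JDK's built-in HTTP client (Java 11 or later) rather than Jena, and the endpoint URL, the graph URI, and the class and method names are assumptions. The /sparql-graph-crud path on port 8890 is, as far as we know, Virtuoso's default for this protocol, but a given instance may expose a different path or require authentication, which this sketch does not handle.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical uploader: posts RDF to a named graph over the Graph Store HTTP Protocol. */
final class GraphStoreUpload {

    /** Builds a POST request that adds the given triples to the named graph. */
    static HttpRequest buildRequest(String endpoint, String graphUri, String nTriples) {
        // The graph URI travels as a query parameter and must be percent-encoded.
        String target = endpoint + "?graph=" + URLEncoder.encode(graphUri, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(target))
                // N-Triples is a syntactic subset of Turtle, so the Turtle media type is used here.
                .header("Content-Type", "text/turtle")
                .POST(HttpRequest.BodyPublishers.ofString(nTriples, StandardCharsets.UTF_8))
                .build();
    }

    /** Sends the request and returns the HTTP status code reported by the server. */
    static int send(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) {
        // Assumed local endpoint and graph URI; adjust both for a real instance.
        HttpRequest request = buildRequest(
                "http://localhost:8890/sparql-graph-crud",
                "http://example.org/spcdis/provenance",
                "<http://example.org/a> <http://example.org/b> <http://example.org/c> .\n");
        // Only the request is shown here; call send(request) to perform the upload.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Separating buildRequest from send makes it possible to inspect or test the request without a running triple store.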