SPCDIS is a work in progress!
The architecture of the final SPCDIS software stack is still being developed. In the meantime, the technologies that we will be using to build SPCDIS are listed on the Technology Infrastructure page.
Pipeline software developed at HAO is written in IDL (Interactive Data Language). During summer 2011, James developed an API for logging information about data products generated by both the CoMP and CHIP pipelines. In its current version, the logging API is designed to support a mixture of: (i) high-level provenance recording, and (ii) annotation of high-level processes with the kinds of processing activities that occur within them. This approach greatly simplifies maintenance of the logging routines by HAO staff, compared with maintaining the routines needed to generate a low-level provenance trace.
I've defined the following components in the current IDL-based logging API:
log_artifact.pro - this logs files generated by CoMP
log_process.pro - this logs pipeline processes executed by CoMP, at the level of IDL scripts (e.g., an execution of Demod.pro)
log_activity.pro - this logs processing activities carried out by pipeline processes
log_observations.pro - this logs observations of the solar corona, made by CoMP
log_entity.pro - this logs data entities (i.e., the result of a CoMP observation)
log_dataset.pro - this logs datasets (sets of data entities gathered over the course of one processing day)
log_qualityassertion.pro - this logs assertions of a quality metric (and corresponding score), applied to a data entity
log_qualityevidence.pro - this logs the evidence used by a quality assertion
log_fitsheader.pro - this logs header entries for FITS files
The information included in a logged statement varies from log to log. However, all statements have two things in common: a line number and an entry label (name). Additionally, each of the functions above depends on its own IDL common block, and every one of these common blocks holds the following variables:
counter - current ID count (incremented for each statement logged)
NameRegistry - hash map for tracking the ID-to-name artifact mapping
IDRegistry - hash map for tracking the name-to-ID artifact mapping
LUN - Logical Unit Number for the output file
Using the hash maps NameRegistry and IDRegistry, both ID and name lookups can be done across IDL script executions, which makes it easier for one log entry to reference another.
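The shared state described above can be illustrated with a minimal sketch. It is written in Python rather than IDL, purely for illustration, and is not the HAO implementation. The class name LogBlock, the method name log, the file name example_output.fts, and the tab-separated line format are all assumptions. It also assumes that NameRegistry maps IDs to names and IDRegistry maps names to IDs, and that the counter value doubles as the line number of each entry.

```python
import io


class LogBlock:
    """Illustrative stand-in for one IDL common block (hypothetical names)."""

    def __init__(self, out):
        self.counter = 0          # current ID count
        self.name_registry = {}   # ID -> name (assumed direction)
        self.id_registry = {}     # name -> ID (assumed direction)
        self.out = out            # plays the role of the LUN

    def log(self, name):
        """Log one statement and return the ID assigned to it."""
        self.counter += 1
        entry_id = self.counter
        self.name_registry[entry_id] = name
        self.id_registry[name] = entry_id
        # Assumed line format: "<id><TAB><name>"
        self.out.write(f"{entry_id}\t{name}\n")
        return entry_id


# Log two entries into an in-memory buffer instead of a real file.
block = LogBlock(io.StringIO())
first = block.log("Demod.pro")
second = block.log("example_output.fts")
```

Because both registries are updated on every call, a later entry can look up an earlier one either by its name (block.id_registry) or by its ID (block.name_registry).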
Finally, the following additional scripts are used to support the logging API:
comp_initialize.pro - this is a modified version of comp_initialize.pro, designed to open each of the log LUNs
init_logblocks.pro - this initializes each log block for the logging functions
stop_logger.pro - this closes each of the log LUNs
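The open/close life cycle handled by these support scripts might look roughly like the sketch below. It is written in Python for illustration only and is not a translation of the IDL scripts. The function names open_logs and close_logs, the short log names, the .log file extension, and the line written to the process log are all assumptions.

```python
import os
import tempfile

# Hypothetical short names, one per logging function in the API.
LOG_NAMES = ["artifact", "process", "activity", "observations", "entity",
             "dataset", "qualityassertion", "qualityevidence", "fitsheader"]


def open_logs(directory):
    """Open one output file per log (the role played by the log LUNs)."""
    return {name: open(os.path.join(directory, name + ".log"), "w")
            for name in LOG_NAMES}


def close_logs(handles):
    """Close every open log file (the role played by stop_logger.pro)."""
    for handle in handles.values():
        handle.close()


# Open all logs in a scratch directory, write one entry, then close them.
log_dir = tempfile.mkdtemp()
handles = open_logs(log_dir)
handles["process"].write("1\tDemod.pro\n")
close_logs(handles)
```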
These IDL files can all be downloaded from http://www.cs.rpi.edu/~michaj6/IDL/Logger/.
The IDL component (see above) generates plain-text logs of processing. The Java/Jena component is designed to take the nine logs generated by the IDL component, produce a corresponding RDF record, and upload it to a Virtuoso instance.
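The conversion step can be pictured with a small sketch. It uses Python with only the standard library instead of Java/Jena, emits N-Triples strings, and leaves out the Virtuoso upload entirely, so it is not the actual component. The tab-separated log line format, the namespace http://example.org/spcdis/, the function names, and the choice of rdfs:label as the predicate are all assumptions made for illustration.

```python
from urllib.parse import quote

BASE = "http://example.org/spcdis/"  # hypothetical namespace
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def log_to_ntriples(log_kind, lines):
    """Turn assumed '<id><TAB><name>' log lines into N-Triples statements."""
    triples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        entry_id, name = line.split("\t", 1)
        subject = f"<{BASE}{quote(log_kind)}/{quote(entry_id)}>"
        triples.append(f'{subject} <{RDFS_LABEL}> "{escape_literal(name)}" .')
    return triples


triples = log_to_ntriples("process", ["1\tDemod.pro\n"])
```

Each log entry becomes one resource whose URI is built from the log kind and the entry ID, so entries from different logs cannot collide.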
This code, along with the libraries it needs, can be downloaded from http://www.cs.rpi.edu/~michaj6/CoMP_RDF_GENERATOR/.