Experiences Integrating Temporal Metadata in a Domain Ontology

The Virtual Solar-Terrestrial Observatory (VSTO) Portal at vsto.org provides a set of guided workflows that implement use cases designed for the VSTO Project. These workflows use semantics to abstract instrument and parameter classifications, giving users access to data without requiring extensive domain-specific vocabularies. The temporal restrictions used in the workflows are made possible by data availability tables in a SQL-based metadata catalog. We propose an alternative architecture for the VSTO Portal in which the temporal metadata is tightly integrated with the rest of the domain ontology. We achieve this integration by creating time instances that represent the coverage of datasets for a given project or observatory; as an example, we use the Coupling Energetics and Dynamics of Atmospheric Regions (CEDAR) database. There was initially a concern that the available tools for working with OWL ontologies and Semantic Web data were too primitive and could not scale to the millions of time instances that would be generated for the CEDAR datasets. This paper formally addresses that concern by evaluating the performance and scalability of various well-known tools in representing and querying the CEDAR knowledge base with time instances. The collected data test the scalability of the Jena Semantic Web Framework in performing SPARQL queries over a memory model, as well as the feasibility of performing SPARQL queries over a remote Virtuoso triple store. The SPARQL queries are designed around tasks that arise in the guided workflows of the VSTO Portal. Preliminary results show that the RESTful service performs far better than querying over a memory model, whereas querying over a triple store may actually boost performance. We were somewhat surprised by the latter result; however, improving performance was not one of the primary motivations for moving away from dependence on the RESTful service.
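As a rough illustration of the idea of time instances, the sketch below attaches interval instances to datasets as subject-predicate-object triples and then finds the datasets whose coverage overlaps a requested date range. It is a minimal sketch in plain Python, not the Jena or Virtuoso setup evaluated in the paper. The `ex:` property names, the dataset identifiers, and the dates are all hypothetical and are not taken from the VSTO ontology or the CEDAR database.

```python
from datetime import date

# Hypothetical vocabulary: illustrative names only, not terms from the VSTO ontology.
HAS_COVERAGE = "ex:hasCoverage"
HAS_START = "ex:hasStart"
HAS_END = "ex:hasEnd"


def coverage_triples(dataset, interval_id, start, end):
    """Return the triples that attach one time-interval instance to a dataset."""
    return [
        (dataset, HAS_COVERAGE, interval_id),
        (interval_id, HAS_START, start),
        (interval_id, HAS_END, end),
    ]


def datasets_covering(triples, query_start, query_end):
    """Return the sorted datasets with a coverage interval overlapping the query range."""
    starts = {s: o for s, p, o in triples if p == HAS_START}
    ends = {s: o for s, p, o in triples if p == HAS_END}
    hits = set()
    for dataset, predicate, interval in triples:
        if predicate != HAS_COVERAGE:
            continue
        # Two closed intervals overlap when each one starts no later than the other ends.
        if starts[interval] <= query_end and query_start <= ends[interval]:
            hits.add(dataset)
    return sorted(hits)


# Made-up example data: two datasets with three coverage intervals in total.
graph = []
graph += coverage_triples("ex:datasetA", "ex:intervalA1", date(2001, 1, 1), date(2001, 3, 31))
graph += coverage_triples("ex:datasetA", "ex:intervalA2", date(2002, 6, 1), date(2002, 6, 30))
graph += coverage_triples("ex:datasetB", "ex:intervalB1", date(2001, 3, 15), date(2001, 12, 31))

matches = datasets_covering(graph, date(2001, 3, 20), date(2001, 4, 1))
```

In a triple store the same question would be phrased as a SPARQL query with a filter on the start and end values. Because every coverage interval becomes its own instance with several triples, the instance count grows quickly with the number of datasets, which is the scalability concern the abstract describes.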


Associated Projects

VSTO is a collaborative project between the High Altitude Observatory and Scientific Computing Division of the National Center for Atmospheric Research and McGuinness Associates. VSTO is funded by a grant from the National Science Foundation, Computer and Information Science and Engineering (CISE) in the Shared Cyberinfrastructure (SCI) division.