Semantics + Numerics

Using semantics to model and inform analyses of numerical datasets offers significant advantages over traditional programming methods, which are typically unaware of how the components that make up a methodology interact. Without that awareness, data provenance is also lost, because there is no record of the type of analysis performed. Adding semantics lets the developer accomplish more with far less in several ways:

  • Ontology-based inference and result/code composition (see the True Data-Driven Analysis section)

  • Linked data approach to incorporating a wide array of datasets

  • Semantic modeling of structured but unenriched datasets, and markup of the results derived from them

  • Easy generalization to other datasets irrespective of domain or discipline

  • Tracking provenance of transformations as the results move through the SemNExT pipeline

Services and Containers

SemNExT applications are organized as a mix of local containers and remote services that communicate with each other using standard protocols and encodings, namely HTTP, SPARQL, JSON-LD and RDF. Structuring a SemNExT application this way has several architectural benefits:

  • Architecture is highly modularized, allowing services to be restarted or even replaced without bringing down the entire stack

  • Communication between services is simplified, requiring minimal internal transformation of data beyond serializing RDF as JSON-LD

  • Services are isolated and retain control of how client applications use them, which reduces exposure to potential exploits and further abstracts their use from users of the top-level SemNExT API

We use a variety of tools to realize this implementation, most of which are listed in the Dependencies section.
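
To illustrate the RDF to JSON-LD step mentioned in the list above, here is a minimal sketch that groups triples by subject into expanded-form JSON-LD node objects using only the Python standard library. It is not the SemNExT implementation: the function name triples_to_jsonld, the tuple layout of the input, and the example.org IRIs are all assumptions made for illustration, and language tags and datatypes on literals are left out for brevity.

```python
import json
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def triples_to_jsonld(triples):
    """Group triples by subject into expanded-form JSON-LD node objects.

    Each triple is (subject_iri, predicate_iri, (kind, value)), where kind
    is "iri" or "literal". This tuple layout is an assumption of the sketch.
    """
    nodes = defaultdict(dict)
    for subject, predicate, (kind, value) in triples:
        node = nodes[subject]
        node["@id"] = subject
        if predicate == RDF_TYPE:
            # rdf:type maps to the JSON-LD "@type" keyword.
            node.setdefault("@type", []).append(value)
        elif kind == "iri":
            node.setdefault(predicate, []).append({"@id": value})
        else:
            node.setdefault(predicate, []).append({"@value": value})
    return list(nodes.values())


# Hypothetical example data; the example.org IRIs are placeholders.
EXAMPLE_TRIPLES = [
    ("http://example.org/dataset/1", RDF_TYPE,
     ("iri", "http://example.org/Dataset")),
    ("http://example.org/dataset/1", "http://example.org/label",
     ("literal", "expression matrix")),
    ("http://example.org/dataset/1", "http://example.org/derivedFrom",
     ("iri", "http://example.org/dataset/0")),
]

if __name__ == "__main__":
    print(json.dumps(triples_to_jsonld(EXAMPLE_TRIPLES), indent=2))
```

Because the output is plain JSON, a service can return it over HTTP without any client needing an RDF parser, which is one reason this translation may be the only transformation required between services.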

True Data-Driven Analysis

The way SemNExT leverages its internal ontology is what makes it unique as an analysis framework. Inference allows SemNExT to compose results without the hardcoded logic that most frameworks expect the developer to write. This makes SemNExT an intelligent service that is more than the sum of its parts.

NOTE: This capability is planned for the next iteration of the service, which will begin development in Fall 2016 and evolve further in 2017. At the moment, ontological inference is used exclusively to reason about results within the triplestores the service connects to.

  • Semantic representations of SemNExT objects are reasoned over, providing an approach to systematic recombination of code chunks and classes

  • RDF results are parsed automatically and passed to Python with minimal work for the developer
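
One way the handoff of query results to Python might look is sketched below. It assumes the triplestore returns results in the standard SPARQL 1.1 Query Results JSON format; the function names, the choice of datatype conversions, and the sample response are hypothetical and do not come from the SemNExT code base.

```python
import json

XSD = "http://www.w3.org/2001/XMLSchema#"

# Converters for a few common XSD datatypes; anything else stays a string.
CASTS = {
    XSD + "integer": int,
    XSD + "decimal": float,
    XSD + "double": float,
    XSD + "float": float,
    XSD + "boolean": lambda text: text in ("true", "1"),
}


def term_to_python(term):
    """Convert one RDF term from a SPARQL JSON binding to a Python value."""
    if term["type"] in ("literal", "typed-literal"):
        cast = CASTS.get(term.get("datatype"))
        return cast(term["value"]) if cast else term["value"]
    # IRIs and blank nodes are returned as plain strings.
    return term["value"]


def rows_from_sparql_json(payload):
    """Turn a SPARQL JSON results document into a list of dicts.

    Variables left unbound in a solution are mapped to None.
    """
    data = json.loads(payload)
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        rows.append({
            var: term_to_python(binding[var]) if var in binding else None
            for var in variables
        })
    return rows


# Hypothetical response body; the IRI and values are placeholders.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["sample", "value", "note"]},
  "results": {"bindings": [
    {"sample": {"type": "uri", "value": "http://example.org/sample/1"},
     "value": {"type": "literal", "value": "2.5",
               "datatype": "http://www.w3.org/2001/XMLSchema#double"}}
  ]}
}
"""

if __name__ == "__main__":
    for row in rows_from_sparql_json(SAMPLE_RESPONSE):
        print(row)
```

Returning plain dictionaries keeps the numerical code independent of any RDF library, so the rows can be passed directly to whatever analysis routine comes next.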

We are considering several design options for implementing this extension, including semantic workflow markup of the code and a functional programming-inspired approach, among others. To be continued...