Semantic eScience Framework

Research Areas: Data Science, Semantic eScience, X-informatics, Data Frameworks
Principal Investigator: Peter Fox
Co-Investigators: Jim Hendler and Deborah L. McGuinness
Concepts: Xinformatics, Semantic Faceted Browse/Search, Linked Data, Controlled Vocabulary, Semantic Web, Data Visualization, Virtual Observatory, Ontology, Software Framework, eScience, Computer Science, Use Cases, Information Model, Data Science, Information Retrieval, Semantic Web Services
The goal of this effort is to design and implement a configurable and extensible semantic eScience framework. Configuration will require some research into accommodating different levels of semantic expressivity and user requirements from use cases. Extensibility will be achieved through a modular approach to the semantic encodings (i.e. ontologies) performed in a community setting, i.e. an ontology framework within which specific applications, all the way up to communities, can extend the semantics for their needs.

Over the past few years, semantic technologies have evolved and new tools are appearing. Part of the effort in this project will be to accommodate these advances in the new framework and lay out a sustainable software path for the technical advances that are certain to follow. In addition to a generalization of the current data science interface, we will include an upper-level interface suitable for use by clearinghouses, educational portals, digital libraries, and other disciplines.

SESF builds upon previous work in the Virtual Solar-Terrestrial Observatory (VSTO). The VSTO utilizes leading-edge knowledge representation, query and reasoning techniques to support knowledge-enhanced search, data access, integration, and manipulation. It encodes term meanings and their inter-relationships in ontologies and uses these ontologies and associated inference engines to semantically enable the data services. The Semantically-Enabled Science Data Integration (SESDI) project implemented data integration capabilities among three sub-disciplines (solar radiation, volcanic outgassing and atmospheric structure) using extensions to existing modular ontologies and the VSTO data framework, while adding smart faceted search and semantic data registration tools. The Semantic Provenance Capture in Data Ingest Systems (SPCDIS) project has added explanation provenance capabilities to an observational data ingest pipeline for images of the Sun, providing a set of tools to answer diverse end-user questions such as "Why does this image look bad?".
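The idea of encoding term relationships in modular ontologies and using inference to enable search can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names, the dataset names, and the functions `merge_modules`, `superclasses` and `search` are hypothetical and are not taken from the VSTO or SESDI code. A simple transitive subclass lookup stands in for a real inference engine.

```python
from collections import defaultdict

# Hypothetical core ontology module: (subclass, superclass) pairs.
CORE = {
    ("Instrument", "Thing"),
    ("Imager", "Instrument"),
}

# Hypothetical community extension module that adds more specific terms.
SOLAR_EXTENSION = {
    ("Coronagraph", "Imager"),
    ("Spectrometer", "Instrument"),
}


def merge_modules(*modules):
    """Combine ontology modules into one set of subclass assertions."""
    merged = set()
    for module in modules:
        merged |= module
    return merged


def superclasses(term, assertions):
    """Return every class `term` belongs to, following subclass links transitively."""
    parents = defaultdict(set)
    for child, parent in assertions:
        parents[child].add(parent)
    found, stack = set(), [term]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def search(datasets, wanted_class, assertions):
    """Return names of datasets whose instrument type is `wanted_class` or a subclass of it."""
    hits = []
    for name, instrument_type in datasets.items():
        inferred = superclasses(instrument_type, assertions)
        if instrument_type == wanted_class or wanted_class in inferred:
            hits.append(name)
    return sorted(hits)


# Illustrative data only: dataset names mapped to an instrument class.
ontology = merge_modules(CORE, SOLAR_EXTENSION)
datasets = {
    "dataset_a": "Coronagraph",
    "dataset_b": "Spectrometer",
}
imager_hits = search(datasets, "Imager", ontology)
instrument_hits = search(datasets, "Instrument", ontology)
```

In this sketch a search for "Imager" finds `dataset_a` only because the extension module states that a coronagraph is a kind of imager; with the core module alone the same search returns nothing. A production system would more likely express this in RDFS or OWL and delegate the inference to a reasoner.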

In keeping with our developed semantic web methodology, which has proven to be applicable across disciplines and end-user levels of expertise, we fully engage members of the academic and broader communities via a series of workshops that include existing U.S. national and international focused science programs, semantic technology communities of practice (both national and international), and academic as well as world-wide agency research and implementation efforts. The workshops will span involvement from end-science and non-specialist use, to data system developers, knowledge modelers and ontology developers, through to software engineers and tool/application developers. The proposed eScience framework is based on semantics, and software built on and around the semantics. A sustainability path for communities is essential so that use may continue into the future. Sustainability extends beyond software to ontology development and vetting, and communities of practice (scientists, data providers, technical teams). Via extensive community outreach we will be able to facilitate the culture change that is required in scientific endeavors to sustain the implementation, evolution and viability of an eScience capability as essential as observing and experimental equipment, supercomputing and high-speed networking infrastructures are now. That culture change requires a demonstration of the value of the outcomes that (data-driven) eScience delivers, which is part of our evaluation criteria.


NSF Office of Cyberinfrastructure