SESDI Key Concepts
{{Moved to drupal|url=http://tw.rpi.edu/web/project/SESDI_Key_Concepts }}
Objectives and Expected Significance
Overall vision for SESDI: To integrate information technology in support of advancing measurement-based processing systems for NASA by integrating existing diverse science discipline and mission-specific data sources.
This vision will be achieved using a set of technologies that feature rich semantics, that is, the precise meaning of a quantity or entity, e.g. a variable in a dataset, its units, a physical feature or phenomenon in the Earth system, how it may relate to other entities, its quality and lineage, etc. Semantics differ from syntax, which refers to declarative information without regard to its specific meaning, e.g. the name of a variable, its data type, dimensions, etc. Great effort has been expended in building interfaces that let a science user search for a familiar 'name', which may or may not mean anything to the user or help explain what a returned quantity actually is and how it can be used. Modern data formats and interfaces such as HDF, netCDF, CDF, WMS, FITS and GeoTIFF allow for syntax and semantics to be recorded within data files (typically as 'attributes'). However, interaction with the data provider is often required (e.g. 'contact the mission PI if you wish to use these data for science purposes!') before meaningful use of the data can be made, either within or across disciplines. This concept is known as pragmatics (what a quantity can and cannot be used for). Experience and tools from the development of ontologies in engineering, manufacturing and online commerce will be utilized in this regard.
We will provide a level of interoperability for semantics equivalent to the one that the Open-source Project for a Network Data Access Protocol (OPeNDAP) provides for syntax.
Specifically, we will:
- Evolve an existing ontology, the Semantic Web for Earth and Environmental Terminology (SWEET), to broaden and deepen its subject matter, service coverage, and common sense knowledge, such as events and features.
- Apply a registry supported by an extended GEON ontology that leverages and extends technology developed for the GEON cyberinfrastructure to register selected NASA and non-NASA datasets.
- Enable a range of semantically-based data services (such as data mining, validation, data integration, etc.) based on definitions of science and service concepts within the ontologies.
- Submit the ontology for review through the Standards and Processes Data Systems Working Group and obtain review from appropriate experts and potential users.
- Demonstrate SWEET, GEON, OPeNDAP and CEDARWEB technologies in support of the NASA ACCESS science and technology objectives.
Support for Evolution to Science Measurement Processing Systems
To support interoperability across a broad range of science areas, measurement-based processing systems (ComPS) require a fundamental shift away from solving the interoperability problem at the syntactic level toward addressing it at a semantic level. The significant and successful efforts at the syntactic level (e.g. DODS) within discipline areas provide a solid basis upon which an evolution toward semantic interoperability can build.
SESDI will enable greater interoperability and data flow between distributed data sources and data distribution portals. SESDI will knowledge-enable web services, leading to intelligent handling of routine data access and processing tasks by science and applications users. SESDI will also leverage OPeNDAP (candidate for NASA ESE-RFC-004) server and client capabilities, adding semantics so that data reduction tools will allow users to subset, aggregate and otherwise retrieve only the data that are required for science or application purposes, without needing to know the organization or stored format of the datasets. The addition of a semantic framework and semantically aware data registration, catalogs, and access and retrieval tools is fundamental to the movement away from Instrument Processing Systems to Measurement Processing Systems.
Data and Information Systems Support for Science Focus Areas and Applications
Information technology tools have served the science and user communities quite well, provided that these communities stay within their disciplines and provided data systems. A long-standing challenge, however, has been to bridge this "digital divide" (e.g. Riverdeep, 2002; Novak, 1998) and make key improvements to services and tools. SESDI will enable data integration across disciplines by leveraging and augmenting existing technologies, giving interdisciplinary science and user communities access to data and tools that currently require domain-specific, data-element level knowledge, or access to teams of researchers with the requisite experience. In turn, SESDI will provide a bridge that connects scientists to data and services, furthering research goals for NASA's Science Focus Areas. This connection maps directly to the Climate Variability and Change, Earth Surface and Interior, and Sun-Earth Connection NASA roadmap activities (see http://science.hq.nasa.gov/strategy/roadmaps/).
In particular SESDI addresses the NASA objective to:
Conduct a program of research and technology development to advance Earth observation from space, improve scientific understanding, and demonstrate new technologies with the potential to improve future operational systems;
SESDI will provide a means of interconnecting two or more major Earth system components/drivers under the following strategic science focus areas for NASA: to explore interactions among the major components of the Earth system and to distinguish natural from human-induced causes of change, with the potential to understand and predict the consequences of change. SESDI also adopts one of NASA's Strategic Principles to address the science challenges with an "end-to-end" framework approach that includes observation, research and data analysis, modeling, and scientific assessment in collaboration with NASA and community partners that can be instantiated for specific science problems.
As a demonstration of this capability we will address such overall questions as: 'What are the primary causes of Earth system variability?'; 'What trends in atmospheric constituents and solar radiation are driving global climate?'; 'What can we learn by asking science questions of our interdisciplinary datasets?'
Science Data Integration - 'bridging the discipline data divide'
Ontologies can be used in a multitude of ways to create tools that do more for scientists. For a more detailed discussion, see (McGuinness, 2003). Since ontologies provide a way of specifying precise structured definitions of terms, applications may utilize these definitions to support: smarter retrieval, comparisons between terms that will highlight structural differences and similarities, consistency checking, and general deductive closure operations - i.e., deducing the implications of statements and making those implications explicit.
For example, if we find that there are multiple emerging vocabularies for climate, we will include terms from the emerging standard vocabularies and include descriptions of how the terms interrelate (sometimes called articulation axioms). This would allow one volcanologist who prefers one vocabulary to interact with volcanologists (and the data generated by these scientists) using another vocabulary. Further, it allows scientists to communicate with others who have different backgrounds, since the terms and definitions can be used to automatically translate terms and can also be exposed to teach interested end-users about the differences. It also allows scientists with varying levels of domain expertise to interact, since varying levels of abstraction can be used for experts and novices and additional support may be made available for novices. It provides the foundation for interfaces that may present information in the vocabulary, level of abstraction, and mode (teaching, summary, etc.) preferred by the end-user.
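As a minimal sketch of how articulation axioms could drive term translation, the following Python fragment stores each axiom as a (source term, relation, target term) triple and looks a term up in either direction. The vocabulary prefixes, term names, and relation labels are hypothetical and chosen only for illustration; they are not taken from SWEET or GEON.

```python
# Hypothetical vocabularies and term names, for illustration only.
# Each articulation axiom relates a term in vocabulary A to a term in vocabulary B.
ARTICULATION_AXIOMS = [
    ("vocabA:SO2Flux", "equivalent", "vocabB:SulfurDioxideEmissionRate"),
    ("vocabA:AshPlume", "narrower_than", "vocabB:VolcanicPlume"),
]

# Reading an axiom from the target side flips the direction of the relation.
INVERSE = {
    "equivalent": "equivalent",
    "narrower_than": "broader_than",
    "broader_than": "narrower_than",
}


def translate(term, axioms=ARTICULATION_AXIOMS):
    """Return (relation, other_term) pairs for a term, searching both directions."""
    results = []
    for source, relation, target in axioms:
        if source == term:
            results.append((relation, target))
        elif target == term:
            results.append((INVERSE[relation], source))
    return results


if __name__ == "__main__":
    print(translate("vocabA:SO2Flux"))
    print(translate("vocabB:VolcanicPlume"))
```

Keeping the relation alongside the translated term lets an interface tell the user whether the match is exact or only broader or narrower, which is the kind of difference the text suggests exposing to end-users.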
One example of a knowledge-enhanced user interface that allowed end users to access multiple data sets using multiple interface options is available in (Brachman et al., 1993). While this was not accessing scientific data, it did use ontologies to support enhanced querying, data integration, and multiple levels of customized output.
It would also allow interfaces to present the data in the vocabulary of choice for the end user. We can also use the descriptions to compare terms and identify when one is strictly the same as or more specific than another, or identify when they overlap. We can use these descriptions to easily support structural comparison algorithms.
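The term comparison described here can be sketched over a simple is-a hierarchy. In the Python fragment below, the hierarchy and all term names are hypothetical, and "overlap" is approximated as sharing a common ancestor, which is weaker than the overlap test a description-logic reasoner would perform on full term definitions.

```python
# Hypothetical is-a hierarchy: child term -> set of direct parent terms.
PARENTS = {
    "SolarIrradiance": {"Radiation"},
    "Radiation": {"PhysicalPhenomenon"},
    "VolcanicAerosol": {"Aerosol", "VolcanicProduct"},
    "Aerosol": {"AtmosphericConstituent"},
    "SulfateAerosol": {"Aerosol"},
}


def ancestors(term):
    """Return the term itself plus everything it is transitively a kind of."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def compare(a, b):
    """Classify how term a relates to term b."""
    if a == b:
        return "same"
    if b in ancestors(a):
        return "more specific"
    if a in ancestors(b):
        return "more general"
    # Assumption: a shared ancestor stands in for a real overlap test.
    if ancestors(a) & ancestors(b):
        return "share ancestor"
    return "unrelated"


if __name__ == "__main__":
    print(compare("VolcanicAerosol", "Aerosol"))
    print(compare("VolcanicAerosol", "SulfateAerosol"))
    print(compare("SolarIrradiance", "Aerosol"))
```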
Additionally, we can use specifications of interrelationships to check whether the data follow the descriptions. Simple examples of this include integrity checks: a program can check whether a number falls in the right range or is of the right type. More complicated examples include descriptions of predicted changes, with programs automatically checking whether the data are consistent with the predictions.
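Both kinds of check can be illustrated with a short Python sketch. The variable names, units, and valid ranges below are hypothetical, and the "predicted change" is reduced to an expected sign of change between successive values; a real system would draw these descriptions from the ontology rather than from a hard-coded dictionary.

```python
# Hypothetical variable descriptions; names, units, and ranges are illustrative only.
DESCRIPTIONS = {
    "air_temperature": {"type": float, "units": "K", "min": 150.0, "max": 350.0},
    "station_count": {"type": int, "units": "1", "min": 0, "max": 10000},
}


def check_value(name, value):
    """Return a list of integrity problems for one value (empty if none)."""
    spec = DESCRIPTIONS[name]
    if not isinstance(value, spec["type"]):
        # A range check is not meaningful on a value of the wrong type.
        return [f"{name}: expected {spec['type'].__name__}, "
                f"got {type(value).__name__}"]
    if not spec["min"] <= value <= spec["max"]:
        return [f"{name}: {value} {spec['units']} outside "
                f"[{spec['min']}, {spec['max']}]"]
    return []


def consistent_with_trend(values, expected_sign, tolerance=0.0):
    """Check that every successive change has the predicted sign.

    expected_sign is +1 for a predicted increase and -1 for a decrease;
    tolerance allows small changes in the opposite direction.
    """
    return all((later - earlier) * expected_sign >= -tolerance
               for earlier, later in zip(values, values[1:]))


if __name__ == "__main__":
    print(check_value("air_temperature", 288.15))
    print(check_value("air_temperature", 400.0))
    print(consistent_with_trend([1.0, 1.2, 1.5], +1))
```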

