Why We Need To Get Smart About Data To Be Better Stewards: Making Smarter Virtual Observatories

Authors: Peter Fox


For some time, the term cyberinfrastructure has been used to describe the collected set of computer-based software and services used to support and advance science, engineering, and education. Over the past decade, that cyberinfrastructure has evolved to enable more robust, virtual, and distributed scientific research in the context of an interdisciplinary virtual observatory, i.e. using cyberinfrastructure to enhance data stewardship. Early efforts, for example the Virtual Solar-Terrestrial Observatory (VSTO), applied the then-emergent semantic web data frameworks to provide production access to observational datasets from solar-terrestrial physics. That observatory provides virtual access to highly distributed and heterogeneous datasets in a common way: the datasets appear to be organized, stored, accessed, and used as if they were local, without exposing the underlying organization of the data holdings. More recently, an even larger science community, oriented around carbon in Earth, has implemented a Deep Carbon Virtual Observatory (DCVO). The DCVO integrates (rather than develops) an existing set of application cyberinfrastructure, i.e. data and information stores, catalogues, collaboration and network tools, and application services. In this paper we describe the development of the embedded semantic technology within the DCVO and indicate future directions.


Date: July 28, 2015
Created By: Peter Fox

Related Projects:

Deep Carbon Observatory Data Science (DCO-DS)
Principal Investigator: Peter Fox
Co-Investigators: John S. Erickson and Jim Hendler
Description: Given the increasing data deluge, it is clear that each of the Directorates in the Deep Carbon Observatory faces diverse data science and data management needs in fulfilling both its decadal strategic objectives and its day-to-day tasks. This project will assess in detail the data science and data management needs of each DCO directorate and of the DCO as a whole, using a combination of informatics methods: use case development, requirements analysis, inventories, and interviews.
Virtual Solar Terrestrial Observatory (VSTO)
Principal Investigator: Peter Fox
Co-Investigator: Deborah L. McGuinness
Description: VSTO is a collaborative project between the High Altitude Observatory and Scientific Computing Division of the National Center for Atmospheric Research and McGuinness Associates. VSTO is funded by a grant from the National Science Foundation Directorate for Computer and Information Science and Engineering (CISE), in the Shared Cyberinfrastructure (SCI) division.

Related Research Areas:

Data Frameworks
Lead Professor: Peter Fox
Description: None.
Data Science
Lead Professor: Peter Fox
Description: Science has fully entered a new mode of operation. Data science is advancing the inductive conduct of science, driven by the greater volumes, complexity, and heterogeneity of data being made available over the Internet. Data science combines aspects of data management, library science, computer science, and physical science, using supporting cyberinfrastructure and information technology. As such, it is changing the way all of these disciplines do both their individual and collaborative work.

Data science is helping scientists face new global problems of great magnitude, complexity, and interdisciplinary nature, on which progress is presently limited by a lack of available tools and of a fully trained and agile workforce.

At present, there is a lack of formal training in the key cognitive and skill areas that would enable graduates to become key participants in e-Science collaborations. The need is to teach key methodologies in application areas, based on real research experience, and to build a skill set.

At the heart of this new way of doing science, especially experimental and observational science but also increasingly computational science, is the generation of data.

Semantic eScience
Lead Professor: Peter Fox
Description: Science has fully entered a new mode of operation. E-Science, defined as a combination of science, informatics, computer science, cyberinfrastructure, and information technology, is changing the way all of these disciplines do both their individual and collaborative work.
As semantic technologies have been gaining momentum in various e-Science areas (for example, W3C's new interest group for semantic web health care and life science), it is important to offer semantic-based methodologies, tools, and middleware to facilitate scientific knowledge modeling, logic-based hypothesis checking, semantic data integration and application composition, and integrated knowledge discovery and data analysis for different e-Science applications.
Partially influenced by the Artificial Intelligence community, Semantic Web researchers have largely focused on formal aspects of semantic representation languages or on general-purpose semantic application development, with inadequate consideration of requirements from specific science areas. On the other hand, general science researchers are growing ever more dependent on the web, but they have no coherent agenda for exploring emerging trends in semantic web technologies. This situation urgently requires the development of a multi-disciplinary field to foster the growth and development of e-Science applications based on semantic technologies and related knowledge-based approaches.