VSTO Broad Design


Involve the users right from the start and build a prototype

VSTO targets a broad disciplinary community with a broad scope (data, tools and education). To meet the needs of the user, it is essential to build a VSTO prototype driven by a requirements gathering process that includes source providers (data, models, presentations, etc.), content managers, users and technology providers/experts.

Use-case driven requirements will translate user requirements into an effective system.

In gathering the user requirements, the focus will be on the HAO community of collaborators and their science data, models, tools, etc. We have identified a test-bed data collection (CEDAR, CISM and MLSO), with associated tools and educational materials, to design and build the first instance of the VSTO.

What the user sees

We plan to design and implement a fully functional VSTO Web Portal. This web-based interface will provide a single, integrated point of initial access to the distributed holdings (powered by metadata catalogs), irrespective of the details of their format or organization. This interface will be Grid-enabled and thus will be compatible and interoperable with those being developed by SCD for access to general NCAR data archives (e.g. the Community Data Portal (CDP)). We will integrate these technologies into existing user tools for analysis of SSTSP data, such as the IDL-based SolarSoft package, and plan to make them widely available to the community (providing web plug-ins or run-time software libraries for users to use with their applications).

Bringing solar datasets up to par

As noted, one key need for SSTSP data and tools is the development of a semantic schema (using OWL/RDF-DAML) for the VSTO holdings which may be based, for example, on simpler syntactic-only schema and key elements of SPDML/ESML.

Solar and solar-terrestrial data (from both models and observations, and their analyses) reside in a variety of formats. OPeNDAP will be used to achieve the desired level of network and user transparency to these formats (and their organization).

Data Holdings

To populate the VSTO we will use the extensive solar and solar-terrestrial physics data and model archive managed by HAO (with some initial data/models in bold). This includes the Advanced Coronal Observing System (ACOS), the Advanced Stokes Polarimeter (ASP), and the accompanying Community Inversion Codes (CIC), the Precision Solar Photometric Telescope (PSPT), SunRISE solar spectra synthesis models, solar activity indices, Experiment for Coordinate Helioseismological Observations (ECHO), STellar Astrophysics and Research on Exoplanets (STARE), CISM models, TIME-GCM models, Assimilative Mapping of Ionospheric Electrodynamics models (AMIE), the entire CEDAR database, and HAO's education/outreach materials based on existing web materials, as well as others.

Basic Functions

Some of the basic functions that the prototype will support are: publishing data and tools, specifying (data) requests, extracting and transferring data, monitoring requests, updating educational materials, identifying users, and tracking use. To address these functions, we will implement services for metadata, process management and workflow, data access and transport and collaboration services.

We will also work to integrate and improve analysis tools which may require development of client interfacing, and adaptation or development of web portals with application tools.

An overview of Candidate Technologies

The `Grid' (http://www.globus.org), a joint effort of Argonne National Laboratory, the Information Sciences Institute of the University of Southern California, and the University of Chicago, has been working for several years to solve problems faced by a number of science communities. The Grid is emerging as one piece of the cyberinfrastructure needed to further the next generation of IT-enabled science. To date, several programs are utilizing an adaptation of the Grid-enabled (Foster et al. 1999, 2001) approach to data systems, which has proven successful in a variety of disciplines (e.g. in high energy and nuclear physics, http://www.ppdg.net and http://www.griphyn.org).

A Grid-enabled virtual observatory minimizes the time to make data available and usable by capitalizing on the distributed nature of the available intellectual resources (data, etc.). Data does not have to be moved or reformatted, only registered with the catalog. It is then available from the VSTO web portal or the user's preferred application which has access to the VSTO interfaces. At the same time, experience with the Earth System Grid indicates that lighter weight (thin) clients need to be supported in addition to the fully Grid-enabled (thick) clients.
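The "register, do not move" idea can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in rather than the VSTO design: the class names CatalogEntry and MetadataCatalog, their fields, and any URLs passed in are assumptions made for illustration. The point is only that the catalog stores a description and a remote location, never the data itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical metadata record for a dataset that stays at its provider's site."""
    dataset_id: str
    provider: str
    access_url: str      # where the data already lives; nothing is copied
    native_format: str   # recorded as metadata only; the data is not reformatted


@dataclass
class MetadataCatalog:
    """Hypothetical in-memory catalog: maps dataset ids to their metadata records."""
    entries: dict = field(default_factory=dict)

    def register(self, entry: CatalogEntry) -> None:
        # Registration stores the description only, so it is cheap for the provider.
        if entry.dataset_id in self.entries:
            raise ValueError(f"already registered: {entry.dataset_id}")
        self.entries[entry.dataset_id] = entry

    def find_by_provider(self, provider: str) -> list:
        # A portal or client application would search the catalog, not the archives.
        return [e for e in self.entries.values() if e.provider == provider]

    def resolve(self, dataset_id: str) -> str:
        # Return the remote location; fetching the data is left to the caller.
        return self.entries[dataset_id].access_url
```

A portal page and a desktop application could both call resolve() on the same catalog, which is one way to read the statement that registered data is available from either.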

NSF Middleware Initiative (NMI, http://www.nsf-middleware.org/)

The Earth System Grid (ESG, http://www.earthsystemgrid.org/, Bernholdt et al. 2005) is an environment that harnesses the combined potential of massive distributed data resources, remote computation, and high-bandwidth wide-area networks as an integrated resource for the climate research scientist. Among other functions, ESG addresses the deployment of an operational, network-distributed system for a broad user community, which includes both desktop client and sophisticated web portal access.

The Community Data Portal (CDP: http://dataportal.ucar.edu) is a collection of earth science datasets from NCAR, UCAR and UOP, and participating organizations in the research areas of ocean and atmospheric science, space weather and turbulence. It provides browse and search capabilities across these datasets, and offers application and visualization services operating on these datasets to provide value-added representations to a user.

We intend to leverage experience in the development of this portal for use with VSTO.

Based on key infrastructure for creating and sustaining effective knowledge networks from the Space Physics and Aeronomy Research Collaboratory (SPARC), the CompreHensive collaborativE Framework (CHEF) version 1.0 has recently been used to build the next generation of these successful collaboration systems at the University of Michigan. It provides a rich set of user interface components on which the VSTO portal can be based. It is our intention to provide the collaboration tools that users really need, i.e. the best and most useful parts of SPARC and others based on the user requirements.

CHEF's generality and extensibility are attested to and mandated by its involvement in a number of diverse collaborative environments, such as the next generation of the University of Michigan CourseTools and WorkTools, the NSF-funded Network for Earthquake Engineering Simulation (NEES), the DOE-funded Scientific Annotation Middleware (SAM) and Collaboratory for Multi-scale Chemical Science (CMCS) projects, and the Open Knowledge Initiative (OKI).

The Open source Project for a Network Data Access Protocol (OPeNDAP) is an Internet-capable data access and transport service for a wide variety of dataset formats. OPeNDAP transparently extends extant data access programming interfaces without application software modification; provides servers for several dataset formats and is readily extensible to support new ones; enables applications to access subsets of datasets, minimizing unnecessary network data movement; and is deployable within existing web and grid architectures. The VSTO will realize remote data access and processing capabilities as provided by the OPeNDAP framework, and web browser and client based access to observational and model data.
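As an illustration of subset access, the sketch below builds a DAP2-style request URL in which a constraint expression of the form variable[start:stride:stop] (stop inclusive) selects part of an array on the server, so only that part crosses the network. The helper names hyperslab and subset_url are hypothetical, and any server address or variable name passed to them is an assumption, not an actual VSTO endpoint.

```python
def hyperslab(start: int, stop: int, stride: int = 1) -> str:
    """Format one dimension of a DAP2 array subset as [start:stride:stop]."""
    if start < 0 or stop < start or stride < 1:
        raise ValueError("need 0 <= start <= stop and stride >= 1")
    return f"[{start}:{stride}:{stop}]"


def subset_url(dataset_url: str, variable: str, slabs: list,
               response: str = "dods") -> str:
    """Build a request URL asking the server for only part of one variable.

    The response suffix selects the reply type, for example 'dods' for
    binary data or 'ascii' for a text rendering.
    """
    constraint = variable + "".join(slabs)
    return f"{dataset_url}.{response}?{constraint}"
```

For example, subset_url("http://example.org/opendap/sample.nc", "intensity", [hyperslab(0, 9), hyperslab(0, 99, 10)]) asks for the first ten rows and every tenth column of a hypothetical two-dimensional variable. Some HTTP clients percent-encode the square brackets before sending the request.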

COSEC, a new technology project from Lockheed Martin Advanced Technology, has developed the concept of a Sensor Web Virtual Machine, which utilizes semantic descriptions (based on knowledge representation techniques and DAML) of distributed web resources to build a truly distributed application: for example, a program to analyze solar wind composition data running on one system, combined with a suitable dataset made available behind a web server on a completely different system, with the results sent to a web browser. This technology, and experience from developing it as part of HAO's participation in the VSO, will be a component of VSTO, especially in the area of data assimilation.