The Changing Face of Visualisation in a World of Data Intensive Science


Authors: Peter Fox


To say that electronic facilitation of scientific research (including the humanities) is increasingly prevalent is almost certainly an understatement. Among the consequences of new and diversifying means of complex (*) data generation is that, as many branches of science have become data-intensive (the so-called fourth paradigm), they in turn broaden their long-tail distributions: less complex data still produces excellent science. There are many familiar informatics functions that enable the conduct of science (by specialists or non-specialists) in this new regime. One example is the need for any user to be able to discover relations among and between the results of data analyses and informational queries. Unfortunately, visual discovery over complex data remains more of an art form than an easily conducted practice. In general, the resource costs of creating useful visualizations have been increasing. Less than 10 years ago, it was assessed that data-centric science required a rough split between the time to generate, analyze, and publish data and the science based on that data. Today, however, the visualization and analysis component has become a bottleneck, requiring considerably more of the overall effort, and this trend will continue. Potentially even worse is the choice to simplify analyses to 'get the work out'. Extra effort to make data understandable, something that should be routine, is now consuming considerable resources that could be used for many other purposes. It is now time to change that trend. This contribution lays out paths for visualization and analysis to be 'exploratory' and to occur early in the conduct of science, in addition to serving as presentation modes; it is cast in the present reality of Web/Internet-based data and software infrastructures. In particular, three key actions are suggested and discussed. First, visualizers must work with tool designers to make sure that visualizations are sharable during the entire life span of the scientific process.
Second, standardization of the workflow and linking technologies for scientific visualizations must be formalised and propagated into easy-to-use tools. Finally, joint effort is required to explore new ways of scaling easy-to-generate visualizations to data-intensive scientific pursuits upon common infrastructures. A logical consequence of this path is that the people working in this new mode of research, i.e. data scientists, require additional education to become effective and routine users of new informatics capabilities. One goal is to achieve the same fluency that researchers may have in lab techniques, instrument utilization, model development and use, etc. Thus, in conclusion, curriculum and skill requirements for data scientists will be presented and discussed.

* complex/intensive = large volume, multi-scale, multi-modal, multi-disciplinary, heterogeneous structure, and more.


Date: November 8, 2011
Created By: Peter Fox

Related Projects:

Strawberry Fields Forever (SFF)
Principal Investigator: Peter Fox and Johannes Goebel
Description: The project addresses a key problem in Creative IT — the ubiquitous need for an integrative tool that allows rapid innovation and dissemination in new and interdisciplinary fields of research.

Related Research Areas:

Data Frameworks
Lead Professor: Peter Fox
Description: None.
Concepts: eScience
Data Science
Lead Professor: Peter Fox
Description: Science has fully entered a new mode of operation. Data science is advancing the inductive conduct of science, driven by the greater volumes, complexity and heterogeneity of data being made available over the Internet. Data science combines aspects of data management, library science, computer science, and physical science using supporting cyberinfrastructure and information technology. As such, it is changing the way all of these disciplines do both their individual and collaborative work.

Data science is helping scientists face new global problems whose magnitude, complexity and interdisciplinary nature mean that progress is presently limited by the lack of available tools and of a fully trained and agile workforce.

At present, there is a lack of formal training in the key cognitive and skill areas that would enable graduates to become key participants in eScience collaborations. The need is to teach key methodologies in application areas based on real research experience and to build a skill-set.

At the heart of this new way of doing science, especially experimental and observational science but also increasingly computational science, is the generation of data.

Concepts: eScience
Xinformatics
Lead Professor: Peter Fox
Description: In the last 2-3 years, informatics has attained greater visibility across a broad range of disciplines, especially in light of great successes in bio- and biomedical informatics and significant challenges in the explosion of data and information resources. Xinformatics is intended to provide both the common informatics knowledge and an understanding of how it is implemented in specific disciplines, e.g. X = astro, geo, chem, etc. The theoretical basis of informatics arises from information science, cognitive science, social science and library science, as well as computer science. As such, it aggregates these studies and adds both the practice of information processing and the engineering of information systems.
Concepts: eScience