Characterizing quality for a science data product is hard. We have been working on this issue in our Multi-Sensor Data Synergy Advisor (MDSA) project with Greg Leptoukh and Chris Lynnes from the NASA Goddard Space Flight Center (GSFC). The following is my opinion on what product quality means and how it can be characterized. This work was presented as a poster at the 2011 AGU Fall Meeting.
Science product quality is hard to define, characterize, and act upon. Product quality reflects a comparison against standard products of a similar kind, but it also reflects the product's fitness-for-use for the end user. Users weigh quality characteristics (e.g., accuracy, completeness, coverage, consistency, representativeness) according to their intended use of the data, so the quality of the same product can differ from one user to the next depending on their needs and interests. Despite the subjective nature of quality assertions and their sensitivity to each user's fitness-for-use, most quality information is provided by the product producer, and the subjective criteria used to determine quality are opaque, if available at all.
If users are given product quality information at all, this information usually comes in one of two forms:
- technical reports that present extensive statistical analysis of very specific characteristics of the product
- subjective and unexplained statements such as ‘good’, ‘marginal’, or ‘bad’
The first form is information overload that a user cannot assess quickly; the second provides almost none of the information a user needs to make their own subjective quality assessment.
Is there a similar scenario in everyday life where users are presented with quality information that they can readily understand and act upon?
There is, and you see it every day in the supermarket.
Nutrition Facts labels provide per-serving nutrition information (e.g., the amount of Total Fat, Total Carbohydrates, and Protein) and show how the listed amounts per serving compare to a reference daily diet.
The comparison to a standard 2,000 calorie diet gives the user a simple tool for assessing how useful a food item is in their own diet. Quality assertions, such as whether this food is ‘good’ or ‘bad’ for the consumer’s diet, are left to the consumer, but they are relatively easy to make with the available information.
A ‘quality facts’ label for a scientific data product, showing computed values for community-recognized quality indicators, would go a long way towards enabling a nutrition label-like presentation of quality that is easy for science users to consume and act upon.
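To make the idea concrete, here is a minimal sketch of how such a label could be assembled. It is not our actual quality model: the `QualityIndicator` class, the `render_label` function, the indicator names, and every number below are hypothetical and chosen only for illustration. Each indicator carries a computed value and a reference value, and the label reports the value as a percentage of the reference, much as a Nutrition Facts label reports a percentage of a daily diet.

```python
from dataclasses import dataclass


@dataclass
class QualityIndicator:
    """One row of a hypothetical 'quality facts' label."""
    name: str         # indicator name, e.g. "Completeness" (illustrative)
    value: float      # value computed for this data product
    reference: float  # reference value the product is compared against
    unit: str = ""    # optional unit suffix shown next to the value

    def percent_of_reference(self) -> int:
        """Express the value as a whole-number percentage of the reference."""
        return round(100 * self.value / self.reference)


def render_label(product: str, indicators: list[QualityIndicator]) -> str:
    """Render a plain-text label: one header, one rule, one line per indicator."""
    lines = [f"Quality Facts: {product}", "-" * 40]
    for ind in indicators:
        amount = f"{ind.value:g}{ind.unit}"
        percent = f"{ind.percent_of_reference()}%"
        lines.append(f"{ind.name:<20}{amount:>10}{percent:>10}")
    return "\n".join(lines)


# Made-up indicators and numbers; these do not describe any real product.
example_indicators = [
    QualityIndicator("Completeness", 72, 90, " days"),
    QualityIndicator("Spatial coverage", 0.85, 1.0),
]

print(render_label("Example aerosol product", example_indicators))
```

As with the food label, the sketch makes no ‘good’ or ‘bad’ judgment. It only puts each computed value next to a reference, and leaves the assessment to the user.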
We have begun working on mockups of what such a presentation of quality could look like, and have constructed a basic quality model that would allow us to express in RDF the information that would be used to construct a quality facts label.
Our quality model primer presents our high-level quality model and its application to an aerosol satellite data product in detail.
Our poster presentation was a hit at AGU, where it received a great deal of positive feedback. The nutrition label-like presentation is immediately familiar, and it supports the metaphor of science users ‘shopping’ for the data product that best fits their needs.
We still have a long way to go on developing our presentation, but the feedback from discussions at AGU tells me that our message resonated with our intended audience.